ChamaraVishwajithRajapaksha/RAG-Evaluation-Dataset

Name: ChamaraVishwajithRajapaksha/RAG-Evaluation-Dataset
Creator: ChamaraVishwajithRajapaksha
Published: 2025-10-23 04:54:36
License: No description available

Source: Hugging Face. Updated 2025-10-23; indexed 2025-10-25.

Download link:

https://hf-mirror.com/datasets/ChamaraVishwajithRajapaksha/RAG-Evaluation-Dataset

Description:

The RAG Evaluation Dataset contains 333 text chunks with corresponding embeddings, drawn from WSO2 AI sessions and AI research papers. It is intended for Retrieval-Augmented Generation (RAG) evaluation, semantic search, and information retrieval tasks. Each text chunk carries rich metadata, and the 1536-dimensional embeddings were generated with OpenAI's text-embedding-3-small model. The README provides examples of loading the dataset, converting the embeddings, using them with vector databases, and running semantic search. It also lists limitations: embeddings must be converted to arrays before use, the embedding model is proprietary, the content is domain-specific and may become dated, the data may carry bias, language coverage is limited, and use must comply with the license.
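The conversion and semantic-search steps mentioned above might look roughly like the following. This is a minimal sketch, not the README's own code: the column names `text` and `embedding`, the idea that embeddings may be stored as JSON strings, and the helper names `to_vector`, `cosine_similarity`, and `top_k` are all assumptions made for illustration. Toy 3-dimensional vectors stand in for the real 1536-dimensional ones, and plain Python is used so the sketch runs without dependencies.

```python
import json
import math


def to_vector(raw):
    """Convert a stored embedding into a list of floats.

    Assumption: the embedding arrives either as a JSON-encoded string
    or as an already-parsed sequence of numbers.
    """
    if isinstance(raw, str):
        raw = json.loads(raw)
    return [float(x) for x in raw]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=3):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [
        (cosine_similarity(query_vec, to_vector(row["embedding"])), row["text"])
        for row in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical rows; the field names "text" and "embedding" are assumed,
# not taken from the dataset's documented schema.
rows = [
    {"text": "chunk about RAG evaluation", "embedding": "[1.0, 0.0, 0.0]"},
    {"text": "chunk about API gateways", "embedding": [0.0, 1.0, 0.0]},
    {"text": "chunk about semantic search", "embedding": "[0.8, 0.6, 0.0]"},
]

# In real use the query vector would come from the same embedding model
# as the chunks, since vectors from different models are not comparable.
results = top_k([1.0, 0.0, 0.0], rows, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With the actual dataset, the rows would presumably come from the Hugging Face `datasets` library (for example via `load_dataset` with the repository name above), and NumPy arrays would be the more natural target for the conversion; both points are assumptions about typical usage rather than details confirmed by this listing.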

Provider:

ChamaraVishwajithRajapaksha
