Scicom-intl/Malaysian-RAG-Dataset
Source: Hugging Face. Updated 2025-12-17; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Scicom-intl/Malaysian-RAG-Dataset
Description:
The Malaysian RAG Dataset contains question-answer pairs, each with its associated context document and evaluation metrics. Each entry includes a source document, a question and an answer generated by ChatGPT-5.1, and two quality scores computed with RAGAS: context precision and faithfulness. The data is drawn from several Malaysia-related websites and datasets, including maktabahalbakri.com, muftiwp.gov.my.dedup, asklegal, dewanbahasa-jdbp, and gov.my. The dataset is divided into several splits: a training set and test sets in different languages (e.g., English, Malay, Mandarin, and Tamil).
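Because every entry carries RAGAS scores, one common use is to keep only the entries that clear a quality threshold. The following is a minimal sketch of that idea, not the dataset's documented schema: the field names `document`, `question`, `answer`, `context_precision`, and `faithfulness` are assumptions, the 0.8 thresholds are arbitrary, and the records are placeholders rather than real rows.

```python
from typing import Dict, List, Optional, Union

# Hypothetical record layout; the real column names are not given in this listing.
Record = Dict[str, Optional[Union[str, float]]]


def filter_by_quality(
    records: List[Record],
    min_context_precision: float = 0.8,  # arbitrary illustrative threshold
    min_faithfulness: float = 0.8,       # arbitrary illustrative threshold
) -> List[Record]:
    """Keep records whose two RAGAS scores both meet the thresholds."""
    kept = []
    for rec in records:
        precision = rec.get("context_precision")
        faithfulness = rec.get("faithfulness")
        # Skip records with a missing score instead of guessing a value.
        if precision is None or faithfulness is None:
            continue
        if precision >= min_context_precision and faithfulness >= min_faithfulness:
            kept.append(rec)
    return kept


# Placeholder records for illustration only (not taken from the dataset).
sample: List[Record] = [
    {"document": "placeholder document A", "question": "placeholder question A",
     "answer": "placeholder answer A", "context_precision": 0.9, "faithfulness": 0.95},
    {"document": "placeholder document B", "question": "placeholder question B",
     "answer": "placeholder answer B", "context_precision": 0.5, "faithfulness": 0.9},
    {"document": "placeholder document C", "question": "placeholder question C",
     "answer": "placeholder answer C", "context_precision": 0.9, "faithfulness": None},
]

high_quality = filter_by_quality(sample)
print([rec["question"] for rec in high_quality])
```

The same function could be applied to rows loaded from the actual splits once the real column names have been checked against the dataset card.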
Provider:
Scicom-intl



