khazic/CrossLingMind_DataSet
Source: Hugging Face. Updated 2025-10-16. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/khazic/CrossLingMind_DataSet
Description:
CrossLingMind is a multilingual question-answering dataset for Southeast Asian languages. It contains translated question-answer pairs from four common benchmarks: AlpacaEval, GPQA, LiveQA, and Math500. The pairs are translated into 8 Southeast Asian languages, and the original English questions and answers are kept alongside the translations to support cross-lingual evaluation or fine-tuning. Every language file follows the same JSON format, holding the translated question-answer pairs together with their corresponding English originals.
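As a rough illustration of how a language file of this kind might be read, the sketch below parses a JSON list of records and pairs each translated question-answer with its English original. The field names (`question`, `answer`, `original_question`, `original_answer`), the list-of-objects layout, and the sample record are all assumptions made for the example; the description above does not specify the actual schema, so check a real file before relying on them.

```python
import json

# Hypothetical record layout: these field names and the sample content are
# assumptions for illustration, not the dataset's documented schema.
SAMPLE_FILE_CONTENT = json.dumps(
    [
        {
            "question": "Berapa 2 ditambah 2?",
            "answer": "4",
            "original_question": "What is 2 plus 2?",
            "original_answer": "4",
        }
    ],
    ensure_ascii=False,
)


def load_pairs(text):
    """Parse one language file and return (translated, original) QA tuples."""
    records = json.loads(text)
    pairs = []
    for rec in records:
        translated = (rec["question"], rec["answer"])
        original = (rec["original_question"], rec["original_answer"])
        pairs.append((translated, original))
    return pairs


pairs = load_pairs(SAMPLE_FILE_CONTENT)
for translated, original in pairs:
    # Show each translated question next to its English source.
    print(translated[0], "<->", original[0])
```

Keeping the translated and original pairs side by side in one tuple makes it straightforward to score a model on the translated question and compare against its behaviour on the English one.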
Provided by:
khazic



