belebele-fleurs-train-val-text
Belebele-Fleurs Training and Validation Data
Dataset Overview
- Language: English (en)
- Dataset type: training and validation data
- Intended task: fine-tuning for multiple-choice classification
Dataset Structure
Features
- dataset: name of the source dataset (string)
- passage_id: passage ID (string)
- question_id: question ID (string)
- passage: passage text (string)
- question: question text (string)
- answer1: option 1 (string)
- answer2: option 2 (string)
- answer3: option 3 (string)
- answer4: option 4 (string)
- correct_answer: text of the correct answer (string)
- correct_answer_num: number of the correct option (int64)
- index_level_0: index level 0 (int64)
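The fields above suggest a simple invariant: `correct_answer` should equal the text of the option selected by `correct_answer_num`. The following is a minimal sketch of such a check, assuming the numbering is 1-based (which matches the example row on this card, where `correct_answer_num` is 2 and the answer is `answer2`). The helper name `is_consistent` is hypothetical and is not part of the dataset or of the `datasets` library.

```python
def is_consistent(row: dict) -> bool:
    """Return True if correct_answer matches the option picked by correct_answer_num.

    Assumes correct_answer_num is 1-based (1..4), as the example row suggests.
    """
    num = row["correct_answer_num"]
    if num not in (1, 2, 3, 4):
        return False
    return row[f"answer{num}"] == row["correct_answer"]


# Answer fields copied from the example row shown on this card.
row = {
    "answer1": "viruses",
    "answer2": "mesophilic organisms",
    "answer3": "gymnosperms",
    "answer4": "protozoa",
    "correct_answer": "mesophilic organisms",
    "correct_answer_num": 2,
}

print(is_consistent(row))
```

A check like this can be run over every row before fine-tuning to catch label mismatches early.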
Splits
- train: training split, 67541 examples, 95520563 bytes
- validation: validation split, 3773 examples, 5458773 bytes
Data Files
- train: training data files are at data/train-*
- validation: validation data files are at data/validation-*
Dataset Size
- Download size: 36828848 bytes
- Total dataset size: 100979336 bytes
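As a quick sanity check on the figures above, the sketch below recomputes the total size from the two split sizes and derives a few summary ratios. All numbers are copied from this card; the function name `summarize_splits` and the keys of the returned dictionary are illustrative choices, not part of any library.

```python
def summarize_splits(splits: dict, download_bytes: int) -> dict:
    """Derive totals and ratios from per-split (num_examples, num_bytes) pairs."""
    total_examples = sum(n for n, _ in splits.values())
    total_bytes = sum(b for _, b in splits.values())
    return {
        "total_examples": total_examples,
        "total_bytes": total_bytes,
        # Share of all examples that fall in the validation split.
        "validation_fraction": splits["validation"][0] / total_examples,
        # Average stored size of one example, in bytes.
        "avg_bytes_per_example": total_bytes / total_examples,
        # Download size relative to the total dataset size.
        "download_ratio": download_bytes / total_bytes,
    }


# Figures taken from the Splits and Dataset Size sections of this card.
splits = {"train": (67541, 95520563), "validation": (3773, 5458773)}
summary = summarize_splits(splits, download_bytes=36828848)

# The two split sizes add up to the reported total dataset size.
assert summary["total_bytes"] == 100979336
print(summary)
```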
Example
```python
from datasets import load_dataset

# Load the training split and print the first example.
dataset = load_dataset("WueNLP/belebele-fleurs-train-val-text", split="train")
print(dataset[0])
```
```
{'dataset': 'sciQ',
 'passage_id': '5635',
 'question_id': '0',
 'passage': 'Mesophiles grow best in moderate temperature, typically between 25°C and 40 °C (77°F and 104°F). Mesophiles are often found living in or on the bodies of humans or other animals. The optimal growth temperature of many pathogenic mesophiles is 37°C (98° F), the normal human body temperature. Mesophilic organisms have important uses in food preparation, including cheese, yogurt, beer and wine.',
 'question': 'What type of organism is commonly used in preparation of foods such as cheese and yogurt?',
 'answer1': 'viruses',
 'answer2': 'mesophilic organisms',
 'answer3': 'gymnosperms',
 'answer4': 'protozoa',
 'correct_answer': 'mesophilic organisms',
 'correct_answer_num': 2,
 'index_level_0': 0}
```
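Since the data is meant for fine-tuning multiple-choice classifiers, a row like the one above usually has to be reshaped into one (context, candidate) pair per option plus a zero-based label. The following is a rough sketch of one such layout, not the authors' preprocessing: the function name `to_multiple_choice` and the output keys `contexts`, `candidates`, and `label` are hypothetical, and joining the question and the option with a space is an assumption.

```python
def to_multiple_choice(row: dict) -> dict:
    """Reshape one dataset row into four (context, candidate) pairs and a label."""
    options = [row[f"answer{i}"] for i in range(1, 5)]
    return {
        # The passage is repeated once per option.
        "contexts": [row["passage"]] * len(options),
        # Each candidate is the question followed by one option (assumed format).
        "candidates": [f"{row['question']} {option}" for option in options],
        # Convert the 1-based correct_answer_num to a 0-based class index.
        "label": row["correct_answer_num"] - 1,
    }


# Shortened copy of the example row; the passage is abbreviated for brevity.
row = {
    "passage": "Mesophilic organisms have important uses in food preparation.",
    "question": "What type of organism is commonly used in preparation of foods?",
    "answer1": "viruses",
    "answer2": "mesophilic organisms",
    "answer3": "gymnosperms",
    "answer4": "protozoa",
    "correct_answer_num": 2,
}

example = to_multiple_choice(row)
print(example["label"], example["candidates"][example["label"]])
```

The paired lists can then be passed to a tokenizer that accepts text pairs, with the four encoded pairs grouped per question.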
Citation
@inproceedings{bandarkar-etal-2024-belebele,
    title = "The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants",
    author = "Bandarkar, Lucas and Liang, Davis and Muller, Benjamin and Artetxe, Mikel and Shukla, Satya Narayan and Husa, Donald and Goyal, Naman and Krishnan, Abhinandan and Zettlemoyer, Luke and Khabsa, Madian",
    booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.acl-long.44",
    pages = "749--775",
}




