hishab/boolq_bn
Source: Hugging Face. Updated: 2025-06-20. Indexed: 2025-04-08.
Download link:
https://hf-mirror.com/datasets/hishab/boolq_bn
Description:
BoolQ Bangla (BN) is a question-answering dataset of yes/no questions, generated using GPT-4. It contains 15,942 examples, each a triplet: (question, passage, answer). The questions are described as naturally occurring, generated in unprompted and unconstrained settings. Input passages were sourced from Bangla Wikipedia, Banglapedia, and news articles, and GPT-4 was used to generate the corresponding yes/no questions and answers. A human annotator manually reviewed random samples of the dataset and found an error rate of 1.33%; the errors found were subsequently corrected. Despite these efforts, a small portion of errors may remain in the dataset.
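As a rough illustration of the (question, passage, answer) triplet, the sketch below shows one way the records might be read into typed objects. The field names `question`, `passage`, and `answer`, the `train` split name, and the accepted yes/no label spellings are assumptions taken from the description above, not confirmed details of the published schema. `BoolQExample`, `parse_answer`, `to_example`, and `load_examples` are hypothetical helper names. Loading relies on the third-party `datasets` package and network access.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping


@dataclass(frozen=True)
class BoolQExample:
    """One (question, passage, answer) triplet."""
    question: str
    passage: str
    answer: bool


def parse_answer(raw: Any) -> bool:
    """Normalize a yes/no label to a bool.

    The stored label format is an assumption; adjust the accepted
    spellings after inspecting the actual data.
    """
    if isinstance(raw, bool):
        return raw
    text = str(raw).strip().lower()
    if text in {"yes", "true", "1"}:
        return True
    if text in {"no", "false", "0"}:
        return False
    raise ValueError(f"unrecognized yes/no label: {raw!r}")


def to_example(row: Mapping[str, Any]) -> BoolQExample:
    # Field names are assumed from the dataset description.
    return BoolQExample(
        question=str(row["question"]),
        passage=str(row["passage"]),
        answer=parse_answer(row["answer"]),
    )


def load_examples(split: str = "train") -> List[BoolQExample]:
    # Imported lazily so the helpers above work without the package installed.
    # The split name is an assumption; check the dataset page for real splits.
    from datasets import load_dataset

    ds = load_dataset("hishab/boolq_bn", split=split)
    return [to_example(row) for row in ds]
```

Calling `load_examples()` would download the data and return a list of `BoolQExample` objects. `parse_answer` raises on labels it does not recognize, so a schema mismatch shows up early instead of being silently coerced.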
Provider:
hishab



