five

Ethiopian Family Code QA Dataset

收藏
NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://data.mendeley.com/datasets/hj8m6mff8c
下载链接
链接失效反馈
官方服务:
资源简介:
This dataset contains collection of question-and-answer pairs that have been collected in two ways. The first method is to manually extract the question-and-answer pair from the revised family code of Ethiopia. The data generation process involves a review of each article of the family code of Ethiopia and generating questions and their answer for those question from the article they were extracted from. After the extraction each question-and-answer pair each of them was reviewed by people with domain knowledge to ensure the accuracy of them. Moreover, there was a second-round review to ensure the meaning accuracy of each pair. This dataset is created to fine tune llama-2 model to create a model that will be able to answer questions related with the revised family code of Ethiopia without utilization of any back and forth translation. The second method to generate the question answer pair is to use ChatGPT to generate the pairs. The English version of the Family Code articles was given to ChatGPT as an input, which in turn generated relevant questions and answers based on the content of the family code. These question-answer pairs were then translated from English to the Amharic using Google Translate, the translated dataset was manually reviewed by Amharic speaking team members to validate the quality of the translations and correct mistakes that were made during the translation process.
创建时间:
2024-09-20
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作