Ayushnangia/autotrain-data-qa_context
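AutoTrain Dataset for project: qa_context
Dataset Description
This dataset has been automatically processed by AutoTrain for the project qa_context.
Languages
The BCP-47 code for the dataset's language is en.
Dataset Structure
Data Instances
A sample from this dataset looks as follows: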
```json
[
  {
    "context": "When Richard was five years old, his mother gave birth to a younger brother, but this brother died at four weeks of age. Four years later, Richard gained a sister, Joan, and the family moved to Far Rockaway, Queens. Though separated by nine years, Joan and Richard were close, as they both shared a natural curiosity about the world. Their mother thought that women did not have the cranial capacity to comprehend such things. Despite their mothers disapproval of Joans desire to study astronomy, Richard encouraged his sister to explore the universe. Joan eventually became an astrophysicist specializing in interactions between the Earth and the solar wind.",
    "question": "Who was the one that pushed Richard to explore the universe?",
    "answers.text": [
      ""
    ],
    "answers.answer_start": [
      -1
    ],
    "feat_id": [
      "5a8dc945df8bba001a0f9c1c"
    ],
    "feat_title": [
      "Richard_Feynman"
    ]
  },
  {
    "context": "Until the 16th century, the Low Countries \u2013 corresponding roughly to the present-day Netherlands, Belgium, and Luxembourg \u2013 consisted of a number of duchies, counties, and Prince-bishoprics, almost all of which were under the supremacy of the Holy Roman Empire, with the exception of the county of Flanders, which was under the Kingdom of France.",
    "question": "What three countries were under the Kingdom of France?",
    "answers.text": [
      ""
    ],
    "answers.answer_start": [
      -1
    ],
    "feat_id": [
      "5a1c8751b4fb5d001871465e"
    ],
    "feat_title": [
      "Dutch_Republic"
    ]
  }
]
```
Dataset Fields
The dataset has the following fields (also called "features"):
```json
{
  "context": "Value(dtype=string, id=None)",
  "question": "Value(dtype=string, id=None)",
  "answers.text": "Sequence(feature=Value(dtype=string, id=None), length=-1, id=None)",
  "answers.answer_start": "Sequence(feature=Value(dtype=int32, id=None), length=-1, id=None)",
  "feat_id": "Sequence(feature=Value(dtype=string, id=None), length=-1, id=None)",
  "feat_title": "Sequence(feature=Value(dtype=string, id=None), length=-1, id=None)"
}
```
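The field layout can be illustrated with a small helper that reads the answer out of one record. This is a minimal sketch in plain Python, not part of the dataset or of AutoTrain. It rests on two assumptions that the card does not state: that `answers.answer_start` is a character offset into `context`, and that an empty answer text paired with `-1` (as in both samples shown) means the question has no answer in the context. The function name `extract_answer` and the `answerable` record are hypothetical and exist only for illustration.

```python
from typing import Optional


def extract_answer(record: dict) -> Optional[str]:
    """Return the first usable answer in a record, or None if there is none."""
    texts = record["answers.text"]
    starts = record["answers.answer_start"]
    for text, start in zip(texts, starts):
        # Assumption: an empty text or a negative offset marks "no answer".
        if start < 0 or not text:
            continue
        # Assumption: the offset is a character index into the context.
        end = start + len(text)
        if record["context"][start:end] != text:
            raise ValueError("answer_start does not point at the answer text")
        return text
    return None


# Record shaped like the samples in this card (context shortened).
unanswerable = {
    "context": "Richard encouraged his sister to explore the universe.",
    "question": "Who was the one that pushed Richard to explore the universe?",
    "answers.text": [""],
    "answers.answer_start": [-1],
}

# Hypothetical record, made up here to show the answerable case.
answerable_context = "Joan eventually became an astrophysicist."
answerable = {
    "context": answerable_context,
    "question": "What did Joan become?",
    "answers.text": ["astrophysicist"],
    "answers.answer_start": [answerable_context.index("astrophysicist")],
}

print(extract_answer(unanswerable))
print(extract_answer(answerable))
```

Checking the slice against the answer text is a cheap guard: if the offset assumption is wrong for this dataset, the helper fails loudly instead of returning a misaligned span.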
Dataset Splits
This dataset is split into a train and a validation split. The split sizes are as follows:
| Split name | Num samples |
|---|---|
| train | 104204 |
| valid | 26051 |
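The proportions implied by the table can be checked with a few lines of arithmetic. The sketch below uses only the two counts listed above; the variable names are illustrative.

```python
# Split sizes copied from the table above.
split_sizes = {"train": 104204, "valid": 26051}

total = sum(split_sizes.values())
split_fractions = {name: count / total for name, count in split_sizes.items()}

print(f"total: {total}")
for name, fraction in split_fractions.items():
    print(f"{name}: {fraction:.1%}")
```

With these counts the total is 130255 samples, and the split comes out to 80% train and 20% validation.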



