bdebayan/autotrain-data-demoqa2
Source: Hugging Face. Updated 2023-07-14; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/bdebayan/autotrain-data-demoqa2
Resource description:
---
{}
---
# AutoTrain Dataset for project: demoqa2
## Dataset Description
This dataset has been automatically processed by AutoTrain for project demoqa2.
### Languages
The BCP-47 code for the dataset's language is unk.
## Dataset Structure
### Data Instances
A sample from this dataset looks as follows:
```json
[
{
"context": "After completion of two years from the date of registration, the candidate may be\nconverted from JRF to SRF subject to the fulfillment of the following criteria.\n1. The candidate has to apply in the Academic section for SRF through Supervisor(s)\nand HoD.\n2. Minimum of one SSCI/SCOPUS/SCI/SCIE indexed Journal publication/accepted or\none granted Patent is required.\n3. The candidate has to appear for a progress seminar. For that, a committee has to be\nformed with an external member by the Supervisor(s) through the respective HoD for\nevaluation of the progress of the Junior Research Fellow (JRF) for the last two years.",
"question": "What are the criteria's for converting SRF to JRF?",
"answers.text": [
"1. The candidate has to apply in the Academic section for SRF through Supervisor(s)\nand HoD.\n2. Minimum of one SSCI/SCOPUS/SCI/SCIE indexed Journal publication/accepted or\none granted Patent is required.\n3. The candidate has to appear for a progress seminar. For that, a committee has to be\nformed with an external member by the Supervisor(s) through the respective HoD for\nevaluation of the progress of the Junior Research Fellow (JRF) for the last two years."
],
"answers.answer_start": [
162
],
"feat_answer_id": [
950597
],
"feat_document_id": [
1582265
],
"feat_question_id": [
1063809
],
"feat_answer_end": [
622
],
"feat_answer_category": [
NaN
],
"feat_file_name": [
NaN
]
},
{
"context": "AC/DRC. The chairperson, RAC should conduct the\nexamination (written and or oral) within the first 15 (fifteen) months from the date of\nregistration of PS. The syllabus of the comprehensive examination is based on any three\ncourses recommended by the RAC/DRC. If PS fails the exam on the first try, he or she\nmay be allowed another chance within two months. PS's registration with the Institute\nwill be terminated if he/she fails the exam on his/her second attempt as well.",
"question": "What happen if anyone fail in comprehensive exam in first time?",
"answers.text": [
"e or she\nmay be allowed another chance within two months."
],
"answers.answer_start": [
408
],
"feat_answer_id": [
950679
],
"feat_document_id": [
1582276
],
"feat_question_id": [
1063857
],
"feat_answer_end": [
465
],
"feat_answer_category": [
NaN
],
"feat_file_name": [
NaN
]
}
]
```
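The `answers.answer_start` and `feat_answer_end` values appear to be character offsets into `context` (an assumption; the card does not say so explicitly). In the first sample, offset 162 lands on the start of "1. The candidate has to apply". A minimal sketch of a consistency check follows; the helper name `answer_spans_match` and the toy record are hypothetical, not part of the dataset.

```python
from typing import Any


def answer_spans_match(record: dict[str, Any]) -> list[bool]:
    """For each answer, check whether context[start:end] equals the answer text.

    Assumes answer_start / feat_answer_end are 0-based character offsets
    with an exclusive end, which the dataset card does not confirm.
    """
    context = record["context"]
    return [
        context[start:end] == text
        for text, start, end in zip(
            record["answers.text"],
            record["answers.answer_start"],
            record["feat_answer_end"],
        )
    ]


# Hypothetical toy record using the same flattened field names as the sample above.
toy = {
    "context": "PS may retake the exam within two months.",
    "answers.text": ["within two months"],
    "answers.answer_start": [23],
    "feat_answer_end": [40],
}
print(answer_spans_match(toy))
```

A check like this may report mismatches on real records: the second sample's displayed context seems to begin mid-sentence, so its offsets may refer to a longer source passage.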
### Dataset Fields
The dataset has the following fields (also called "features"):
```json
{
"context": "Value(dtype='string', id=None)",
"question": "Value(dtype='string', id=None)",
"answers.text": "Sequence(feature=Value(dtype='string', id=None), length=-1, id=None)",
"answers.answer_start": "Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None)",
"feat_answer_id": "Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None)",
"feat_document_id": "Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None)",
"feat_question_id": "Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None)",
"feat_answer_end": "Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None)",
"feat_answer_category": "Sequence(feature=Value(dtype='float64', id=None), length=-1, id=None)",
"feat_file_name": "Sequence(feature=Value(dtype='float64', id=None), length=-1, id=None)"
}
```
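The feature listing maps onto plain Python types: two strings and eight variable-length lists. The sketch below validates a parsed record against that shape. The names `SEQUENCE_FIELDS`, `STRING_FIELDS`, and `validate_record` are hypothetical, and the record is made up for illustration. Note that `feat_answer_category` and `feat_file_name` are typed `float64` and hold `NaN` in the samples, presumably because those columns were empty.

```python
import json
import math

# Expected element type per list-valued field, mirroring the feature listing above.
SEQUENCE_FIELDS = {
    "answers.text": str,
    "answers.answer_start": int,
    "feat_answer_id": int,
    "feat_document_id": int,
    "feat_question_id": int,
    "feat_answer_end": int,
    "feat_answer_category": float,
    "feat_file_name": float,
}
STRING_FIELDS = ("context", "question")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            problems.append(f"{name}: expected a string")
    for name, elem_type in SEQUENCE_FIELDS.items():
        values = record.get(name)
        if not isinstance(values, list):
            problems.append(f"{name}: expected a list")
            continue
        for v in values:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(v, bool) or not isinstance(v, elem_type):
                problems.append(f"{name}: unexpected element {v!r}")
    return problems


# Made-up record; Python's json module accepts the non-standard NaN literal by default.
raw = (
    '{"context": "c", "question": "q", "answers.text": ["a"], '
    '"answers.answer_start": [0], "feat_answer_id": [1], '
    '"feat_document_id": [2], "feat_question_id": [3], '
    '"feat_answer_end": [1], "feat_answer_category": [NaN], '
    '"feat_file_name": [NaN]}'
)
record = json.loads(raw)
print(validate_record(record))
print(math.isnan(record["feat_file_name"][0]))
```

Strict JSON parsers in other languages typically reject `NaN`, so the sample above may need preprocessing outside Python.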
### Dataset Splits
This dataset has a train split and a validation split. The split sizes are as follows:
| Split name | Num samples |
| ------------ | ------------------- |
| train | 43 |
| valid | 11 |
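
As a quick illustration of the proportions in the table, the snippet below computes each split's share of the 54 samples. The variable names are illustrative only.

```python
# Split sizes copied from the table above.
split_sizes = {"train": 43, "valid": 11}

total = sum(split_sizes.values())
fractions = {name: n / total for name, n in split_sizes.items()}

for name, frac in fractions.items():
    print(f"{name}: {split_sizes[name]} samples ({frac:.1%})")
# train: 43 samples (79.6%)
# valid: 11 samples (20.4%)
```

This is roughly an 80/20 split.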
Provider:
bdebayan