Cognitive-Lab/GoogleIndicGenBench_xorqa_in
Hugging Face | Updated 2024-06-03 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/Cognitive-Lab/GoogleIndicGenBench_xorqa_in
Resource description:
---
dataset_info:
- config_name: gu
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 95606
    num_examples: 100
  - name: test
    num_bytes: 500609
    num_examples: 535
  - name: dev
    num_bytes: 463804
    num_examples: 499
  download_size: 584545
  dataset_size: 1060019
- config_name: hi
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 95345
    num_examples: 100
  - name: test
    num_bytes: 505163
    num_examples: 539
  - name: dev
    num_bytes: 465260
    num_examples: 499
  download_size: 587339
  dataset_size: 1065768
- config_name: kn
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 96439
    num_examples: 100
  - name: test
    num_bytes: 506353
    num_examples: 536
  - name: dev
    num_bytes: 470438
    num_examples: 500
  download_size: 591749
  dataset_size: 1073230
- config_name: ml
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 97760
    num_examples: 100
  - name: test
    num_bytes: 516616
    num_examples: 537
  - name: dev
    num_bytes: 478673
    num_examples: 500
  download_size: 601968
  dataset_size: 1093049
- config_name: mr
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 96115
    num_examples: 100
  - name: test
    num_bytes: 505346
    num_examples: 538
  - name: dev
    num_bytes: 467131
    num_examples: 498
  download_size: 592818
  dataset_size: 1068592
- config_name: ta
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 98324
    num_examples: 100
  - name: test
    num_bytes: 520219
    num_examples: 538
  - name: dev
    num_bytes: 481205
    num_examples: 500
  download_size: 594272
  dataset_size: 1099748
- config_name: te
  features:
  - name: translated_answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: context
    dtype: string
  - name: oracle_question
    dtype: string
  - name: split
    dtype: string
  - name: title
    dtype: string
  - name: lang
    dtype: string
  - name: answers
    list:
    - name: answer_start
      dtype: int64
    - name: text
      dtype: string
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 95599
    num_examples: 100
  - name: test
    num_bytes: 499460
    num_examples: 539
  - name: dev
    num_bytes: 469737
    num_examples: 500
  download_size: 585197
  dataset_size: 1064796
configs:
- config_name: gu
  data_files:
  - split: train
    path: gu/train-*
  - split: test
    path: gu/test-*
  - split: dev
    path: gu/dev-*
- config_name: hi
  data_files:
  - split: train
    path: hi/train-*
  - split: test
    path: hi/test-*
  - split: dev
    path: hi/dev-*
- config_name: kn
  data_files:
  - split: train
    path: kn/train-*
  - split: test
    path: kn/test-*
  - split: dev
    path: kn/dev-*
- config_name: ml
  data_files:
  - split: train
    path: ml/train-*
  - split: test
    path: ml/test-*
  - split: dev
    path: ml/dev-*
- config_name: mr
  data_files:
  - split: train
    path: mr/train-*
  - split: test
    path: mr/test-*
  - split: dev
    path: mr/dev-*
- config_name: ta
  data_files:
  - split: train
    path: ta/train-*
  - split: test
    path: ta/test-*
  - split: dev
    path: ta/dev-*
- config_name: te
  data_files:
  - split: train
    path: te/train-*
  - split: test
    path: te/test-*
  - split: dev
    path: te/dev-*
---
This dataset provides seven language configurations: gu, hi, kn, ml, mr, ta, and te. All configurations share the same features: translated_answers, context, oracle_question, split, title, lang, answers, and question. Each configuration is divided into train, test, and dev splits.
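As a rough sketch of how one of these configurations might be loaded, the snippet below wraps `datasets.load_dataset` with a check against the config and split names from the metadata. The helper name `load_xorqa_config` is hypothetical, and the repository id is taken from the download link above. Actually loading data assumes the `datasets` package is installed and the Hub is reachable.

```python
# Config and split names copied from the dataset metadata.
CONFIGS = ("gu", "hi", "kn", "ml", "mr", "ta", "te")
SPLITS = ("train", "test", "dev")
REPO_ID = "Cognitive-Lab/GoogleIndicGenBench_xorqa_in"


def load_xorqa_config(lang: str, split: str = "dev"):
    """Hypothetical helper: load one language config and one split."""
    if lang not in CONFIGS:
        raise ValueError(f"unknown config {lang!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the name checks above work without the package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, lang, split=split)


# Example call (needs network access):
# hi_dev = load_xorqa_config("hi", "dev")
```

Note that the evaluation split is named `dev` here, not `validation`, so the split name has to be passed exactly as listed.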
Provider:
Cognitive-Lab
Summary of original information
Dataset overview
Dataset configurations and features
- Config names: gu, hi, kn, ml, mr, ta, te
- Features:
  - translated_answers (list):
    - answer_start: int64
    - text: string
  - context: string
  - oracle_question: string
  - split: string
  - title: string
  - lang: string
  - answers (list):
    - answer_start: int64
    - text: string
  - question: string
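The feature layout above can be checked against a record with a small validator. This is only a sketch: it assumes each row surfaces as a plain Python dict whose two answer fields are lists of `{answer_start, text}` dicts, and the function name `schema_errors` and every value in the sample record are placeholders invented for illustration, not data from the dataset.

```python
from typing import Any

# Field names copied from the feature list.
STRING_FIELDS = ("context", "oracle_question", "split", "title", "lang", "question")
ANSWER_FIELDS = ("translated_answers", "answers")


def schema_errors(record: dict[str, Any]) -> list[str]:
    """Return schema problems; an empty list means the record fits."""
    errors = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            errors.append(f"{name}: expected string")
    for name in ANSWER_FIELDS:
        entries = record.get(name)
        if not isinstance(entries, list):
            errors.append(f"{name}: expected list")
            continue
        for i, entry in enumerate(entries):
            if not isinstance(entry, dict):
                errors.append(f"{name}[{i}]: expected dict")
                continue
            if not isinstance(entry.get("answer_start"), int):
                errors.append(f"{name}[{i}].answer_start: expected int")
            if not isinstance(entry.get("text"), str):
                errors.append(f"{name}[{i}].text: expected string")
    return errors


# Placeholder record: all values are invented for illustration.
example = {
    "translated_answers": [{"answer_start": 0, "text": "placeholder"}],
    "context": "placeholder context",
    "oracle_question": "placeholder question",
    "split": "dev",
    "title": "placeholder title",
    "lang": "hi",
    "answers": [{"answer_start": 0, "text": "placeholder"}],
    "question": "placeholder question",
}
print(schema_errors(example))  # → []
```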
Dataset splits and sizes
- Splits: train, test, dev
- Sizes:
  - gu:
    - train: 95606 bytes, 100 examples
    - test: 500609 bytes, 535 examples
    - dev: 463804 bytes, 499 examples
    - download_size: 584545 bytes
    - dataset_size: 1060019 bytes
  - hi:
    - train: 95345 bytes, 100 examples
    - test: 505163 bytes, 539 examples
    - dev: 465260 bytes, 499 examples
    - download_size: 587339 bytes
    - dataset_size: 1065768 bytes
  - kn:
    - train: 96439 bytes, 100 examples
    - test: 506353 bytes, 536 examples
    - dev: 470438 bytes, 500 examples
    - download_size: 591749 bytes
    - dataset_size: 1073230 bytes
  - ml:
    - train: 97760 bytes, 100 examples
    - test: 516616 bytes, 537 examples
    - dev: 478673 bytes, 500 examples
    - download_size: 601968 bytes
    - dataset_size: 1093049 bytes
  - mr:
    - train: 96115 bytes, 100 examples
    - test: 505346 bytes, 538 examples
    - dev: 467131 bytes, 498 examples
    - download_size: 592818 bytes
    - dataset_size: 1068592 bytes
  - ta:
    - train: 98324 bytes, 100 examples
    - test: 520219 bytes, 538 examples
    - dev: 481205 bytes, 500 examples
    - download_size: 594272 bytes
    - dataset_size: 1099748 bytes
  - te:
    - train: 95599 bytes, 100 examples
    - test: 499460 bytes, 539 examples
    - dev: 469737 bytes, 500 examples
    - download_size: 585197 bytes
    - dataset_size: 1064796 bytes
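The size figures can be cross-checked with a little arithmetic: for every config, the listed dataset_size equals the sum of the three split byte counts. The sketch below copies the numbers from the listing and verifies that relation; the helper names `total_bytes` and `total_examples` are illustrative only.

```python
# (num_bytes, num_examples) per split, copied from the size listing.
SPLIT_STATS = {
    "gu": {"train": (95606, 100), "test": (500609, 535), "dev": (463804, 499)},
    "hi": {"train": (95345, 100), "test": (505163, 539), "dev": (465260, 499)},
    "kn": {"train": (96439, 100), "test": (506353, 536), "dev": (470438, 500)},
    "ml": {"train": (97760, 100), "test": (516616, 537), "dev": (478673, 500)},
    "mr": {"train": (96115, 100), "test": (505346, 538), "dev": (467131, 498)},
    "ta": {"train": (98324, 100), "test": (520219, 538), "dev": (481205, 500)},
    "te": {"train": (95599, 100), "test": (499460, 539), "dev": (469737, 500)},
}

# dataset_size per config, copied from the same listing.
DATASET_SIZE = {
    "gu": 1060019, "hi": 1065768, "kn": 1073230, "ml": 1093049,
    "mr": 1068592, "ta": 1099748, "te": 1064796,
}


def total_bytes(lang: str) -> int:
    """Sum of num_bytes over the train, test, and dev splits."""
    return sum(num_bytes for num_bytes, _ in SPLIT_STATS[lang].values())


def total_examples(lang: str) -> int:
    """Sum of num_examples over the train, test, and dev splits."""
    return sum(count for _, count in SPLIT_STATS[lang].values())


# Every config's dataset_size matches the sum of its split sizes.
for lang in SPLIT_STATS:
    assert total_bytes(lang) == DATASET_SIZE[lang], lang

print(sum(total_examples(lang) for lang in SPLIT_STATS))  # → 7958
```

Across all seven configs this gives 700 train, 3762 test, and 3496 dev examples.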
Data file paths
- gu:
  - train: gu/train-*
  - test: gu/test-*
  - dev: gu/dev-*
- hi:
  - train: hi/train-*
  - test: hi/test-*
  - dev: hi/dev-*
- kn:
  - train: kn/train-*
  - test: kn/test-*
  - dev: kn/dev-*
- ml:
  - train: ml/train-*
  - test: ml/test-*
  - dev: ml/dev-*
- mr:
  - train: mr/train-*
  - test: mr/test-*
  - dev: mr/dev-*
- ta:
  - train: ta/train-*
  - test: ta/test-*
  - dev: ta/dev-*
- te:
  - train: te/train-*
  - test: te/test-*
  - dev: te/dev-*
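The path entries all follow one `<config>/<split>-*` glob pattern. A minimal sketch of how such patterns map a repository-relative file path back to its config and split is given below, using Python's standard `fnmatch` module. The helper name `classify` is hypothetical, and the shard filename in the example is an assumed name chosen for illustration, not one confirmed by this page.

```python
from fnmatch import fnmatchcase

CONFIGS = ("gu", "hi", "kn", "ml", "mr", "ta", "te")
SPLITS = ("train", "test", "dev")

# Rebuild the listed glob patterns, e.g. PATTERNS["gu"]["train"] == "gu/train-*".
PATTERNS = {
    lang: {split: f"{lang}/{split}-*" for split in SPLITS} for lang in CONFIGS
}


def classify(path: str):
    """Return (config, split) for a matching path, or None if nothing matches."""
    for lang, by_split in PATTERNS.items():
        for split, pattern in by_split.items():
            if fnmatchcase(path, pattern):
                return lang, split
    return None


# The shard filename below is an assumption made for illustration.
print(classify("gu/train-00000-of-00001.parquet"))  # → ('gu', 'train')
print(classify("README.md"))  # → None
```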



