OALL/AlGhafa-Arabic-LLM-Benchmark-Translated
Source: Hugging Face. Updated 2024-03-31; indexed 2024-06-22.
Download link:
https://hf-mirror.com/datasets/OALL/AlGhafa-Arabic-LLM-Benchmark-Translated
Resource description:
---
dataset_info:
- config_name: arc_challenge_okapi_ar
  features:
  - name: query
    dtype: string
  - name: sol1
    dtype: string
  - name: sol2
    dtype: string
  - name: sol3
    dtype: string
  - name: sol4
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 478407
    num_examples: 1160
  - name: validation
    num_bytes: 1780
    num_examples: 5
  download_size: 263684
  dataset_size: 480187
- config_name: arc_easy_ar
  features:
  - name: query
    dtype: string
  - name: sol1
    dtype: string
  - name: sol2
    dtype: string
  - name: sol3
    dtype: string
  - name: sol4
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 832686
    num_examples: 2364
  - name: validation
    num_bytes: 1712
    num_examples: 5
  download_size: 443177
  dataset_size: 834398
- config_name: boolq_ar
  features:
  - name: question
    dtype: string
  - name: passage
    dtype: string
  - name: answer
    dtype: bool
  splits:
  - name: test
    num_bytes: 3102514
    num_examples: 3260
  - name: validation
    num_bytes: 3499
    num_examples: 5
  download_size: 1581745
  dataset_size: 3106013
- config_name: copa_ext_ar
  features:
  - name: premise
    dtype: string
  - name: choice1
    dtype: string
  - name: choice2
    dtype: string
  - name: question
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 14534
    num_examples: 90
  - name: validation
    num_bytes: 828
    num_examples: 5
  download_size: 15714
  dataset_size: 15362
- config_name: hellaswag_okapi_ar
  features:
  - name: ind
    dtype: int64
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    dtype: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 15045582
    num_examples: 9171
  - name: validation
    num_bytes: 8730
    num_examples: 5
  download_size: 7411269
  dataset_size: 15054312
- config_name: mmlu_okapi_ar
  features:
  - name: query
    dtype: string
  - name: sol1
    dtype: string
  - name: sol2
    dtype: string
  - name: sol3
    dtype: string
  - name: sol4
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 7847650
    num_examples: 12923
  - name: validation
    num_bytes: 3506
    num_examples: 5
  download_size: 4233486
  dataset_size: 7851156
- config_name: openbook_qa_ext_ar
  features:
  - name: query
    dtype: string
  - name: sol1
    dtype: string
  - name: sol2
    dtype: string
  - name: sol3
    dtype: string
  - name: sol4
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 111600
    num_examples: 495
  - name: validation
    num_bytes: 1442
    num_examples: 5
  download_size: 71738
  dataset_size: 113042
- config_name: piqa_ar
  features:
  - name: query
    dtype: string
  - name: sol1
    dtype: string
  - name: sol2
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 717917
    num_examples: 1833
  - name: validation
    num_bytes: 1367
    num_examples: 5
  download_size: 383879
  dataset_size: 719284
- config_name: race_ar
  features:
  - name: query
    dtype: string
  - name: sol1
    dtype: string
  - name: sol2
    dtype: string
  - name: sol3
    dtype: string
  - name: sol4
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: test
    num_bytes: 13500405
    num_examples: 4929
  - name: validation
    num_bytes: 13808
    num_examples: 5
  download_size: 3426208
  dataset_size: 13514213
- config_name: sciq_ar
  features:
  - name: question
    dtype: string
  - name: distractor3
    dtype: string
  - name: distractor1
    dtype: string
  - name: distractor2
    dtype: string
  - name: correct_answer
    dtype: string
  - name: support
    dtype: string
  splits:
  - name: test
    num_bytes: 880972
    num_examples: 995
  - name: validation
    num_bytes: 4764
    num_examples: 5
  download_size: 439660
  dataset_size: 885736
- config_name: toxigen_ar
  features:
  - name: text
    dtype: string
  - name: target_group
    dtype: string
  - name: factual?
    dtype: string
  - name: ingroup_effect
    dtype: string
  - name: lewd
    dtype: string
  - name: framing
    dtype: string
  - name: predicted_group
    dtype: string
  - name: stereotyping
    dtype: string
  - name: intent
    dtype: float64
  - name: toxicity_ai
    dtype: float64
  - name: toxicity_human
    dtype: float64
  - name: predicted_author
    dtype: string
  - name: actual_method
    dtype: string
  splits:
  - name: test
    num_bytes: 540217
    num_examples: 935
  - name: validation
    num_bytes: 3029
    num_examples: 5
  download_size: 109449
  dataset_size: 543246
configs:
- config_name: arc_challenge_okapi_ar
  data_files:
  - split: test
    path: arc_challenge_okapi_ar/test-*
  - split: validation
    path: arc_challenge_okapi_ar/validation-*
- config_name: arc_easy_ar
  data_files:
  - split: test
    path: arc_easy_ar/test-*
  - split: validation
    path: arc_easy_ar/validation-*
- config_name: boolq_ar
  data_files:
  - split: test
    path: boolq_ar/test-*
  - split: validation
    path: boolq_ar/validation-*
- config_name: copa_ext_ar
  data_files:
  - split: test
    path: copa_ext_ar/test-*
  - split: validation
    path: copa_ext_ar/validation-*
- config_name: hellaswag_okapi_ar
  data_files:
  - split: test
    path: hellaswag_okapi_ar/test-*
  - split: validation
    path: hellaswag_okapi_ar/validation-*
- config_name: mmlu_okapi_ar
  data_files:
  - split: test
    path: mmlu_okapi_ar/test-*
  - split: validation
    path: mmlu_okapi_ar/validation-*
- config_name: openbook_qa_ext_ar
  data_files:
  - split: test
    path: openbook_qa_ext_ar/test-*
  - split: validation
    path: openbook_qa_ext_ar/validation-*
- config_name: piqa_ar
  data_files:
  - split: test
    path: piqa_ar/test-*
  - split: validation
    path: piqa_ar/validation-*
- config_name: race_ar
  data_files:
  - split: test
    path: race_ar/test-*
  - split: validation
    path: race_ar/validation-*
- config_name: sciq_ar
  data_files:
  - split: test
    path: sciq_ar/test-*
  - split: validation
    path: sciq_ar/validation-*
- config_name: toxigen_ar
  data_files:
  - split: test
    path: toxigen_ar/test-*
  - split: validation
    path: toxigen_ar/validation-*
---
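The byte counts in the metadata above can be cross-checked: for each configuration, `dataset_size` should equal the sum of the `num_bytes` values of its `test` and `validation` splits. The following is a minimal Python sketch of that check. The numbers are copied from the card; the tuple layout and the names `CARD_SIZES` and `inconsistent_configs` are illustrative assumptions, not part of the dataset or of any Hugging Face tooling.

```python
# Each entry: (test num_bytes, validation num_bytes, dataset_size),
# copied from the dataset_info block. The layout is an assumption
# made for this sketch only.
CARD_SIZES = {
    "arc_challenge_okapi_ar": (478407, 1780, 480187),
    "arc_easy_ar": (832686, 1712, 834398),
    "boolq_ar": (3102514, 3499, 3106013),
    "copa_ext_ar": (14534, 828, 15362),
    "hellaswag_okapi_ar": (15045582, 8730, 15054312),
    "mmlu_okapi_ar": (7847650, 3506, 7851156),
    "openbook_qa_ext_ar": (111600, 1442, 113042),
    "piqa_ar": (717917, 1367, 719284),
    "race_ar": (13500405, 13808, 13514213),
    "sciq_ar": (880972, 4764, 885736),
    "toxigen_ar": (540217, 3029, 543246),
}


def inconsistent_configs(sizes):
    """Return config names whose split sizes do not sum to dataset_size."""
    return [
        name
        for name, (test_bytes, validation_bytes, total) in sizes.items()
        if test_bytes + validation_bytes != total
    ]


if __name__ == "__main__":
    # An empty list means every config's figures add up.
    print(inconsistent_configs(CARD_SIZES))
```

Note that `download_size` is a separate figure and is not covered by this check; for `copa_ext_ar` it is larger than `dataset_size`.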
Provider:
OALL
Summary of original information
Dataset overview
Dataset configurations
arc_challenge_okapi_ar
- Features:
  query: string, sol1: string, sol2: string, sol3: string, sol4: string, label: int64
- Splits:
  test: 478407 bytes, 1160 examples; validation: 1780 bytes, 5 examples
- Download size: 263684 bytes
- Dataset size: 480187 bytes
arc_easy_ar
- Features:
  query: string, sol1: string, sol2: string, sol3: string, sol4: string, label: int64
- Splits:
  test: 832686 bytes, 2364 examples; validation: 1712 bytes, 5 examples
- Download size: 443177 bytes
- Dataset size: 834398 bytes
boolq_ar
- Features:
  question: string, passage: string, answer: bool
- Splits:
  test: 3102514 bytes, 3260 examples; validation: 3499 bytes, 5 examples
- Download size: 1581745 bytes
- Dataset size: 3106013 bytes
copa_ext_ar
- Features:
  premise: string, choice1: string, choice2: string, question: string, label: int64
- Splits:
  test: 14534 bytes, 90 examples; validation: 828 bytes, 5 examples
- Download size: 15714 bytes
- Dataset size: 15362 bytes
hellaswag_okapi_ar
- Features:
  ind: int64, activity_label: string, ctx_a: string, ctx_b: string, ctx: string, endings: string, source_id: string, split: string, split_type: string, label: int64
- Splits:
  test: 15045582 bytes, 9171 examples; validation: 8730 bytes, 5 examples
- Download size: 7411269 bytes
- Dataset size: 15054312 bytes
mmlu_okapi_ar
- Features:
  query: string, sol1: string, sol2: string, sol3: string, sol4: string, label: int64
- Splits:
  test: 7847650 bytes, 12923 examples; validation: 3506 bytes, 5 examples
- Download size: 4233486 bytes
- Dataset size: 7851156 bytes
openbook_qa_ext_ar
- Features:
  query: string, sol1: string, sol2: string, sol3: string, sol4: string, label: int64
- Splits:
  test: 111600 bytes, 495 examples; validation: 1442 bytes, 5 examples
- Download size: 71738 bytes
- Dataset size: 113042 bytes
piqa_ar
- Features:
  query: string, sol1: string, sol2: string, label: int64
- Splits:
  test: 717917 bytes, 1833 examples; validation: 1367 bytes, 5 examples
- Download size: 383879 bytes
- Dataset size: 719284 bytes
race_ar
- Features:
  query: string, sol1: string, sol2: string, sol3: string, sol4: string, label: int64
- Splits:
  test: 13500405 bytes, 4929 examples; validation: 13808 bytes, 5 examples
- Download size: 3426208 bytes
- Dataset size: 13514213 bytes
sciq_ar
- Features:
  question: string, distractor3: string, distractor1: string, distractor2: string, correct_answer: string, support: string
- Splits:
  test: 880972 bytes, 995 examples; validation: 4764 bytes, 5 examples
- Download size: 439660 bytes
- Dataset size: 885736 bytes
toxigen_ar
- Features:
  text: string, target_group: string, factual?: string, ingroup_effect: string, lewd: string, framing: string, predicted_group: string, stereotyping: string, intent: float64, toxicity_ai: float64, toxicity_human: float64, predicted_author: string, actual_method: string
- Splits:
  test: 540217 bytes, 935 examples; validation: 3029 bytes, 5 examples
- Download size: 109449 bytes
- Dataset size: 543246 bytes
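
Several configurations listed above (`arc_challenge_okapi_ar`, `arc_easy_ar`, `mmlu_okapi_ar`, `openbook_qa_ext_ar`, `piqa_ar`, `race_ar`) share the `query` / `sol1`..`solN` / `label` layout. The following is a minimal Python sketch of how one such configuration might be loaded with the Hugging Face `datasets` library and a row rendered as a numbered prompt. The helper names `load_split` and `format_multiple_choice` are hypothetical, and the sample row in the usage note is invented for illustration. The card does not say whether `label` is 0-based or 1-based, so the sketch leaves `label` uninterpreted.

```python
REPO_ID = "OALL/AlGhafa-Arabic-LLM-Benchmark-Translated"


def load_split(config_name, split="test"):
    """Load one split of one configuration (requires network access)."""
    # Imported lazily so the formatting helper below works even when
    # the `datasets` library is not installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)


def format_multiple_choice(row):
    """Render a row with `query` and `sol1`..`solN` columns as a prompt.

    Works for both the 2-option (piqa_ar) and 4-option layouts, because
    it collects `sol` columns until the next index is missing.
    """
    options = []
    index = 1
    while f"sol{index}" in row:
        options.append(row[f"sol{index}"])
        index += 1
    lines = [row["query"]]
    lines += [f"{n}. {text}" for n, text in enumerate(options, start=1)]
    return "\n".join(lines)
```

For example, `format_multiple_choice({"query": "Q?", "sol1": "a", "sol2": "b", "label": 0})` produces the three lines `Q?`, `1. a`, and `2. b`. With network access, `load_split("piqa_ar", "validation")` would be expected to return the 5-example validation split, whose rows can be passed to the same helper.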



