
OALL/AlGhafa-Arabic-LLM-Benchmark-Translated

Source: Hugging Face. Updated 2024-03-31; indexed 2024-06-22.
Download link:
https://hf-mirror.com/datasets/OALL/AlGhafa-Arabic-LLM-Benchmark-Translated
Resource description:
The dataset card metadata defines 11 configurations: arc_challenge_okapi_ar, arc_easy_ar, boolq_ar, copa_ext_ar, hellaswag_okapi_ar, mmlu_okapi_ar, openbook_qa_ext_ar, piqa_ar, race_ar, sciq_ar, and toxigen_ar. Each configuration has a test split and a 5-example validation split, with data files matched by the patterns <config_name>/test-* and <config_name>/validation-*. The features, split sizes, download size, and dataset size of every configuration are itemized in the summary that follows.
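As an illustration of how the configurations described above might be loaded, here is a minimal sketch. It assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the ID shown in the title; the helper name `load_config` and the `TEST_EXAMPLES` table are hypothetical, with the counts copied from the card metadata.

```python
# Test-split example counts, copied from the dataset card metadata.
TEST_EXAMPLES = {
    "arc_challenge_okapi_ar": 1160,
    "arc_easy_ar": 2364,
    "boolq_ar": 3260,
    "copa_ext_ar": 90,
    "hellaswag_okapi_ar": 9171,
    "mmlu_okapi_ar": 12923,
    "openbook_qa_ext_ar": 495,
    "piqa_ar": 1833,
    "race_ar": 4929,
    "sciq_ar": 995,
    "toxigen_ar": 935,
}

REPO_ID = "OALL/AlGhafa-Arabic-LLM-Benchmark-Translated"


def load_config(config_name, split="test"):
    """Load one configuration (hypothetical helper, not part of the dataset).

    Names are validated before any network access happens.
    """
    if config_name not in TEST_EXAMPLES:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in ("test", "validation"):
        raise ValueError(f"unknown split: {split!r}")
    # Lazy import so the validation above works without the library installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config_name, split=split)
```

Calling `load_config("arc_easy_ar")` should, if the card metadata is accurate, return a dataset with 2364 rows; `load_config("arc_easy_ar", split="validation")` should return 5.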
Provider:
OALL
Summary of original information

Dataset overview

Dataset configurations

arc_challenge_okapi_ar

  • Features:
    • query: string
    • sol1: string
    • sol2: string
    • sol3: string
    • sol4: string
    • label: int64
  • Splits:
    • test: 478407 bytes, 1160 examples
    • validation: 1780 bytes, 5 examples
  • Download size: 263684 bytes
  • Dataset size: 480187 bytes
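Several configurations share this query / sol1..solN / label layout. The sketch below shows one way to turn such a row into a list of answer choices and pick the gold answer. The helper names and the sample row are hypothetical, and whether `label` is 0-based or 1-based is not stated in the source, so the base is left as a parameter (0 is only an assumed default).

```python
def extract_choices(row):
    """Collect sol1, sol2, ... in numeric order, stopping at the first missing key."""
    choices = []
    i = 1
    while f"sol{i}" in row:
        choices.append(row[f"sol{i}"])
        i += 1
    return choices


def gold_answer(row, label_base=0):
    """Return the choice selected by `label`.

    `label_base` is an assumption: pass 1 if labels turn out to be 1-based.
    """
    choices = extract_choices(row)
    index = int(row["label"]) - label_base
    if not 0 <= index < len(choices):
        raise IndexError(f"label {row['label']} out of range for {len(choices)} choices")
    return choices[index]


# Hypothetical row with placeholder text, shaped like the features listed above.
sample = {"query": "q", "sol1": "a", "sol2": "b", "sol3": "c", "sol4": "d", "label": 2}
print(extract_choices(sample))
print(gold_answer(sample))
```

Because `extract_choices` stops at the first missing key, the same helper also covers two-choice rows such as those in piqa_ar.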

arc_easy_ar

  • Features:
    • query: string
    • sol1: string
    • sol2: string
    • sol3: string
    • sol4: string
    • label: int64
  • Splits:
    • test: 832686 bytes, 2364 examples
    • validation: 1712 bytes, 5 examples
  • Download size: 443177 bytes
  • Dataset size: 834398 bytes

boolq_ar

  • Features:
    • question: string
    • passage: string
    • answer: bool
  • Splits:
    • test: 3102514 bytes, 3260 examples
    • validation: 3499 bytes, 5 examples
  • Download size: 1581745 bytes
  • Dataset size: 3106013 bytes

copa_ext_ar

  • Features:
    • premise: string
    • choice1: string
    • choice2: string
    • question: string
    • label: int64
  • Splits:
    • test: 14534 bytes, 90 examples
    • validation: 828 bytes, 5 examples
  • Download size: 15714 bytes
  • Dataset size: 15362 bytes

hellaswag_okapi_ar

  • Features:
    • ind: int64
    • activity_label: string
    • ctx_a: string
    • ctx_b: string
    • ctx: string
    • endings: string
    • source_id: string
    • split: string
    • split_type: string
    • label: int64
  • Splits:
    • test: 15045582 bytes, 9171 examples
    • validation: 8730 bytes, 5 examples
  • Download size: 7411269 bytes
  • Dataset size: 15054312 bytes

mmlu_okapi_ar

  • Features:
    • query: string
    • sol1: string
    • sol2: string
    • sol3: string
    • sol4: string
    • label: int64
  • Splits:
    • test: 7847650 bytes, 12923 examples
    • validation: 3506 bytes, 5 examples
  • Download size: 4233486 bytes
  • Dataset size: 7851156 bytes

openbook_qa_ext_ar

  • Features:
    • query: string
    • sol1: string
    • sol2: string
    • sol3: string
    • sol4: string
    • label: int64
  • Splits:
    • test: 111600 bytes, 495 examples
    • validation: 1442 bytes, 5 examples
  • Download size: 71738 bytes
  • Dataset size: 113042 bytes

piqa_ar

  • Features:
    • query: string
    • sol1: string
    • sol2: string
    • label: int64
  • Splits:
    • test: 717917 bytes, 1833 examples
    • validation: 1367 bytes, 5 examples
  • Download size: 383879 bytes
  • Dataset size: 719284 bytes

race_ar

  • Features:
    • query: string
    • sol1: string
    • sol2: string
    • sol3: string
    • sol4: string
    • label: int64
  • Splits:
    • test: 13500405 bytes, 4929 examples
    • validation: 13808 bytes, 5 examples
  • Download size: 3426208 bytes
  • Dataset size: 13514213 bytes

sciq_ar

  • Features:
    • question: string
    • distractor3: string
    • distractor1: string
    • distractor2: string
    • correct_answer: string
    • support: string
  • Splits:
    • test: 880972 bytes, 995 examples
    • validation: 4764 bytes, 5 examples
  • Download size: 439660 bytes
  • Dataset size: 885736 bytes
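Unlike the sol1..solN configurations, sciq_ar stores the correct answer and the distractors in separate fields with no label column. A minimal sketch of one way to assemble them into a shuffled option list follows; the helper name `build_options`, the seed handling, and the sample row are assumptions for illustration, not something the dataset prescribes.

```python
import random


def build_options(row, seed=0):
    """Return (options, gold_index) for a sciq_ar-shaped row.

    The options are shuffled with a seeded generator so the correct answer
    does not always sit in the same position, while staying reproducible.
    """
    options = [
        row["correct_answer"],
        row["distractor1"],
        row["distractor2"],
        row["distractor3"],
    ]
    rng = random.Random(seed)
    rng.shuffle(options)
    return options, options.index(row["correct_answer"])


# Hypothetical row with placeholder text, shaped like the features listed above.
sample = {
    "question": "q",
    "distractor3": "d3",
    "distractor1": "d1",
    "distractor2": "d2",
    "correct_answer": "right",
    "support": "s",
}
options, gold = build_options(sample, seed=42)
print(options, gold)
```

Deriving the seed from a stable per-row value instead of a constant would vary the gold position across rows while keeping runs repeatable.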

toxigen_ar

  • Features:
    • text: string
    • target_group: string
    • factual?: string
    • ingroup_effect: string
    • lewd: string
    • framing: string
    • predicted_group: string
    • stereotyping: string
    • intent: float64
    • toxicity_ai: float64
    • toxicity_human: float64
    • predicted_author: string
    • actual_method: string
  • Splits:
    • test: 540217 bytes, 935 examples
    • validation: 3029 bytes, 5 examples
  • Download size: 109449 bytes
  • Dataset size: 543246 bytes