
nthakur/miracl-raft-eval-instruct

Source: Hugging Face | Updated: 2024-04-19 | Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/nthakur/miracl-raft-eval-instruct
Resource description:
---
dataset_info:
- config_name: ar
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 21119270
    num_examples: 2896
  - name: dev.small
    num_bytes: 729256.5607734807
    num_examples: 100
  download_size: 10414806
  dataset_size: 21848526.56077348
- config_name: bn
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 4350591
    num_examples: 411
  - name: dev.small
    num_bytes: 1058537.9562043797
    num_examples: 100
  download_size: 1988731
  dataset_size: 5409128.95620438
- config_name: de
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev.small
    num_bytes: 455798.0327868852
    num_examples: 100
  - name: dev
    num_bytes: 1390184
    num_examples: 305
  download_size: 1048951
  dataset_size: 1845982.0327868853
- config_name: en
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 4106934
    num_examples: 799
  - name: dev.small
    num_bytes: 514009.2615769712
    num_examples: 100
  download_size: 2561467
  dataset_size: 4620943.261576971
- config_name: es
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 2974428
    num_examples: 648
  - name: dev.small
    num_bytes: 459016.6666666667
    num_examples: 100
  download_size: 1983531
  dataset_size: 3433444.6666666665
- config_name: fa
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 3608534
    num_examples: 632
  - name: dev.small
    num_bytes: 570970.5696202532
    num_examples: 100
  download_size: 1906481
  dataset_size: 4179504.569620253
- config_name: fi
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 5362996
    num_examples: 1271
  - name: dev.small
    num_bytes: 421950.90479937056
    num_examples: 100
  download_size: 3297373
  dataset_size: 5784946.90479937
- config_name: fr
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 1438428
    num_examples: 343
  - name: dev.small
    num_bytes: 419366.7638483965
    num_examples: 100
  download_size: 1045572
  dataset_size: 1857794.7638483965
- config_name: hi
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 3122567
    num_examples: 350
  - name: dev.small
    num_bytes: 892162.0
    num_examples: 100
  download_size: 1503974
  dataset_size: 4014729.0
- config_name: id
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 4504281
    num_examples: 960
  - name: dev.small
    num_bytes: 469195.9375
    num_examples: 100
  download_size: 2674307
  dataset_size: 4973476.9375
- config_name: ja
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 4482857
    num_examples: 860
  - name: dev.small
    num_bytes: 521262.4418604651
    num_examples: 100
  download_size: 2731831
  dataset_size: 5004119.441860465
- config_name: ko
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 970749
    num_examples: 213
  - name: dev.small
    num_bytes: 455750.7042253521
    num_examples: 100
  download_size: 792868
  dataset_size: 1426499.704225352
- config_name: ru
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 11085203
    num_examples: 1252
  - name: dev.small
    num_bytes: 885399.6006389776
    num_examples: 100
  download_size: 5823124
  dataset_size: 11970602.600638978
- config_name: sw
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 1797403
    num_examples: 482
  - name: dev.small
    num_bytes: 372905.1867219917
    num_examples: 100
  download_size: 1232620
  dataset_size: 2170308.1867219917
- config_name: te
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 2057912
    num_examples: 828
  - name: dev.small
    num_bytes: 248540.0966183575
    num_examples: 100
  download_size: 770401
  dataset_size: 2306452.0966183576
- config_name: th
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 7233501
    num_examples: 733
  - name: dev.small
    num_bytes: 986835.0613915416
    num_examples: 100
  download_size: 3043426
  dataset_size: 8220336.061391542
- config_name: yo
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev.small
    num_bytes: 356994.9579831933
    num_examples: 100
  - name: dev
    num_bytes: 424824
    num_examples: 119
  download_size: 450789
  dataset_size: 781818.9579831932
- config_name: zh
  features:
  - name: query_id
    dtype: string
  - name: prompt
    dtype: string
  - name: positive_ids
    sequence: string
  - name: negative_ids
    sequence: string
  splits:
  - name: dev
    num_bytes: 1474186
    num_examples: 393
  - name: dev.small
    num_bytes: 375110.94147582696
    num_examples: 100
  download_size: 1154747
  dataset_size: 1849296.941475827
configs:
- config_name: ar
  data_files:
  - split: dev
    path: ar/dev-*
  - split: dev.small
    path: ar/dev.small-*
- config_name: bn
  data_files:
  - split: dev
    path: bn/dev-*
  - split: dev.small
    path: bn/dev.small-*
- config_name: de
  data_files:
  - split: dev.small
    path: de/dev.small-*
  - split: dev
    path: de/dev-*
- config_name: en
  data_files:
  - split: dev
    path: en/dev-*
  - split: dev.small
    path: en/dev.small-*
- config_name: es
  data_files:
  - split: dev
    path: es/dev-*
  - split: dev.small
    path: es/dev.small-*
- config_name: fa
  data_files:
  - split: dev
    path: fa/dev-*
  - split: dev.small
    path: fa/dev.small-*
- config_name: fi
  data_files:
  - split: dev
    path: fi/dev-*
  - split: dev.small
    path: fi/dev.small-*
- config_name: fr
  data_files:
  - split: dev
    path: fr/dev-*
  - split: dev.small
    path: fr/dev.small-*
- config_name: hi
  data_files:
  - split: dev
    path: hi/dev-*
  - split: dev.small
    path: hi/dev.small-*
- config_name: id
  data_files:
  - split: dev
    path: id/dev-*
  - split: dev.small
    path: id/dev.small-*
- config_name: ja
  data_files:
  - split: dev
    path: ja/dev-*
  - split: dev.small
    path: ja/dev.small-*
- config_name: ko
  data_files:
  - split: dev
    path: ko/dev-*
  - split: dev.small
    path: ko/dev.small-*
- config_name: ru
  data_files:
  - split: dev
    path: ru/dev-*
  - split: dev.small
    path: ru/dev.small-*
- config_name: sw
  data_files:
  - split: dev
    path: sw/dev-*
  - split: dev.small
    path: sw/dev.small-*
- config_name: te
  data_files:
  - split: dev
    path: te/dev-*
  - split: dev.small
    path: te/dev.small-*
- config_name: th
  data_files:
  - split: dev
    path: th/dev-*
  - split: dev.small
    path: th/dev.small-*
- config_name: yo
  data_files:
  - split: dev.small
    path: yo/dev.small-*
  - split: dev
    path: yo/dev-*
- config_name: zh
  data_files:
  - split: dev
    path: zh/dev-*
  - split: dev.small
    path: zh/dev.small-*
---
Provider:
nthakur
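Summary of the original information

Dataset overview

Dataset configurations and features

  • Configuration names: one configuration per language, 18 in total: ar, bn, de, en, es, fa, fi, fr, hi, id, ja, ko, ru, sw, te, th, yo, zh.
  • Features:
    • query_id: string
    • prompt: string
    • positive_ids: sequence of string
    • negative_ids: sequence of string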
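
The four features above can be modelled as a typed record. The following is a minimal sketch, not part of the dataset itself: the names `RaftEvalRecord` and `is_valid_record` are hypothetical, and the example row uses invented placeholder values rather than real data.

```python
from typing import List, TypedDict


class RaftEvalRecord(TypedDict):
    """One row, mirroring the four listed features."""
    query_id: str
    prompt: str
    positive_ids: List[str]
    negative_ids: List[str]


def is_valid_record(row: dict) -> bool:
    """Return True if `row` has exactly the four fields with the listed types."""
    if set(row) != {"query_id", "prompt", "positive_ids", "negative_ids"}:
        return False
    if not isinstance(row["query_id"], str) or not isinstance(row["prompt"], str):
        return False
    for key in ("positive_ids", "negative_ids"):
        ids = row[key]
        if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
            return False
    return True


# Hypothetical row: every value here is a placeholder, not taken from the dataset.
example: RaftEvalRecord = {
    "query_id": "q-0",
    "prompt": "placeholder prompt text",
    "positive_ids": ["doc-1"],
    "negative_ids": ["doc-2", "doc-3"],
}
print(is_valid_record(example))
```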

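Dataset splits

  • Split names: dev and dev.small.
  • Size:
    • dev: the size in bytes and the number of examples vary by language (from 119 examples for yo to 2,896 for ar).
    • dev.small: always 100 examples; the size in bytes varies by language.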
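
The fractional byte counts listed for dev.small look like proportional estimates: each one matches the dev byte count scaled by 100 / (number of dev examples). The source does not say how these values were produced, so the sketch below only checks that arithmetic against three configurations copied from the metadata; the helper name `estimated_small_bytes` is hypothetical.

```python
import math

# (dev num_bytes, dev num_examples, listed dev.small num_bytes), copied from the metadata
SPLIT_STATS = {
    "hi": (3122567, 350, 892162.0),
    "id": (4504281, 960, 469195.9375),
    "ar": (21119270, 2896, 729256.5607734807),
}


def estimated_small_bytes(dev_bytes: int, dev_examples: int, small_examples: int = 100) -> float:
    """Scale the dev byte count down to a subset of `small_examples` rows."""
    return dev_bytes * small_examples / dev_examples


for lang, (dev_bytes, dev_examples, listed) in SPLIT_STATS.items():
    estimate = estimated_small_bytes(dev_bytes, dev_examples)
    # Compare with a relative tolerance to avoid depending on float rounding order.
    print(lang, estimate, math.isclose(estimate, listed, rel_tol=1e-9))
```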

Dataset size and download size

  • Download size: varies by language (in bytes).
  • Dataset size: the total size of each language's dataset varies by language (in bytes).
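
In the listed metadata, each dataset_size equals the sum of the two splits' num_bytes, and each download_size is smaller than dataset_size. The source does not explain the gap; compression of the stored files is one plausible reason. A small sketch checking both observations for three configurations copied from the metadata (the helper name `size_summary` is hypothetical):

```python
import math

# lang: (dev num_bytes, dev.small num_bytes, download_size, dataset_size), copied from the metadata
CONFIG_SIZES = {
    "hi": (3122567, 892162.0, 1503974, 4014729.0),
    "id": (4504281, 469195.9375, 2674307, 4973476.9375),
    "yo": (424824, 356994.9579831933, 450789, 781818.9579831932),
}


def size_summary(dev_bytes: float, small_bytes: float, download_size: int) -> tuple:
    """Return (sum of split sizes, download size as a fraction of that sum)."""
    total = dev_bytes + small_bytes
    return total, download_size / total


for lang, (dev_b, small_b, download, listed_total) in CONFIG_SIZES.items():
    total, ratio = size_summary(dev_b, small_b, download)
    print(lang, math.isclose(total, listed_total, rel_tol=1e-12), round(ratio, 2))
```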

Data file paths

  • Path: each language's data files are located by language configuration and split (dev or dev.small), following the pattern [language]/[split]-*
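
The path pattern above can be sketched as a small helper that builds the glob for a given language and split. The helper name `data_file_pattern` and the candidate file name are hypothetical; only the pattern `[language]/[split]-*` and the configuration and split names come from the metadata.

```python
from fnmatch import fnmatch

LANGUAGES = ["ar", "bn", "de", "en", "es", "fa", "fi", "fr", "hi",
             "id", "ja", "ko", "ru", "sw", "te", "th", "yo", "zh"]
SPLITS = ["dev", "dev.small"]


def data_file_pattern(language: str, split: str) -> str:
    """Build the glob `[language]/[split]-*` for a known configuration and split."""
    if language not in LANGUAGES:
        raise ValueError(f"unknown language configuration: {language!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return f"{language}/{split}-*"


# Hypothetical file name, used only to show how the glob distinguishes the splits.
candidate = "ar/dev.small-00000-of-00001.parquet"
print(fnmatch(candidate, data_file_pattern("ar", "dev.small")))
print(fnmatch(candidate, data_file_pattern("ar", "dev")))
```

With the Hugging Face `datasets` library, the same two names would typically be passed as the configuration and split, for example `load_dataset("nthakur/miracl-raft-eval-instruct", "ar", split="dev.small")`; this call needs network access and is not exercised here.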