
ellamind/wikipedia-2023-11-retrieval-multilingual-queries

Source: Hugging Face. Updated 2024-05-22; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/ellamind/wikipedia-2023-11-retrieval-multilingual-queries
Resource summary:
---
dataset_info:
- config_name: bg
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 208882, num_examples: 1500}]
  download_size: 103568
  dataset_size: 208882
- config_name: bn
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 234505, num_examples: 1500}]
  download_size: 102235
  dataset_size: 234505
- config_name: cs
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 135909, num_examples: 1500}]
  download_size: 89207
  dataset_size: 135909
- config_name: da
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 129877, num_examples: 1500}]
  download_size: 80466
  dataset_size: 129877
- config_name: de
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 150329, num_examples: 1500}]
  download_size: 95819
  dataset_size: 150329
- config_name: en
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 147453, num_examples: 1500}]
  download_size: 91794
  dataset_size: 147453
- config_name: fa
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 178590, num_examples: 1500}]
  download_size: 92453
  dataset_size: 178590
- config_name: fi
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 128975, num_examples: 1500}]
  download_size: 81031
  dataset_size: 128975
- config_name: hi
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 241290, num_examples: 1500}]
  download_size: 105433
  dataset_size: 241290
- config_name: it
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 150403, num_examples: 1500}]
  download_size: 92232
  dataset_size: 150403
- config_name: nl
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 141497, num_examples: 1500}]
  download_size: 86179
  dataset_size: 141497
- config_name: 'no'
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 127581, num_examples: 1500}]
  download_size: 79476
  dataset_size: 127581
- config_name: pt
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 144633, num_examples: 1500}]
  download_size: 89646
  dataset_size: 144633
- config_name: ro
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 140584, num_examples: 1500}]
  download_size: 88354
  dataset_size: 140584
- config_name: sr
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 185375, num_examples: 1500}]
  download_size: 102533
  dataset_size: 185375
- config_name: sv
  features: [{name: _id, dtype: string}, {name: text, dtype: string}]
  splits: [{name: test, num_bytes: 132949, num_examples: 1500}]
  download_size: 83862
  dataset_size: 132949
configs:
- config_name: bg
  data_files: [{split: test, path: bg/test-*}]
- config_name: bn
  data_files: [{split: test, path: bn/test-*}]
- config_name: cs
  data_files: [{split: test, path: cs/test-*}]
- config_name: da
  data_files: [{split: test, path: da/test-*}]
- config_name: de
  data_files: [{split: test, path: de/test-*}]
- config_name: en
  data_files: [{split: test, path: en/test-*}]
- config_name: fa
  data_files: [{split: test, path: fa/test-*}]
- config_name: fi
  data_files: [{split: test, path: fi/test-*}]
- config_name: hi
  data_files: [{split: test, path: hi/test-*}]
- config_name: it
  data_files: [{split: test, path: it/test-*}]
- config_name: nl
  data_files: [{split: test, path: nl/test-*}]
- config_name: 'no'
  data_files: [{split: test, path: no/test-*}]
- config_name: pt
  data_files: [{split: test, path: pt/test-*}]
- config_name: ro
  data_files: [{split: test, path: ro/test-*}]
- config_name: sr
  data_files: [{split: test, path: sr/test-*}]
- config_name: sv
  data_files: [{split: test, path: sv/test-*}]
---
Provider:
ellamind
Summary of original information

Dataset overview

Dataset configuration

| Config | Features | Split | Size (bytes) | Examples | Download size (bytes) | Dataset size (bytes) |
|--------|----------|-------|--------------|----------|-----------------------|----------------------|
| bg | _id: string, text: string | test | 208882 | 1500 | 103568 | 208882 |
| bn | _id: string, text: string | test | 234505 | 1500 | 102235 | 234505 |
| cs | _id: string, text: string | test | 135909 | 1500 | 89207 | 135909 |
| da | _id: string, text: string | test | 129877 | 1500 | 80466 | 129877 |
| de | _id: string, text: string | test | 150329 | 1500 | 95819 | 150329 |
| en | _id: string, text: string | test | 147453 | 1500 | 91794 | 147453 |
| fa | _id: string, text: string | test | 178590 | 1500 | 92453 | 178590 |
| fi | _id: string, text: string | test | 128975 | 1500 | 81031 | 128975 |
| hi | _id: string, text: string | test | 241290 | 1500 | 105433 | 241290 |
| it | _id: string, text: string | test | 150403 | 1500 | 92232 | 150403 |
| nl | _id: string, text: string | test | 141497 | 1500 | 86179 | 141497 |
| no | _id: string, text: string | test | 127581 | 1500 | 79476 | 127581 |
| pt | _id: string, text: string | test | 144633 | 1500 | 89646 | 144633 |
| ro | _id: string, text: string | test | 140584 | 1500 | 88354 | 140584 |
| sr | _id: string, text: string | test | 185375 | 1500 | 102533 | 185375 |
| sv | _id: string, text: string | test | 132949 | 1500 | 83862 | 132949 |
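
The per-config figures above can be totalled with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the table, while the names `CONFIG_SIZES`, `EXAMPLES_PER_CONFIG`, and `summarize` are hypothetical and not part of the dataset or any library.

```python
# Figures copied from the configuration table:
# config name -> (dataset size in bytes, download size in bytes)
CONFIG_SIZES = {
    "bg": (208882, 103568),
    "bn": (234505, 102235),
    "cs": (135909, 89207),
    "da": (129877, 80466),
    "de": (150329, 95819),
    "en": (147453, 91794),
    "fa": (178590, 92453),
    "fi": (128975, 81031),
    "hi": (241290, 105433),
    "it": (150403, 92232),
    "nl": (141497, 86179),
    "no": (127581, 79476),
    "pt": (144633, 89646),
    "ro": (140584, 88354),
    "sr": (185375, 102533),
    "sv": (132949, 83862),
}

# Every config lists the same number of test examples.
EXAMPLES_PER_CONFIG = 1500


def summarize(sizes=CONFIG_SIZES, per_config=EXAMPLES_PER_CONFIG):
    """Return overall totals across all language configs."""
    total_dataset = sum(dataset for dataset, _ in sizes.values())
    total_download = sum(download for _, download in sizes.values())
    return {
        "configs": len(sizes),
        "examples": per_config * len(sizes),
        "dataset_bytes": total_dataset,
        "download_bytes": total_download,
        # Share of the in-memory size that is actually downloaded.
        "download_ratio": total_download / total_dataset,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

With 16 configs of 1500 queries each, the collection holds 24,000 test queries in total.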

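Data file paths

| Config | Split | Path |
|--------|-------|------|
| bg | test | bg/test-* |
| bn | test | bn/test-* |
| cs | test | cs/test-* |
| da | test | da/test-* |
| de | test | de/test-* |
| en | test | en/test-* |
| fa | test | fa/test-* |
| fi | test | fi/test-* |
| hi | test | hi/test-* |
| it | test | it/test-* |
| nl | test | nl/test-* |
| no | test | no/test-* |
| pt | test | pt/test-* |
| ro | test | ro/test-* |
| sr | test | sr/test-* |
| sv | test | sv/test-* |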
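
The path layout above follows a single `<config>/<split>-*` pattern, so one config can be selected by name. The sketch below assumes the reader uses the Hugging Face `datasets` library and its `load_dataset` function, which this page does not itself mention. The constants and the helpers `shard_pattern` and `load_queries` are hypothetical names introduced for illustration.

```python
REPO_ID = "ellamind/wikipedia-2023-11-retrieval-multilingual-queries"

# Config names as listed in the table; each one exposes only a "test" split.
CONFIGS = (
    "bg", "bn", "cs", "da", "de", "en", "fa", "fi",
    "hi", "it", "nl", "no", "pt", "ro", "sr", "sv",
)
SPLITS = ("test",)


def shard_pattern(config, split="test"):
    """Return the glob pattern for a config's data files, e.g. 'bg/test-*'."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return f"{config}/{split}-*"


def load_queries(config, split="test"):
    """Load one language's queries (needs network access and `datasets`)."""
    shard_pattern(config, split)  # validates config and split names
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


if __name__ == "__main__":
    print(shard_pattern("no"))
    # queries = load_queries("de")  # rows carry the `_id` and `text` fields
```

The Norwegian config is written as `'no'` in the YAML metadata because an unquoted `no` is read as a boolean by YAML 1.1 parsers. In Python it is passed as the ordinary string `"no"`.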