ellamind/wikipedia-2023-11-retrieval-multilingual-queries
Source: Hugging Face. Updated 2024-05-22; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/ellamind/wikipedia-2023-11-retrieval-multilingual-queries
Resource description:
---
dataset_info:
- config_name: bg
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 208882
    num_examples: 1500
  download_size: 103568
  dataset_size: 208882
- config_name: bn
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 234505
    num_examples: 1500
  download_size: 102235
  dataset_size: 234505
- config_name: cs
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 135909
    num_examples: 1500
  download_size: 89207
  dataset_size: 135909
- config_name: da
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 129877
    num_examples: 1500
  download_size: 80466
  dataset_size: 129877
- config_name: de
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 150329
    num_examples: 1500
  download_size: 95819
  dataset_size: 150329
- config_name: en
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 147453
    num_examples: 1500
  download_size: 91794
  dataset_size: 147453
- config_name: fa
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 178590
    num_examples: 1500
  download_size: 92453
  dataset_size: 178590
- config_name: fi
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 128975
    num_examples: 1500
  download_size: 81031
  dataset_size: 128975
- config_name: hi
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 241290
    num_examples: 1500
  download_size: 105433
  dataset_size: 241290
- config_name: it
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 150403
    num_examples: 1500
  download_size: 92232
  dataset_size: 150403
- config_name: nl
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 141497
    num_examples: 1500
  download_size: 86179
  dataset_size: 141497
- config_name: 'no'
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 127581
    num_examples: 1500
  download_size: 79476
  dataset_size: 127581
- config_name: pt
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 144633
    num_examples: 1500
  download_size: 89646
  dataset_size: 144633
- config_name: ro
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 140584
    num_examples: 1500
  download_size: 88354
  dataset_size: 140584
- config_name: sr
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 185375
    num_examples: 1500
  download_size: 102533
  dataset_size: 185375
- config_name: sv
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: test
    num_bytes: 132949
    num_examples: 1500
  download_size: 83862
  dataset_size: 132949
configs:
- config_name: bg
  data_files:
  - split: test
    path: bg/test-*
- config_name: bn
  data_files:
  - split: test
    path: bn/test-*
- config_name: cs
  data_files:
  - split: test
    path: cs/test-*
- config_name: da
  data_files:
  - split: test
    path: da/test-*
- config_name: de
  data_files:
  - split: test
    path: de/test-*
- config_name: en
  data_files:
  - split: test
    path: en/test-*
- config_name: fa
  data_files:
  - split: test
    path: fa/test-*
- config_name: fi
  data_files:
  - split: test
    path: fi/test-*
- config_name: hi
  data_files:
  - split: test
    path: hi/test-*
- config_name: it
  data_files:
  - split: test
    path: it/test-*
- config_name: nl
  data_files:
  - split: test
    path: nl/test-*
- config_name: 'no'
  data_files:
  - split: test
    path: no/test-*
- config_name: pt
  data_files:
  - split: test
    path: pt/test-*
- config_name: ro
  data_files:
  - split: test
    path: ro/test-*
- config_name: sr
  data_files:
  - split: test
    path: sr/test-*
- config_name: sv
  data_files:
  - split: test
    path: sv/test-*
---
Provider:
ellamind
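Summary of original information
Dataset overview
Dataset configuration
| Config name | Features | Split | Bytes | Examples | Download size (bytes) | Dataset size (bytes) |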
|---|---|---|---|---|---|---|
| bg | _id: string, text: string | test | 208882 | 1500 | 103568 | 208882 |
| bn | _id: string, text: string | test | 234505 | 1500 | 102235 | 234505 |
| cs | _id: string, text: string | test | 135909 | 1500 | 89207 | 135909 |
| da | _id: string, text: string | test | 129877 | 1500 | 80466 | 129877 |
| de | _id: string, text: string | test | 150329 | 1500 | 95819 | 150329 |
| en | _id: string, text: string | test | 147453 | 1500 | 91794 | 147453 |
| fa | _id: string, text: string | test | 178590 | 1500 | 92453 | 178590 |
| fi | _id: string, text: string | test | 128975 | 1500 | 81031 | 128975 |
| hi | _id: string, text: string | test | 241290 | 1500 | 105433 | 241290 |
| it | _id: string, text: string | test | 150403 | 1500 | 92232 | 150403 |
| nl | _id: string, text: string | test | 141497 | 1500 | 86179 | 141497 |
| no | _id: string, text: string | test | 127581 | 1500 | 79476 | 127581 |
| pt | _id: string, text: string | test | 144633 | 1500 | 89646 | 144633 |
| ro | _id: string, text: string | test | 140584 | 1500 | 88354 | 140584 |
| sr | _id: string, text: string | test | 185375 | 1500 | 102533 | 185375 |
| sv | _id: string, text: string | test | 132949 | 1500 | 83862 | 132949 |
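The per-config figures in the table can be aggregated with a few lines of Python. The following is a minimal sketch, not part of the original dataset card: the names `SIZES`, `EXAMPLES_PER_CONFIG`, and `avg_row_bytes` are hypothetical, and the numbers are copied from the table. It assumes the size columns are in bytes and that `dataset_size` covers both the `_id` and `text` fields.

```python
# (download_size, dataset_size) in bytes per config, copied from the table above.
SIZES = {
    "bg": (103568, 208882),
    "bn": (102235, 234505),
    "cs": (89207, 135909),
    "da": (80466, 129877),
    "de": (95819, 150329),
    "en": (91794, 147453),
    "fa": (92453, 178590),
    "fi": (81031, 128975),
    "hi": (105433, 241290),
    "it": (92232, 150403),
    "nl": (86179, 141497),
    "no": (79476, 127581),
    "pt": (89646, 144633),
    "ro": (88354, 140584),
    "sr": (102533, 185375),
    "sv": (83862, 132949),
}

# Every config lists the same number of test examples.
EXAMPLES_PER_CONFIG = 1500

total_examples = EXAMPLES_PER_CONFIG * len(SIZES)
total_download = sum(download for download, _ in SIZES.values())
total_dataset = sum(dataset for _, dataset in SIZES.values())


def avg_row_bytes(config: str) -> float:
    """Average dataset_size bytes per row (both _id and text) for one config."""
    return SIZES[config][1] / EXAMPLES_PER_CONFIG


if __name__ == "__main__":
    print(f"configs: {len(SIZES)}, examples: {total_examples}")
    print(f"download: {total_download} bytes, dataset: {total_dataset} bytes")
    for name in sorted(SIZES, key=avg_row_bytes, reverse=True):
        print(f"{name}: {avg_row_bytes(name):.1f} bytes/row")
```

Configs in non-Latin scripts (for example `hi`, `bn`, `bg`) show larger byte counts for the same 1500 rows, which is consistent with multi-byte UTF-8 encoding, although the table alone does not confirm the cause.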
Data file paths
| Config name | Split | Path |
|---|---|---|
| bg | test | bg/test-* |
| bn | test | bn/test-* |
| cs | test | cs/test-* |
| da | test | da/test-* |
| de | test | de/test-* |
| en | test | en/test-* |
| fa | test | fa/test-* |
| fi | test | fi/test-* |
| hi | test | hi/test-* |
| it | test | it/test-* |
| nl | test | nl/test-* |
| no | test | no/test-* |
| pt | test | pt/test-* |
| ro | test | ro/test-* |
| sr | test | sr/test-* |
| sv | test | sv/test-* |
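Each config maps to a glob pattern of the form `<config>/test-*`. The sketch below shows how such a pattern could be matched against file names and how one config might be loaded with the Hugging Face `datasets` library. The helper names (`data_file_pattern`, `belongs_to`, `load_queries`) and the shard file name in the usage note are hypothetical; only the repository ID, config names, and patterns come from the listing above. `load_queries` assumes `datasets` is installed and the Hub is reachable.

```python
from fnmatch import fnmatchcase

REPO_ID = "ellamind/wikipedia-2023-11-retrieval-multilingual-queries"

# Config names from the table above. "no" must be passed as a string;
# it is quoted in the YAML because YAML 1.1 parsers read a bare `no` as a boolean.
CONFIG_NAMES = [
    "bg", "bn", "cs", "da", "de", "en", "fa", "fi",
    "hi", "it", "nl", "no", "pt", "ro", "sr", "sv",
]


def data_file_pattern(config: str, split: str = "test") -> str:
    """Return the glob pattern listed for a config, e.g. 'bg/test-*'."""
    if config not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config!r}")
    return f"{config}/{split}-*"


def belongs_to(path: str, config: str, split: str = "test") -> bool:
    """Check whether a repository-relative file path matches a config's pattern."""
    return fnmatchcase(path, data_file_pattern(config, split))


def load_queries(config: str):
    """Load the test split of one config (needs `datasets` and network access)."""
    from datasets import load_dataset  # imported lazily so the helpers work offline

    return load_dataset(REPO_ID, config, split="test")
```

For example, `belongs_to("bg/test-00000-of-00001.parquet", "bg")` evaluates to `True` for that made-up shard name, and `load_queries("de")` would be expected to return a dataset with `_id` and `text` columns, based on the features listed for each config.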



