eduagarcia/pagico
Hugging Face | Updated 2024-04-26 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/eduagarcia/pagico
Resource description:
---
dataset_info:
- config_name: corpus
  features:
  - name: id
    dtype: string
  - name: xml
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 7398029677
    num_examples: 903497
  download_size: 2594343622
  dataset_size: 7398029677
- config_name: default
  features:
  - name: query_id
    dtype: string
  - name: categories
    sequence: string
  - name: subcatories
    sequence: string
  - name: region
    dtype: string
  - name: corpus_id
    dtype: string
  - name: justification_ids
    sequence: string
  - name: justification_score
    dtype: int64
  - name: score
    dtype: int64
  splits:
  - name: test
    num_bytes: 4622965
    num_examples: 33420
  download_size: 1011111
  dataset_size: 4622965
- config_name: qrels
  features:
  - name: query_id
    dtype: string
  - name: corpus_id
    dtype: string
  - name: score
    dtype: int64
  splits:
  - name: qrels
    num_bytes: 2152002
    num_examples: 33098
  download_size: 900854
  dataset_size: 2152002
- config_name: queries
  features:
  - name: id
    dtype: string
  - name: categories
    sequence: string
  - name: text
    dtype: string
  splits:
  - name: queries
    num_bytes: 15516
    num_examples: 150
  download_size: 9747
  dataset_size: 15516
configs:
- config_name: corpus
  data_files:
  - split: corpus
    path: corpus/corpus-*
- config_name: default
  data_files:
  - split: test
    path: data/test-*
- config_name: qrels
  data_files:
  - split: qrels
    path: qrels/qrels-*
- config_name: queries
  data_files:
  - split: queries
    path: queries/queries-*
---
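The config and split names in the metadata above map onto the arguments of `load_dataset` from the Hugging Face `datasets` library. The following is a minimal sketch, assuming that library is installed and the Hub is reachable. `CONFIG_SPLITS` and `load_config` are hypothetical helper names introduced here for illustration; they are not part of the dataset.

```python
# Each config in the card exposes exactly one split (names taken from the metadata).
CONFIG_SPLITS = {
    "corpus": "corpus",
    "default": "test",
    "qrels": "qrels",
    "queries": "queries",
}


def load_config(name, streaming=False):
    """Load one config of eduagarcia/pagico (hypothetical helper)."""
    if name not in CONFIG_SPLITS:
        raise ValueError(f"unknown config: {name!r}")
    # Imported lazily so the mapping above is usable without the library installed.
    from datasets import load_dataset

    return load_dataset(
        "eduagarcia/pagico",
        name,
        split=CONFIG_SPLITS[name],
        streaming=streaming,
    )
```

Under these assumptions, `load_config("queries")` would fetch the 150 queries, and `load_config("corpus", streaming=True)` would iterate over the corpus without downloading all of it first.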
Provider:
eduagarcia
Summary of original information
Dataset overview
Config name: corpus
- Features:
  - id: string
  - xml: string
  - text: string
- Splits:
  - corpus:
    - Bytes: 7398029677
    - Examples: 903497
- Download size: 2594343622 bytes
- Dataset size: 7398029677 bytes
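The corpus figures are easier to judge as per-document and relative sizes. A small arithmetic sketch using only the numbers listed for the corpus config (the variable names are illustrative):

```python
# Figures copied from the corpus config metadata.
NUM_BYTES = 7_398_029_677      # dataset size in bytes
NUM_EXAMPLES = 903_497         # number of documents
DOWNLOAD_SIZE = 2_594_343_622  # download size in bytes

avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

print(f"average example size: {avg_bytes_per_example / 1024:.1f} KiB")
print(f"download is {download_ratio:.0%} of the dataset size")
```

This works out to roughly 8 KiB per document (counting both the `xml` and `text` fields) and a download of about 35% of the dataset size.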
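Config name: default
- Features:
  - query_id: string
  - categories: sequence of strings
  - subcatories (spelled this way in the dataset metadata): sequence of strings
  - region: string
  - corpus_id: string
  - justification_ids: sequence of strings
  - justification_score: integer (int64)
  - score: integer (int64)
- Splits:
  - test:
    - Bytes: 4622965
    - Examples: 33420
- Download size: 1011111 bytes
- Dataset size: 4622965 bytes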
Config name: qrels
- Features:
  - query_id: string
  - corpus_id: string
  - score: integer (int64)
- Splits:
  - qrels:
    - Bytes: 2152002
    - Examples: 33098
- Download size: 900854 bytes
- Dataset size: 2152002 bytes
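Retrieval evaluation tools commonly expect relevance judgments as a nested mapping rather than flat rows. The sketch below groups rows with the three qrels fields listed above into `{query_id: {corpus_id: score}}`. The helper name `build_qrels` and the toy rows are hypothetical; real ids and scores in the dataset will differ.

```python
from collections import defaultdict


def build_qrels(rows):
    """Group flat qrels rows into {query_id: {corpus_id: score}}."""
    qrels = defaultdict(dict)
    for row in rows:
        qrels[row["query_id"]][row["corpus_id"]] = int(row["score"])
    return dict(qrels)


# Hypothetical rows for illustration only.
toy_rows = [
    {"query_id": "q1", "corpus_id": "d1", "score": 1},
    {"query_id": "q1", "corpus_id": "d7", "score": 1},
    {"query_id": "q2", "corpus_id": "d3", "score": 1},
]

qrels = build_qrels(toy_rows)
```

The same function should work on any iterable of dict-like rows that carry these three fields.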
Config name: queries
- Features:
  - id: string
  - categories: sequence of strings
  - text: string
- Splits:
  - queries:
    - Bytes: 15516
    - Examples: 150
- Download size: 9747 bytes
- Dataset size: 15516 bytes
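Because each query carries a list of category labels, it can be handy to index query ids by category before slicing results. A minimal sketch, assuming rows shaped like the queries config. The helper `queries_by_category`, the labels "A" and "B", and the toy rows are all hypothetical.

```python
def queries_by_category(rows):
    """Map each category label to the ids of the queries that carry it."""
    index = {}
    for row in rows:
        for category in row["categories"]:
            index.setdefault(category, []).append(row["id"])
    return index


# Hypothetical rows for illustration only; real ids, labels, and text will differ.
toy_queries = [
    {"id": "q1", "categories": ["A", "B"], "text": "first example query"},
    {"id": "q2", "categories": ["B"], "text": "second example query"},
]

category_index = queries_by_category(toy_queries)
```

A query with several labels appears once under each of them, so per-category counts can sum to more than the number of queries.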



