faisaltareque/ConSUM_v9_control
Source: Hugging Face. Updated 2024-05-12. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/faisaltareque/ConSUM_v9_control
Resource description:
---
dataset_info:
  features:
  - name: language
    dtype: string
  - name: language_code
    dtype: string
  - name: url
    dtype: string
  - name: title
    dtype: string
  - name: summary
    dtype: string
  - name: text
    dtype: string
  - name: keyword
    dtype: string
  - name: english_url
    dtype: string
  - name: extractiveness
    dtype: float64
  - name: summary_words_length
    dtype: int64
  - name: summary_sentences_length
    dtype: int64
  - name: summary_digit_occurrences
    dtype: int64
  - name: entities
    dtype: string
  - name: entity_count
    dtype: int64
  - name: specificity
    dtype: float64
  - name: present_entities
    dtype: string
  - name: keyword_json
    dtype: string
  - name: controlled_input
    dtype: string
  - name: controlled_output
    dtype: string
  splits:
  - name: train
    num_bytes: 5566259644
    num_examples: 480428
  - name: val
    num_bytes: 184929973
    num_examples: 16003
  - name: test
    num_bytes: 434655192
    num_examples: 37389
  download_size: 2662503782
  dataset_size: 6185844809
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: val
    path: data/val-*
  - split: test
    path: data/test-*
---
Provider:
faisaltareque
Summary of original information
Dataset overview
Dataset features
- language: string
- language_code: string
- url: string
- title: string
- summary: string
- text: string
- keyword: string
- english_url: string
- extractiveness: float (float64)
- summary_words_length: integer (int64)
- summary_sentences_length: integer (int64)
- summary_digit_occurrences: integer (int64)
- entities: string
- entity_count: integer (int64)
- specificity: float (float64)
- present_entities: string
- keyword_json: string
- controlled_input: string
- controlled_output: string
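The feature list can be mirrored as a typed record, which makes it easier to see which columns are numeric. This is only a sketch: the class name `ConSumRecord` and the helper `fields_of_type` are hypothetical and not part of the dataset; the field names and the mapping from `string`/`int64`/`float64` to Python types follow the metadata listed here.

```python
from typing import TypedDict, get_type_hints


class ConSumRecord(TypedDict):
    """One row of the dataset (hypothetical name; fields follow the card)."""
    language: str
    language_code: str
    url: str
    title: str
    summary: str
    text: str
    keyword: str
    english_url: str
    extractiveness: float            # float64 in the card
    summary_words_length: int        # int64 in the card
    summary_sentences_length: int
    summary_digit_occurrences: int
    entities: str
    entity_count: int
    specificity: float
    present_entities: str
    keyword_json: str
    controlled_input: str
    controlled_output: str


def fields_of_type(py_type: type) -> list[str]:
    """Return field names whose declared type is py_type, in card order."""
    hints = get_type_hints(ConSumRecord)
    return [name for name, declared in hints.items() if declared is py_type]


if __name__ == "__main__":
    print("numeric (float):", fields_of_type(float))
    print("numeric (int):", fields_of_type(int))
```

Note that `entities`, `present_entities`, and `keyword_json` are declared as plain strings; the card does not say how they are encoded, so any parsing of them would be a further assumption.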
Dataset splits
- train: 480428 examples, 5566259644 bytes
- val: 16003 examples, 184929973 bytes
- test: 37389 examples, 434655192 bytes
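A minimal sketch of loading one of these splits with the Hugging Face `datasets` library. It assumes the repository `faisaltareque/ConSUM_v9_control` is still publicly available on the Hub and that `datasets` is installed; the wrapper `load_split` is a hypothetical helper, not something the dataset ships. Streaming is the default here because the full download is about 2.66 GB.

```python
REPO_ID = "faisaltareque/ConSUM_v9_control"
SPLITS = ("train", "val", "test")  # note: the validation split is named "val"


def load_split(name: str, streaming: bool = True):
    """Load one split; with streaming=True rows are fetched lazily."""
    if name not in SPLITS:
        raise ValueError(f"unknown split {name!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=name, streaming=streaming)
```

For example, `next(iter(load_split("val")))` would return a single row as a dictionary keyed by the feature names, provided the network request succeeds.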
Dataset size
- Download size: 2662503782 bytes
- Total dataset size: 6185844809 bytes
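The figures above are internally consistent, which a short calculation can confirm: the three split sizes add up to the total dataset size, and the example counts give roughly a 90% / 3% / 7% train/val/test split. The sketch below uses only the numbers from this card; the variable names are illustrative.

```python
# (num_examples, num_bytes) per split, copied from the card
SPLIT_STATS = {
    "train": (480428, 5566259644),
    "val": (16003, 184929973),
    "test": (37389, 434655192),
}
DATASET_SIZE = 6185844809   # bytes, as reported
DOWNLOAD_SIZE = 2662503782  # bytes, as reported

total_examples = sum(n for n, _ in SPLIT_STATS.values())
total_bytes = sum(b for _, b in SPLIT_STATS.values())

# Share of examples in each split
shares = {name: n / total_examples for name, (n, _) in SPLIT_STATS.items()}

# Ratio of the download to the reported dataset size
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

if __name__ == "__main__":
    print("total examples:", total_examples)
    print("bytes match reported total:", total_bytes == DATASET_SIZE)
    for name, share in shares.items():
        print(f"{name}: {share:.1%}")
    print(f"download / dataset size: {download_ratio:.2f}")
```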
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - val: data/val-*
  - test: data/test-*
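The `data_files` entries are glob patterns, so each split is made up of every file whose path matches its pattern. The sketch below approximates that matching with Python's standard `fnmatch` module; the Hub's own resolver may differ in detail, and the shard file names in the usage example are hypothetical, since the card does not list them.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Split name -> glob pattern, as given in the default config
DATA_FILES = {
    "train": "data/train-*",
    "val": "data/val-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches path, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


if __name__ == "__main__":
    # Hypothetical shard names, for illustration only
    for candidate in ("data/train-00000-of-00012.parquet",
                      "data/val-00000-of-00001.parquet",
                      "README.md"):
        print(candidate, "->", split_for(candidate))
```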



