SIGINT/njuse-bda-news-category
Hugging Face | Updated: 2023-11-30 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/SIGINT/njuse-bda-news-category
Resource description:
---
dataset_info:
  features:
  - name: link
    dtype: string
  - name: headline
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': ARTS
          '1': ARTS & CULTURE
          '2': BLACK VOICES
          '3': BUSINESS
          '4': COLLEGE
          '5': COMEDY
          '6': CRIME
          '7': CULTURE & ARTS
          '8': DIVORCE
          '9': EDUCATION
          '10': ENTERTAINMENT
          '11': ENVIRONMENT
          '12': FIFTY
          '13': FOOD & DRINK
          '14': GOOD NEWS
          '15': GREEN
          '16': HEALTHY LIVING
          '17': HOME & LIVING
          '18': IMPACT
          '19': LATINO VOICES
          '20': MEDIA
          '21': MONEY
          '22': PARENTING
          '23': PARENTS
          '24': POLITICS
          '25': QUEER VOICES
          '26': RELIGION
          '27': SCIENCE
          '28': SPORTS
          '29': STYLE
          '30': STYLE & BEAUTY
          '31': TASTE
          '32': TECH
          '33': THE WORLDPOST
          '34': TRAVEL
          '35': U.S. NEWS
          '36': WEDDINGS
          '37': WEIRD NEWS
          '38': WELLNESS
          '39': WOMEN
          '40': WORLD NEWS
          '41': WORLDPOST
  - name: short_description
    dtype: string
  - name: authors
    dtype: string
  - name: date
    dtype: timestamp[s]
  splits:
  - name: train
    num_bytes: 55380192.25283857
    num_examples: 167126
  - name: test
    num_bytes: 13845213.74716143
    num_examples: 41782
  download_size: 44805575
  dataset_size: 69225406.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
Provided by:
SIGINT
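Summary of original information
Dataset overview
Features
- link: string.
- headline: string.
- label: class label (integer id mapped to a category name), with the following 42 categories: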
  - 0: ARTS
  - 1: ARTS & CULTURE
  - 2: BLACK VOICES
  - 3: BUSINESS
  - 4: COLLEGE
  - 5: COMEDY
  - 6: CRIME
  - 7: CULTURE & ARTS
  - 8: DIVORCE
  - 9: EDUCATION
  - 10: ENTERTAINMENT
  - 11: ENVIRONMENT
  - 12: FIFTY
  - 13: FOOD & DRINK
  - 14: GOOD NEWS
  - 15: GREEN
  - 16: HEALTHY LIVING
  - 17: HOME & LIVING
  - 18: IMPACT
  - 19: LATINO VOICES
  - 20: MEDIA
  - 21: MONEY
  - 22: PARENTING
  - 23: PARENTS
  - 24: POLITICS
  - 25: QUEER VOICES
  - 26: RELIGION
  - 27: SCIENCE
  - 28: SPORTS
  - 29: STYLE
  - 30: STYLE & BEAUTY
  - 31: TASTE
  - 32: TECH
  - 33: THE WORLDPOST
  - 34: TRAVEL
  - 35: U.S. NEWS
  - 36: WEDDINGS
  - 37: WEIRD NEWS
  - 38: WELLNESS
  - 39: WOMEN
  - 40: WORLD NEWS
  - 41: WORLDPOST
- short_description: string.
- authors: string.
- date: timestamp with second resolution (timestamp[s]).
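The label feature stores an integer id per row, so working with the data usually means converting between ids and category names. The following is a minimal sketch of that mapping in plain Python. The names `LABEL_NAMES`, `id2label`, `label2id`, `encode`, and `decode` are illustrative and not part of the dataset or any library; the category strings are copied from the list above.

```python
# Category names copied from the feature list above, in id order (0..41).
LABEL_NAMES = [
    "ARTS", "ARTS & CULTURE", "BLACK VOICES", "BUSINESS", "COLLEGE",
    "COMEDY", "CRIME", "CULTURE & ARTS", "DIVORCE", "EDUCATION",
    "ENTERTAINMENT", "ENVIRONMENT", "FIFTY", "FOOD & DRINK", "GOOD NEWS",
    "GREEN", "HEALTHY LIVING", "HOME & LIVING", "IMPACT", "LATINO VOICES",
    "MEDIA", "MONEY", "PARENTING", "PARENTS", "POLITICS", "QUEER VOICES",
    "RELIGION", "SCIENCE", "SPORTS", "STYLE", "STYLE & BEAUTY", "TASTE",
    "TECH", "THE WORLDPOST", "TRAVEL", "U.S. NEWS", "WEDDINGS",
    "WEIRD NEWS", "WELLNESS", "WOMEN", "WORLD NEWS", "WORLDPOST",
]

# Hypothetical helper tables: integer id -> name, and name -> integer id.
id2label = dict(enumerate(LABEL_NAMES))
label2id = {name: idx for idx, name in id2label.items()}


def decode(label_id: int) -> str:
    """Return the category name for an integer label id."""
    return id2label[label_id]


def encode(name: str) -> int:
    """Return the integer label id for a category name."""
    return label2id[name]


if __name__ == "__main__":
    print(decode(24))
    print(encode("WORLD NEWS"))
```

Several categories are close in name (for example ARTS & CULTURE and CULTURE & ARTS, or THE WORLDPOST and WORLDPOST), but the dataset assigns them distinct ids, so the mapping keeps them separate.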
Data splits
- train: 167126 examples, 55380192.25283857 bytes in total.
- test: 41782 examples, 13845213.74716143 bytes in total.
Dataset size
- Download size: 44805575 bytes.
- Dataset size: 69225406.0 bytes.
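The split and size figures above can be cross-checked with a little arithmetic. The sketch below uses only the numbers listed here; the variable names are illustrative. It suggests, without the source confirming it, that the data was divided roughly 80/20 and that the fractional per-split byte counts are the total dataset size apportioned by example count.

```python
# Figures copied from the split and size listings above.
TRAIN_EXAMPLES = 167126
TEST_EXAMPLES = 41782
TRAIN_BYTES = 55380192.25283857
TEST_BYTES = 13845213.74716143
DATASET_SIZE = 69225406.0
DOWNLOAD_SIZE = 44805575

total_examples = TRAIN_EXAMPLES + TEST_EXAMPLES

# Share of examples in each split.
train_fraction = TRAIN_EXAMPLES / total_examples
test_fraction = TEST_EXAMPLES / total_examples

# Assumption being tested: per-split bytes = total bytes * share of examples.
estimated_train_bytes = DATASET_SIZE * train_fraction
estimated_test_bytes = DATASET_SIZE * test_fraction

# Ratio of the downloaded files to the reported dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

if __name__ == "__main__":
    print(f"total examples: {total_examples}")
    print(f"train / test share: {train_fraction:.4f} / {test_fraction:.4f}")
    print(f"estimated train bytes: {estimated_train_bytes:.2f}")
    print(f"download / dataset size: {download_ratio:.3f}")
```

The per-split byte counts sum to the reported dataset size, which explains why they are not whole numbers.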
Configuration
- config_name: default
- data_files:
  - train: path data/train-*
  - test: path data/test-*
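The configuration maps each split to a glob pattern over files in the repository. As a rough illustration of how such patterns assign files to splits, the sketch below matches a few file names against the two patterns using Python's standard `fnmatch` module. The shard file names are hypothetical (the actual names in the repository are not given here), and `split_for` is an illustrative helper. The `load_default_config` function shows how the dataset would typically be loaded with the Hugging Face `datasets` library; it is not called in this sketch because it needs that package and network access.

```python
from fnmatch import fnmatchcase

REPO_ID = "SIGINT/njuse-bda-news-category"

# Split -> glob pattern, copied from the configuration above.
DATA_FILES = {"train": "data/train-*", "test": "data/test-*"}


def split_for(path: str):
    """Return the split whose pattern matches `path`, or None if no pattern does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


def load_default_config():
    """Load the 'default' config (requires the `datasets` package and network access)."""
    from datasets import load_dataset

    return load_dataset(REPO_ID, "default")


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the patterns.
    for name in [
        "data/train-00000-of-00001.parquet",
        "data/test-00000-of-00001.parquet",
        "README.md",
    ]:
        print(name, "->", split_for(name))
```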



