amyguan/newswire-50-70
Source: Hugging Face. Updated 2024-11-04; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/amyguan/newswire-50-70
Description:
This dataset contains news articles with accompanying metadata. Features include the article text, byline, dates, newspaper metadata (newspaper title, city, state, and so on), multiple topic labels (antitrust, civil rights, crime, and others), named entity recognition (NER) words and labels, geographic location information, information about the people mentioned (gender, name, occupation, and so on), and the year. The dataset has a single training split with 876,015 samples and a total size of 4,406,157,497.08 bytes (about 4.4 GB).
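As a rough illustration of the figures in the description, the sketch below computes the average size per sample and shows one possible way to peek at a few records. It assumes the Hugging Face `datasets` library is installed and that the repository ID in the title resolves on the Hub; the helper names `average_bytes_per_sample` and `stream_first_records` are hypothetical and not part of the dataset.

```python
from itertools import islice

# Figures taken from the description; the repository ID is the listing title.
REPO_ID = "amyguan/newswire-50-70"
NUM_SAMPLES = 876_015
TOTAL_BYTES = 4_406_157_497.08


def average_bytes_per_sample(total_bytes=TOTAL_BYTES, num_samples=NUM_SAMPLES):
    """Return the mean size of one sample in bytes."""
    return total_bytes / num_samples


def stream_first_records(n=3):
    """Hypothetical helper: fetch the first n records of the train split.

    Streaming avoids downloading the full dataset up front.
    Requires network access and the `datasets` package.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    ds = load_dataset(REPO_ID, split="train", streaming=True)
    return list(islice(ds, n))
```

Calling `average_bytes_per_sample()` gives roughly 5,000 bytes per sample, which suggests that most of the volume is article text rather than metadata. Field names are not listed in the description, so inspecting `stream_first_records(1)[0].keys()` would be a reasonable first step before relying on any particular column.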
Provider:
amyguan



