amyguan/newswire-50-60-civil-rights
Source: Hugging Face. Updated 2024-12-08. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/amyguan/newswire-50-60-civil-rights
Description:
This dataset contains news articles and their associated metadata. Fields include the article text, byline, dates, newspaper metadata (LCCN, plus the city, state, and title of the newspaper), multiple topic labels (antitrust, civil rights, crime, government regulation, labor movement, politics, protests, and others), Named Entity Recognition (NER) words and labels, geolocation information (city, state, country, coordinates), people mentioned in the article and their attributes (gender, name, occupation, Wikidata ID), cluster size, and year. The dataset has a single training split of 10,525 samples, with a reported size of about 52.5 MB (52,457,545 bytes).
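As a rough illustration of working with the fields described above, here is a minimal sketch that loads the train split with the Hugging Face `datasets` library and filters records by a topic label. The dataset ID comes from this listing. The helper `filter_by_flag` and the column name `civil_rights` used in the usage note are assumptions made for illustration; the listing does not give the exact column names or say how the topic labels are encoded.

```python
from typing import Any, Dict, Iterable, List

# Dataset ID as it appears in this listing.
DATASET_ID = "amyguan/newswire-50-60-civil-rights"


def load_train_split():
    """Download the single train split (needs network access and the `datasets` package)."""
    # Imported lazily so the offline helper below works without the package installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="train")


def filter_by_flag(records: Iterable[Dict[str, Any]], column: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep records whose `column` value is truthy.

    Records that lack the column are treated as not flagged.
    """
    return [record for record in records if record.get(column)]
```

In use, one would call `load_train_split()`, inspect its `column_names` attribute to find the real topic columns, and then pass the rows together with one of those names (for example `"civil_rights"`, if such a column exists) to `filter_by_flag`.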
Provided by:
amyguan



