pnsahoo/20-30-civil-rights-embedding
Source: Hugging Face. Updated: 2024-12-08. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/pnsahoo/20-30-civil-rights-embedding
Resource overview:
This dataset contains fields describing news articles: article content, byline, dates, newspaper metadata (LCCN, city, state, title), topic labels (antitrust, civil rights, crime, government regulation, labor movement, politics, protests, and others), named entity recognition (NER) words and labels, wire city, state, country, coordinates, location notes, information on people mentioned (gender, name, occupation, Wikidata ID), cluster size, year, and embedding vectors. The dataset provides a train split containing 8,805 samples, with a total size of 91,188,300 bytes (roughly 91 MB).
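Since the overview lists embedding vectors among the fields, a short Python sketch of loading the train split and ranking articles by cosine similarity may help. It assumes the Hugging Face `datasets` library is installed, and it assumes the embedding column is named `embedding`; the listing does not give exact column names, so check `column_names` after loading. The helper names `load_train_split`, `cosine_similarity`, and `most_similar` are hypothetical, chosen for illustration only.

```python
import math

# Dataset identifier as it appears in the listing title.
DATASET_ID = "pnsahoo/20-30-civil-rights-embedding"


def load_train_split():
    """Download the train split from the Hugging Face Hub.

    Requires `pip install datasets` and network access. The import is
    done lazily so the helpers below also work offline.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="train")


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(rows, query_vec, field="embedding", k=3):
    """Return the top-k (score, row_index) pairs, highest score first.

    `field` is an ASSUMED column name; replace it with the real
    embedding column reported by `column_names`.
    """
    scored = [
        (cosine_similarity(row[field], query_vec), i)
        for i, row in enumerate(rows)
    ]
    scored.sort(reverse=True)
    return scored[:k]
```

To use it, call `load_train_split()`, inspect `column_names` on the result to confirm the real field names, then pass the rows and one row's embedding vector to `most_similar`. The pure-Python similarity is kept dependency-free for clarity; for all 8,805 rows a vectorized NumPy version would likely be faster.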
Provider:
pnsahoo



