nitaibezerra/govbrnews
Hugging Face | Updated 2024-12-20 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/nitaibezerra/govbrnews
Resource overview:
The GovBR News Dataset is built by automatically scraping news published by Brazilian government agencies under the gov.br domain. Each record holds a news item and its metadata: unique identifier, publishing agency, publication date, title, original URL, category, tags, content, and extraction date. The dataset is maintained by the Brazilian Ministry of Management and Innovation in Public Services (MGI) as part of an experimental effort to centralize and structure government information. Besides the structured dataset, it provides CSV files for use in other tools and environments. It is updated regularly through an automated scraping, deduplication, and sorting process and is published directly on the Hugging Face platform.
Provider:
nitaibezerra
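
The overview mentions a deduplication and sorting step and a CSV export, but does not say how either is implemented. The following is only a minimal sketch of what such a step could look like. The column names in `FIELDS`, the choice of the unique identifier as the deduplication key, the sort order (publication date, then identifier), and the sample records are all assumptions made for illustration; they are not taken from the dataset itself.

```python
import csv
import io

# Hypothetical column names: the overview lists these fields,
# but not the exact identifiers used in the published files.
FIELDS = [
    "unique_id", "agency", "published_at", "title", "url",
    "category", "tags", "content", "extracted_at",
]


def deduplicate_and_sort(records):
    """Keep the first record seen per unique_id, then sort by date and id."""
    first_seen = {}
    for record in records:
        # setdefault leaves an existing entry untouched, so later duplicates are dropped.
        first_seen.setdefault(record["unique_id"], record)
    # ISO 8601 date strings (YYYY-MM-DD) sort chronologically as plain strings.
    return sorted(
        first_seen.values(),
        key=lambda r: (r["published_at"], r["unique_id"]),
    )


def to_csv(records):
    """Serialize records to CSV text, joining the tag list with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = dict(record)
        row["tags"] = ";".join(record["tags"])
        writer.writerow(row)
    return buffer.getvalue()


# Invented sample records with placeholder URLs, for illustration only.
sample = [
    {"unique_id": "a1", "agency": "agency-a", "published_at": "2024-12-02",
     "title": "First item", "url": "https://example.org/news/a1",
     "category": "general", "tags": ["x", "y"], "content": "Body A",
     "extracted_at": "2024-12-03"},
    {"unique_id": "b2", "agency": "agency-b", "published_at": "2024-12-01",
     "title": "Second item", "url": "https://example.org/news/b2",
     "category": "general", "tags": ["z"], "content": "Body B",
     "extracted_at": "2024-12-03"},
    {"unique_id": "a1", "agency": "agency-a", "published_at": "2024-12-02",
     "title": "First item (scraped again)", "url": "https://example.org/news/a1",
     "category": "general", "tags": ["x", "y"], "content": "Body A",
     "extracted_at": "2024-12-04"},
]

cleaned = deduplicate_and_sort(sample)
csv_text = to_csv(cleaned)
```

Keeping the first occurrence of each identifier is one possible policy; a pipeline that prefers the most recent extraction would instead overwrite earlier entries.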



