lucasvmigotto/articles-g1-links
Source: Hugging Face. Updated 2026-01-13; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/lucasvmigotto/articles-g1-links
Resource description:
---
dataset_info:
  features:
  - name: year
    dtype: large_string
  - name: month
    dtype: large_string
  - name: url
    dtype: large_string
  - name: lemmas
    dtype: large_string
  splits:
  - name: train
    num_bytes: 1077715330
    num_examples: 4797042
  download_size: 443103362
  dataset_size: 1077715330
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: mit
task_categories:
- feature-extraction
- text-classification
language:
- pt
pretty_name: G1 Articles links
---
# Articles G1 links
All article URLs from the [G1 news portal](https://g1.globo.com), from January 1st, 2003 through May 30th, 2025.
The URLs were extracted using the [sitemap](https://g1.globo.com/sitemap/g1/sitemap.xml) declared in the site's [robots.txt](https://g1.globo.com/robots.txt).
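
As a rough illustration of working with the four string columns (`year`, `month`, `url`, `lemmas`), the sketch below counts rows per year. The helper names `count_by_year` and `stream_train_split` are hypothetical, and the sample rows are made up purely to mimic the schema; they are not taken from the dataset. Streaming assumes the third-party `datasets` package is installed and the Hub is reachable.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_year(rows: Iterable[Mapping[str, str]]) -> Counter:
    """Count rows per value of the `year` column (stored as a string)."""
    return Counter(row["year"] for row in rows)


def stream_train_split():
    """Stream the train split instead of downloading it up front.

    Hypothetical helper: needs the `datasets` package and network access.
    """
    # Imported lazily so count_by_year stays usable without the dependency.
    from datasets import load_dataset

    return load_dataset(
        "lucasvmigotto/articles-g1-links", split="train", streaming=True
    )


# Made-up rows that only mimic the schema; values are NOT from the dataset.
sample_rows = [
    {"year": "2003", "month": "01", "url": "https://g1.globo.com/example-a", "lemmas": "example a"},
    {"year": "2003", "month": "02", "url": "https://g1.globo.com/example-b", "lemmas": "example b"},
    {"year": "2025", "month": "05", "url": "https://g1.globo.com/example-c", "lemmas": "example c"},
]
year_counts = count_by_year(sample_rows)
```

With network access, the same helper can be applied to a bounded slice of the real data, for example `count_by_year(itertools.islice(stream_train_split(), 1000))`, which avoids pulling the full 443 MB download.
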
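
The card does not include the extraction code, so the following is only a minimal sketch of how `<loc>` entries can be read from a document that follows the sitemaps.org protocol. The function name `extract_locs` and the inline sample XML are assumptions for illustration; whether the G1 sitemap is an index of child sitemaps or a flat URL set is not stated here, so the sketch reports which kind it parsed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, in ElementTree's {uri} form.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_text: str) -> tuple[str, list[str]]:
    """Return the root kind ("sitemapindex" or "urlset") and all <loc> values."""
    root = ET.fromstring(xml_text)
    kind = root.tag.removeprefix(SITEMAP_NS)
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
    return kind, locs


# Hypothetical sample document; the URLs are placeholders, not real articles.
SAMPLE = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://g1.globo.com/example-a</loc></url>"
    "<url><loc>https://g1.globo.com/example-b</loc></url>"
    "</urlset>"
)
sample_kind, sample_locs = extract_locs(SAMPLE)
```

If the root turns out to be a `sitemapindex`, each returned location is itself a sitemap that would need to be fetched and passed through the same function to reach the article URLs.
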
Provided by:
lucasvmigotto



