
Cyro1/enwiki_pageviews_2023_m

Source: Hugging Face. Updated 2026-01-25. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Cyro1/enwiki_pageviews_2023_m
Resource description:
---
license: mit
task_categories:
- other
- text-classification
language:
- en
size_categories:
- 1M<n<10M
tags:
- wikipedia
- pageviews
- popularity
- knowledge-graph
- bias-detection
---

# English Wikipedia Pageviews 2023 (Monthly Average)

This dataset links Wikipedia article IDs (from the 1 September 2019 Wikipedia dump) to their average monthly pageviews recorded during 2023.

## Features

| Column | Type | Description |
|--------|------|-------------|
| `wikipedia_id` | int64 | Wikipedia article ID (page_id) |
| `wikipedia_title` | string | Wikipedia article title |
| `popularity_avg` | float64 | Average monthly pageviews across 2023 |
| `rank_avg` | float64 | Average rank of the article |

## Stats

- **Articles**: 6.4M Wikipedia articles
- **Source**: Wikimedia pageview dumps for 2023
- **Split**: 90% train / 10% test (seed=42)

## Usage

```python
from datasets import load_dataset

# Load the dataset and convert the train split to a pandas DataFrame
ds = load_dataset("Cyro1/enwiki_pageviews_2023_m")
df = ds["train"].to_pandas()

# Merge with your dataset
# (`your_df` is your own DataFrame with a `document_id` column)
merged = your_df.merge(
    df[["wikipedia_id", "popularity_avg"]],
    left_on="document_id",
    right_on="wikipedia_id",
)
```
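Because the article IDs come from a 2019 dump, some of your documents may have no matching row. The sketch below is one way to keep those documents and measure coverage. It is not part of the original card: the helper `attach_popularity`, the `has_pageviews` column, and the toy values are all hypothetical, and it assumes your key column is named `document_id` as in the usage example.

```python
import pandas as pd


def attach_popularity(
    docs: pd.DataFrame,
    pageviews: pd.DataFrame,
    id_col: str = "document_id",
) -> pd.DataFrame:
    """Left-join `popularity_avg` onto `docs`, keeping unmatched documents.

    Hypothetical helper: adds a boolean `has_pageviews` column so that
    missing articles can be counted instead of silently dropped.
    """
    merged = docs.merge(
        pageviews[["wikipedia_id", "popularity_avg"]],
        left_on=id_col,
        right_on="wikipedia_id",
        how="left",       # keep every row of `docs`
        indicator=True,   # adds a `_merge` column: "both" or "left_only"
    )
    merged["has_pageviews"] = merged["_merge"] == "both"
    # Drop helper columns; keep the key if it is itself named `wikipedia_id`.
    to_drop = ["_merge"] + (["wikipedia_id"] if id_col != "wikipedia_id" else [])
    return merged.drop(columns=to_drop)


# Toy stand-ins for the real data (values are made up for illustration).
pageviews = pd.DataFrame(
    {
        "wikipedia_id": [1, 2, 3],
        "wikipedia_title": ["Example A", "Example B", "Example C"],
        "popularity_avg": [1000.0, 2000.0, 500.0],
        "rank_avg": [2.0, 1.0, 3.0],
    }
)
docs = pd.DataFrame({"document_id": [1, 3, 99999], "text": ["a", "b", "c"]})

result = attach_popularity(docs, pageviews)
coverage = result["has_pageviews"].mean()  # fraction of docs with a match
print(result)
print(f"coverage: {coverage:.2%}")
```

With the real dataset, pass `df` from the usage example as `pageviews`. A left join is used here so that a low coverage value is visible rather than hidden by dropped rows.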
Provider:
Cyro1