MohamedZayton/AMINA
AMINA Newspaper Articles Dataset
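Overview
AMINA: An Arabic Multi-Purpose Integral News Articles Dataset is a comprehensive collection of articles gathered from several well-known news sources. The dataset is intended to support research in natural language processing, journalism studies, and related fields. The sources are: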
- Youm7
- BBC
- CNN
- RT
- Elsharq
- ElRai
- Elspahe
- Hespress
Download Guide
The following code snippet downloads and loads the article data for each source:
from datasets import load_dataset

# BBC articles
bbc = load_dataset("MohamedZayton/AMINA", data_files="BBC/BBC.csv")
# CNN articles
cnn = load_dataset("MohamedZayton/AMINA", data_files="CNN/CNN.csv")
# RT articles
rt = load_dataset("MohamedZayton/AMINA", data_files="RT/RT.csv")
# Youm7 articles
youm_7 = load_dataset("MohamedZayton/AMINA", data_files="Youm7/Youm7.csv")
# Hespress articles
hespress = load_dataset("MohamedZayton/AMINA", data_files="Hespress/Hespress.csv")
# Elspahe articles
elspahe = load_dataset("MohamedZayton/AMINA", data_files="Elspahe/Elspahe.csv")
# ElRai articles by category: replace {category_name} with the category's file name
elrai_category_name = load_dataset("MohamedZayton/AMINA", data_files="ElRai/{category_name}.csv")
# ElSharq articles by category: replace {category_name} with the category's file name
elsharq_category_name = load_dataset("MohamedZayton/AMINA", data_files="ElSharq/{category_name}.csv")
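The `{category_name}` placeholder in the ElRai and ElSharq paths has to be filled in before the call is made. Below is a minimal sketch of one way to do that. The helper names `data_file_for` and `load_source` are hypothetical and not part of the dataset repository, and the directory layout is assumed to match the paths shown in the snippet above.

```python
REPO_ID = "MohamedZayton/AMINA"

# Sources that the snippet above loads from a single CSV named after the source.
SINGLE_FILE_SOURCES = ["BBC", "CNN", "RT", "Youm7", "Hespress", "Elspahe"]
# Sources that the snippet above loads one category at a time.
CATEGORY_SOURCES = ["ElRai", "ElSharq"]


def data_file_for(source, category_name=None):
    """Build the data_files path, following the layout assumed from the snippet above."""
    if source in SINGLE_FILE_SOURCES:
        return f"{source}/{source}.csv"
    if source in CATEGORY_SOURCES:
        if not category_name:
            raise ValueError(f"{source} is split by category; pass category_name")
        return f"{source}/{category_name}.csv"
    raise ValueError(f"unknown source: {source}")


def load_source(source, category_name=None):
    """Load one source from the Hub (hypothetical convenience wrapper)."""
    # Imported lazily so the path helper can be used without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, data_files=data_file_for(source, category_name))
```

When `data_files` is a single path and no split is named, `load_dataset` places the rows under the default `train` split, so `load_source("BBC")["train"]` would give the BBC articles. The available category names are not listed here and would need to be taken from the repository's file listing.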
Image Links
Image links for some of the Youm7 and Elsharq newspaper articles: Image Links



