institutogaia/wikipedia_dumps_pt
Source: Hugging Face | Updated: 2025-04-23 | Indexed: 2025-05-31
Download link:
https://hf-mirror.com/datasets/institutogaia/wikipedia_dumps_pt
Overview:
This dataset contains articles extracted from a Portuguese Wikipedia dump. It is a comprehensive collection of Portuguese-language articles, suited to natural language processing tasks, information retrieval, and knowledge extraction. It includes 1,918,865 articles totaling 1,049,750,961 words.
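As a rough illustration of what the figures above imply, the sketch below derives the mean article length from the two reported totals and shows one possible way to stream the data with the Hugging Face `datasets` library. The helper names and the `"train"` split are assumptions made for this example; the listing does not document the dataset's splits or columns.

```python
ARTICLE_COUNT = 1_918_865      # article count reported in the listing
TOTAL_WORDS = 1_049_750_961    # total word count reported in the listing


def average_words_per_article(total_words: int, article_count: int) -> float:
    """Return the mean article length in words."""
    return total_words / article_count


def stream_dataset(repo_id: str = "institutogaia/wikipedia_dumps_pt",
                   split: str = "train"):
    """Lazily iterate over the dataset without downloading it in full.

    Needs the `datasets` package and network access. The split name
    "train" is an assumption, not something the listing confirms.
    """
    # Imported here so the arithmetic above runs without the dependency.
    from datasets import load_dataset
    return load_dataset(repo_id, split=split, streaming=True)


if __name__ == "__main__":
    mean_len = average_words_per_article(TOTAL_WORDS, ARTICLE_COUNT)
    print(f"{mean_len:.1f} words per article on average")
```

The reported totals work out to roughly 547 words per article. Streaming is used in the loader because a corpus of about a billion words may be inconvenient to download in full just to inspect a few records.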
Provided by:
institutogaia



