OpenLLM-France/wikisource
Source: Hugging Face | Updated: 2025-01-03 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/OpenLLM-France/wikisource
Description:
This dataset is a plain-text version of French-language pages from wikisource.org. HTML tags and wiki templates have been stripped, leaving only Markdown syntax. It was created by LINAGORA and OpenLLM France from Wikimedia dumps and is released under the Creative Commons Attribution-ShareAlike 4.0 International License. The dataset contains 185,700 documents and approximately 523,310,649 words, with a total size of 1.9 GB.
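To give a rough sense of scale, the figures quoted above can be turned into an average document length. The following is a minimal sketch that uses only the two counts stated in the description; the function and constant names are illustrative and are not part of the dataset or any library.

```python
# Figures quoted in the dataset description (not measured here).
NUM_DOCUMENTS = 185_700
NUM_WORDS = 523_310_649


def average_words_per_document(num_words: int, num_documents: int) -> float:
    """Return the mean number of words per document."""
    if num_documents <= 0:
        raise ValueError("num_documents must be positive")
    return num_words / num_documents


avg = average_words_per_document(NUM_WORDS, NUM_DOCUMENTS)
print(f"Average words per document: {avg:.0f}")  # prints 2818
```

An average of roughly 2,800 words suggests that many entries are full pages or chapters rather than short snippets, although the description does not report the actual length distribution.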
Provider:
OpenLLM-France



