tinisoft/formatted_wiki_conv
Source: Hugging Face. Updated 2025-01-13; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/tinisoft/formatted_wiki_conv
Description:
The dataset has three fields: document ID, language, and text content. It consists of a single training split of 2,121,525 examples, and its total size is 5,675,919,483 bytes (about 5.7 GB). The default configuration specifies the file path of the training data.
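The figures above suggest the records are fairly long, and that streaming may be preferable to a full download. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` library is installed and that the repository ID matches the page title. The helper names `mean_bytes_per_example` and `peek_first_record` are hypothetical, and the actual column names are not stated on this page, so the sketch only inspects whatever keys the first record exposes.

```python
# Figures and repository ID quoted from the listing above.
NUM_EXAMPLES = 2_121_525
TOTAL_BYTES = 5_675_919_483
REPO_ID = "tinisoft/formatted_wiki_conv"


def mean_bytes_per_example(total_bytes: int = TOTAL_BYTES,
                           num_examples: int = NUM_EXAMPLES) -> float:
    """Average size of one example, derived from the listed totals."""
    return total_bytes / num_examples


def peek_first_record(repo_id: str = REPO_ID) -> dict:
    """Stream the train split and return its first record (needs network access).

    Assumption: the dataset is publicly readable under this ID on the Hub.
    """
    # Imported lazily so the arithmetic above works without `datasets` installed.
    from datasets import load_dataset

    stream = load_dataset(repo_id, split="train", streaming=True)
    return next(iter(stream))


if __name__ == "__main__":
    # Roughly 2,675 bytes per example on average.
    print(f"{mean_bytes_per_example():.0f} bytes per example")
    # To inspect the real column names, uncomment the next line:
    # print(list(peek_first_record().keys()))
```

Streaming avoids materializing all 5.7 GB on disk before the first record is available, which is convenient for a quick look at the schema.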
Provided by:
tinisoft



