chonkie-ai/wikipedia-100k
Source: Hugging Face. Updated 2025-01-08; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/chonkie-ai/wikipedia-100k
Description:
The dataset has four fields: a unique identifier (id), the page URL (url), the title (title), and the body text (text). Its training split contains 100,000 examples and is approximately 478.75 MB in size; the download is approximately 279.96 MB. A default configuration specifies the data files for the training split.
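
As a rough illustration of the layout described above, the following Python sketch loads the training split with the Hugging Face `datasets` library and checks a record for the four listed fields. The split name `"train"` and the helper names `missing_fields` and `load_train_split` are assumptions made for this example; they are not taken from the dataset card.

```python
from typing import List, Mapping

# Field names as listed in the description above.
EXPECTED_FIELDS = ("id", "url", "title", "text")


def missing_fields(record: Mapping[str, object]) -> List[str]:
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def load_train_split():
    """Download and return the training split (hypothetical helper).

    Needs network access and `pip install datasets`. The split name
    "train" is an assumption based on the description's mention of a
    training set.
    """
    from datasets import load_dataset

    return load_dataset("chonkie-ai/wikipedia-100k", split="train")
```

Calling `missing_fields(load_train_split()[0])` should return an empty list if the first record matches the description. The first call downloads the data files, which the listing puts at roughly 279.96 MB.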
Provided by:
chonkie-ai



