tsch00001/wikipedia-pl-clean

Name: tsch00001/wikipedia-pl-clean
Creator: tsch00001
Published: 2025-02-28 21:22:43
License: No description available

Source: Hugging Face
Updated: 2025-02-28
Indexed: 2025-04-12

Download link:

https://hf-mirror.com/datasets/tsch00001/wikipedia-pl-clean

Description:

The dataset contains text records paired with their token counts. It has a single training split of approximately 1,587,406 examples, about 2.5 GB in total. It is suited to tasks that involve text analysis and processing.
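As a rough illustration of working with text and token-count pairs, the sketch below streams the training split and totals the counts over the first rows. The column names `text` and `token_count` are assumptions, since this listing does not state them, and the helper names `load_train_split` and `summarize` are hypothetical. Loading requires the Hugging Face `datasets` package and network access.

```python
def load_train_split(repo_id="tsch00001/wikipedia-pl-clean"):
    """Stream the train split instead of downloading all ~2.5 GB up front."""
    # Imported lazily so summarize() below can be used without the package.
    from datasets import load_dataset
    return load_dataset(repo_id, split="train", streaming=True)


def summarize(records, text_key="text", count_key="token_count", limit=1000):
    """Return (rows seen, total tokens, total characters) for the first `limit` rows.

    `text_key` and `count_key` are assumed column names; adjust them to the
    dataset's real schema before use.
    """
    rows = tokens = chars = 0
    for record in records:
        if rows >= limit:
            break
        rows += 1
        tokens += int(record[count_key])
        chars += len(record[text_key])
    return rows, tokens, chars
```

Calling `summarize(load_train_split())` would scan only the first 1,000 streamed rows. If the real column names differ, pass them through `text_key` and `count_key`.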

Provider:

tsch00001
