if001/wikimedia_ja_short
Source: Hugging Face. Updated 2026-04-26. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/if001/wikimedia_ja_short
Description:
This dataset contains 1,384,986 training examples, with a total size of approximately 7.02 GB and a download size of approximately 3.64 GB. Each example has four string features: id (unique identifier), url (source link), title, and text (text content). Only a train split is provided, with data files located at data/train-*. The README gives no specific description, so the exact purpose or domain of the dataset cannot be determined; judging by the features, it is likely intended for text analysis or natural language processing tasks.
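The record layout and the single train split described above can be sketched in Python as follows. This is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the id shown in the title. The helper names `has_expected_schema` and `stream_train` are hypothetical and introduced only for illustration.

```python
from typing import Any, Dict, Iterator

# Field names and types as listed in the description (all four are strings).
EXPECTED_FIELDS = {"id": str, "url": str, "title": str, "text": str}


def has_expected_schema(example: Dict[str, Any]) -> bool:
    """Return True if the record has exactly the four expected string fields."""
    if set(example) != set(EXPECTED_FIELDS):
        return False
    return all(isinstance(example[name], kind) for name, kind in EXPECTED_FIELDS.items())


def stream_train(repo_id: str = "if001/wikimedia_ja_short") -> Iterator[Dict[str, Any]]:
    """Iterate over the train split (hypothetical helper; needs network access)."""
    # Imported lazily so the schema check above works without the library.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of fetching the
    # full download (about 3.64 GB) up front.
    return iter(load_dataset(repo_id, split="train", streaming=True))
```

Under these assumptions, `next(stream_train())` would fetch the first record, and `has_expected_schema` can be used to confirm it matches the four-field layout before further processing.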
Provider:
if001



