akahana/wikipedia-id
Hugging Face | Updated: 2024-07-10 | Indexed: 2024-07-22
Download link:
https://hf-mirror.com/datasets/akahana/wikipedia-id
Description:
This dataset contains Indonesian-language text (language code: id). Each record has four features: id, url, title, and text. The data is provided as a training split of 648,383 samples totaling 1,086,885,042 bytes (about 1.09 GB). It may be suitable for natural language processing tasks such as text classification, information retrieval, or language model training.
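A minimal sketch of how the record layout and size figures above might be used in Python. The field types in `WikiRecord` are assumptions (the listing names the features but not their types), and the helper names are hypothetical. `load_train_split` assumes the dataset is still published on the Hugging Face Hub under the id `akahana/wikipedia-id`, loads with its default configuration, and that the `datasets` library is installed.

```python
from typing import TypedDict

# Figures quoted in the dataset description.
NUM_TRAIN_SAMPLES = 648_383
TOTAL_BYTES = 1_086_885_042


class WikiRecord(TypedDict):
    """One sample, using the four feature names from the description.

    The string types are an assumption; the listing does not state them.
    """
    id: str
    url: str
    title: str
    text: str


def average_bytes_per_sample(total_bytes: int = TOTAL_BYTES,
                             num_samples: int = NUM_TRAIN_SAMPLES) -> float:
    """Average size of one sample, derived from the quoted totals."""
    return total_bytes / num_samples


def has_expected_fields(record: dict) -> bool:
    """Check that a record carries all four documented features."""
    return set(WikiRecord.__annotations__) <= set(record)


def load_train_split(streaming: bool = True):
    """Load the train split from the Hugging Face Hub.

    Requires network access and `pip install datasets`. Streaming avoids
    downloading the full dataset before the first record is read.
    """
    # Imported lazily so the helpers above work without the library.
    from datasets import load_dataset
    return load_dataset("akahana/wikipedia-id", split="train",
                        streaming=streaming)
```

From the quoted totals, the average sample works out to roughly 1.7 KB. With network access, `next(iter(load_train_split()))` would fetch a single record, which can then be passed to `has_expected_fields` to confirm the layout.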
Provider:
akahana



