OdiaGenAIdata/fine_web2_odia_pt
Source: Hugging Face. Updated 2024-12-11. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/OdiaGenAIdata/fine_web2_odia_pt
Description:
Each record in the dataset carries the following features: text, ID, URL, date, file path, language, language score, language script, minhash cluster size, and top languages. The dataset has a single training split of 1,158,595 samples totaling 5,156,179,730 bytes. The download size is 1,922,191,467 bytes.
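To put the stated sizes in perspective, the following is a minimal Python sketch that derives the average sample size and the download-to-dataset size ratio from the figures above. The optional `peek` helper is an assumption on our part: it presumes the Hugging Face `datasets` library is installed and that the repository is reachable under the ID shown in the title. The helper names `size_summary` and `peek` are hypothetical and not part of the dataset.

```python
# Figures copied from the listing above.
DATASET_ID = "OdiaGenAIdata/fine_web2_odia_pt"
NUM_SAMPLES = 1_158_595
DATASET_BYTES = 5_156_179_730
DOWNLOAD_BYTES = 1_922_191_467


def size_summary():
    """Derive a few size statistics from the listed figures."""
    return {
        "avg_bytes_per_sample": DATASET_BYTES / NUM_SAMPLES,
        "download_ratio": DOWNLOAD_BYTES / DATASET_BYTES,
        "dataset_gib": DATASET_BYTES / 2**30,
        "download_gib": DOWNLOAD_BYTES / 2**30,
    }


def peek(n=3):
    """Stream the first n records without downloading the full dataset.

    Assumption: the `datasets` package is installed and the repository
    is publicly accessible. Imported lazily so that size_summary() works
    without the dependency.
    """
    from datasets import load_dataset

    stream = load_dataset(DATASET_ID, split="train", streaming=True)
    return list(stream.take(n))


if __name__ == "__main__":
    for name, value in size_summary().items():
        print(f"{name}: {value:.4f}")
```

By these figures, a sample averages roughly 4.4 KB and the download is a little over a third of the full dataset size. Streaming is used in `peek` so that inspecting a few records does not require fetching the whole download.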
Provider:
OdiaGenAIdata



