tinisoft/formatted_indic_sharellama

Name: tinisoft/formatted_indic_sharellama
Creator: tinisoft
Published: 2025-01-13 15:24:58
License: No description available

Source: Hugging Face. Updated: 2025-01-13. Indexed: 2025-02-15.

Download link:

https://hf-mirror.com/datasets/tinisoft/formatted_indic_sharellama

Resource overview:

The dataset has three fields: document ID, language, and text. The text field is a sequence of strings. The data ships as a single training split of 317,565 examples, with a total dataset size of 1,721,709,093 bytes (about 1.72 GB).
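As a rough illustration of the schema and size figures above, the following Python sketch models one record and derives the average size per example (about 5.4 KB). The key names `doc_id`, `language`, and `text`, and the assumption that the document ID is a string, are hypothetical: the listing names the fields but not their exact column identifiers. The streaming helper assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown on this page.

```python
from typing import List, TypedDict

# Figures taken from the listing above.
NUM_EXAMPLES = 317_565
DATASET_BYTES = 1_721_709_093


class Record(TypedDict):
    # Hypothetical column names; the listing only says
    # "document ID, language, and text".
    doc_id: str        # assumed to be a string
    language: str
    text: List[str]    # "sequence of strings" per the listing


def average_bytes_per_example(total_bytes: int, num_examples: int) -> float:
    """Return the mean size of one example in bytes."""
    return total_bytes / num_examples


def is_valid_record(row: dict) -> bool:
    """Check that a row matches the assumed three-field schema."""
    text = row.get("text")
    return (
        isinstance(row.get("doc_id"), str)
        and isinstance(row.get("language"), str)
        and isinstance(text, list)
        and all(isinstance(part, str) for part in text)
    )


def stream_train_split():
    """Lazily open the training split (needs `datasets` and network access)."""
    from datasets import load_dataset  # imported here so the sketch runs without it

    return load_dataset(
        "tinisoft/formatted_indic_sharellama",
        split="train",
        streaming=True,
    )
```

Streaming is used in the helper so that the roughly 1.72 GB of data does not have to be downloaded before the first rows can be inspected. Printing the keys of the first streamed row is a quick way to check the real column names against the assumed ones.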

Provided by:

tinisoft
