optimization-hashira/hindi-normalized
Source: Hugging Face. Updated 2025-02-07; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/optimization-hashira/hindi-normalized
Description:
The dataset has two features, the raw text and its normalized form, and is suited to text-processing tasks. It is split into training, test, and validation sets of 27,962, 3,496, and 3,495 samples respectively. The dataset totals 394,163,401 bytes; the download size is 147,677,801 bytes.
Provider:
optimization-hashira
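
The split sizes and byte counts quoted in the description can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and `load_splits` assumes that the third-party `datasets` package is installed and that the repository id taken from the download link resolves on the Hugging Face Hub. The listing does not give the column names, so none are referenced.

```python
# Split sizes and byte counts as stated in the description.
SPLIT_SIZES = {"train": 27_962, "test": 3_496, "validation": 3_495}
DATASET_BYTES = 394_163_401
DOWNLOAD_BYTES = 147_677_801


def split_fractions(sizes):
    """Return each split's share of the total sample count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_splits(repo_id="optimization-hashira/hindi-normalized"):
    """Load all splits from the Hugging Face Hub (needs network access).

    Assumption: the `datasets` package is installed. It is imported lazily
    so the arithmetic in this file runs without it.
    """
    from datasets import load_dataset

    return load_dataset(repo_id)


if __name__ == "__main__":
    print(f"total samples: {sum(SPLIT_SIZES.values())}")
    for name, fraction in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {fraction:.1%}")
    print(f"download / dataset size: {DOWNLOAD_BYTES / DATASET_BYTES:.1%}")
```

The three splits add up to 34,953 samples, which is roughly an 80/10/10 division, and the download is a little over a third of the stated dataset size.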



