g2p-exploration/fineweb-edu-tiny-misaki
Hugging Face | Updated 2025-02-28 | Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/g2p-exploration/fineweb-edu-tiny-misaki
Resource description:
Each record pairs an input text with several output attributes: a text label, whitespace information, a head flag, an alias, phonemes, stress (possibly null), currency type, number flags, a prespace flag, and a rating. The training set contains over 562,000 samples, and the dataset totals approximately 933 MB. The README does not describe the dataset's intended application or purpose.
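To make the attribute list above concrete, the following is a minimal sketch of how one output record might be modelled and consumed. The class name `TokenRecord`, the helper `join_phonemes`, and every field name and type are assumptions inferred from the description, not the dataset's confirmed schema; check the actual column names on the dataset page before relying on them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TokenRecord:
    """Hypothetical shape of one output record (field names are assumed)."""
    text: str                 # surface text of the token
    tag: str                  # text label
    whitespace: str           # whitespace that follows the token
    is_head: bool             # whether the token is a head
    alias: Optional[str]      # alternate spelling, if any
    phonemes: Optional[str]   # phoneme string, if available
    stress: Optional[float]   # stress value, possibly null
    currency: Optional[str]   # currency type, if any
    num_flags: str            # number flags
    prespace: bool            # whether a space precedes the token
    rating: Optional[int]     # rating of the phoneme output


def join_phonemes(tokens: List[TokenRecord]) -> str:
    """Concatenate phonemes, re-inserting each token's trailing whitespace.

    Tokens without phonemes are skipped.
    """
    parts = []
    for tok in tokens:
        if tok.phonemes is None:
            continue
        parts.append(tok.phonemes + tok.whitespace)
    return "".join(parts).strip()


# Made-up example values, not taken from the dataset.
example = [
    TokenRecord("Hello", "UH", " ", True, None, "hello", None, None, "", False, 4),
    TokenRecord("!", ".", " ", False, None, None, None, None, "", False, None),
    TokenRecord("world", "NN", "", True, None, "world", None, None, "", False, 4),
]
print(join_phonemes(example))
```

Real records would have to be fetched from the download link above, for instance with the Hugging Face `datasets` library, and the field names adjusted to match the published columns.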
Provider:
g2p-exploration



