Calvin-Xu/Furigana-Aozora
Hugging Face, updated 2024-07-28, indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/Calvin-Xu/Furigana-Aozora
Resource overview:
The dataset, named Furigana Annotation Corpus (Aozora Bunko Corpus; 振り仮名注釈コーパス, 青空文庫コーパス), is derived from Aozora Bunko texts and Sapie's braille data. It focuses on furigana annotations and is intended for text-to-text generation tasks. During validation, 307 mismatched instances were corrected. The dataset is in Japanese, falls in the 1M to 10M entries size category, is licensed under MIT, and is tagged for education and furigana-related applications.
Provider:
Calvin-Xu



