Aratako/LiquidAI-Hackathon-Tokyo-CPT-Data
Source: Hugging Face. Updated: 2025-10-12. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/Aratako/LiquidAI-Hackathon-Tokyo-CPT-Data
Resource description:
The dataset has four configurations: emilia-en, emilia-ja, wikipedia-en, and wikipedia-ja. For each configuration, the dataset card lists the features (input_ids, labels, attention_mask), the number of examples and bytes in the train split, the download size, and the dataset size. The listed tasks are automatic-speech-recognition, text-to-speech, and audio-to-audio. The languages are Japanese (ja) and English (en), and the license is cc-by-nc-4.0. Beyond these structural and technical details, the card gives no specific description of the dataset's content or purpose.
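As a rough illustration of how the configurations and features described above might be used, here is a minimal Python sketch. It assumes the Hugging Face `datasets` library is installed and that the repository is reachable on the Hub. The helper names `check_record` and `load_config` are hypothetical, and the equal-length check reflects a common convention for tokenized training data, not something the listing states.

```python
from typing import Any, Mapping

# Repository id, configuration names, and feature names as given in the listing.
REPO_ID = "Aratako/LiquidAI-Hackathon-Tokyo-CPT-Data"
CONFIGS = ("emilia-en", "emilia-ja", "wikipedia-en", "wikipedia-ja")
FEATURES = ("input_ids", "labels", "attention_mask")


def check_record(record: Mapping[str, Any]) -> bool:
    """Return True if the record has all three features with equal lengths.

    Assumption: the three sequences are aligned token by token, as is usual
    for tokenized language-model data. The listing does not confirm this.
    """
    if any(name not in record for name in FEATURES):
        return False
    lengths = {len(record[name]) for name in FEATURES}
    return len(lengths) == 1


def load_config(config: str, streaming: bool = True):
    """Load the train split of one configuration (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown configuration: {config!r}")
    # Imported lazily so the rest of the module works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split="train", streaming=streaming)
```

With network access, something like `next(iter(load_config("wikipedia-ja")))` should yield one record that can be passed to `check_record`. Streaming is used by default here to avoid downloading a full configuration just to inspect it.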
Provider:
Aratako



