ai4bharat/Shrutilipi
Source: Hugging Face. Updated: 2025-03-07. Indexed: 2025-04-08.
Download link:
https://hf-mirror.com/datasets/ai4bharat/Shrutilipi
Description:
Shrutilipi is an annotated automatic speech recognition (ASR) corpus for 12 Indian languages, constructed by mining audio and text pairs from All India Radio news bulletins. The supported languages include Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Sanskrit, Tamil, Telugu, and Urdu. It contains over 6400 hours of data and is released under the CC BY 4.0 license.
Provided by:
ai4bharat



