
Multitask-National-Speech-Corpus-v1

ModelScope community. Updated 2025-12-08; indexed 2025-03-15.
Download link:
https://modelscope.cn/datasets/pengzhendong/Multitask-National-Speech-Corpus-v1
Description:
Multitask-National-Speech-Corpus (MNSC v1) is a multitask speech understanding dataset derived from [IMDA's NSC Corpus](https://www.imda.gov.sg/how-we-can-help/national-speech-corpus) and further annotated. It focuses on knowledge of Singapore's local accent, localised terms, and code-switching.

Tasks:

- ASR: Automatic Speech Recognition
- SQA: Speech Question Answering
- SDS: Spoken Dialogue Summarization
- PQA: Paralinguistic Question Answering

Example:

```
from datasets import load_dataset

data = load_dataset('MERaLiON/Multitask-National-Speech-Corpus-v1', data_dir='ASR-PART1-Train')['train']
```

Citation:

```
@article{wang2025advancing,
  title={Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models},
  author={Wang, Bin and Zou, Xunlong and Sun, Shuo and Zhang, Wenyu and He, Yingxu and Liu, Zhuohan and Wei, Chengwei and Chen, Nancy F and Aw, AiTi},
  journal={arXiv preprint arXiv:2501.01034},
  year={2025}
}
```
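
The `load_dataset` call in the description selects one subset through `data_dir='ASR-PART1-Train'`. A minimal sketch of a helper that builds such names is given here. The helper names `subset_dir` and `load_subset` are hypothetical, and the `TASK-PARTn-Split` pattern is only inferred from that single example; which other task, part, and split combinations actually exist in the repository is not stated in the description and should be checked against the dataset files.

```python
# Task abbreviations as listed in the dataset description.
VALID_TASKS = ("ASR", "SQA", "SDS", "PQA")

# Repository id used in the description's loading example.
REPO_ID = "MERaLiON/Multitask-National-Speech-Corpus-v1"


def subset_dir(task: str, part: int, split: str = "Train") -> str:
    """Build a data_dir name such as 'ASR-PART1-Train'.

    Assumption: every subset follows the TASK-PARTn-Split pattern seen in the
    one example from the description. This is not confirmed for other subsets.
    """
    task = task.upper()
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {VALID_TASKS}")
    if part < 1:
        raise ValueError("part must be a positive integer")
    return f"{task}-PART{part}-{split}"


def load_subset(task: str, part: int, split: str = "Train"):
    """Load one subset and return whatever load_dataset returns for it.

    The import is done lazily so that subset_dir can be used without the
    `datasets` package installed. Requires network access when called.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, data_dir=subset_dir(task, part, split))


print(subset_dir("ASR", 1))
```

Under these assumptions, `load_subset("ASR", 1)["train"]` would correspond to the example in the description. A name produced by `subset_dir` that does not match a directory in the repository will make `load_dataset` fail, so the helper is a convenience for typing names, not a guarantee that a subset exists.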

Provider:
maas
Created:
2025-03-11