
charleslwang/easycall-dysarthria

Source: Hugging Face. Updated: 2026-04-01. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/charleslwang/easycall-dysarthria
Resource description:
---
license: cc-by-nc-2.0
arxiv: 2601.14046
dataset_info:
  features:
  - name: audio
    dtype: audio
  - name: filename
    dtype: string
  - name: speaker
    dtype: string
  - name: text
    dtype: string
  - name: dysarthria_severity
    dtype: string
  splits:
  - name: test
    num_bytes: 148684208
    num_examples: 5213
  - name: train
    num_bytes: 750495919
    num_examples: 11901
  - name: validation
    num_bytes: 288578100
    num_examples: 4272
  download_size: 806733066
  dataset_size: 1187758227
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
language:
- it
tags:
- dysarthria
---

# EasyCall

The EasyCall corpus is a database of command speech recorded from healthy individuals and dysarthric patients.

## Dataset Details

### Dataset Description

This dataset has been collected through a collaboration between the Italian Institute of Technology (IIT), the University of Ferrara and the Sant'Anna Hospital of Ferrara, and it aims at providing a new resource for future developments of ASR-based assistive technologies. In particular, it may be exploited to develop a voice-controlled contact application for commercial smartphones, and improve dysarthric patients' ability to communicate with their family and caregivers.

- **Curated by:** Rosanna Turrisi, Arianna Braccia, Marco Emanuele, Simone Giulietti, Luciano Fadiga, Mariachiara Sensi, Leonardo Badino.

### License

- CC BY-NC 2.0

### Dataset Sources

- **Repository:** http://neurolab.unife.it/easycallcorpus/

## Dataset Structure

It currently consists of 16683 audio recordings from 21 healthy and 26 dysarthric speakers. For each speech-impaired individual, dysarthria has been assessed by a neurologist through the Therapy Outcome Measure. The recordings focus on a small vocabulary, including basic smartphone commands, such as “open contacts”, “start call”, “end call”.
Specifically, these commands are the result of a survey administered to patients that evaluates which commands are more likely to be employed by dysarthric individuals to use a speech command-based contact application. In addition, the dataset includes a list of non-commands (i.e., words near/inside commands or phonetically close to commands) that can be leveraged to build a more robust ASR system.

## Citation

If you use this dataset, please cite the following publication:

```
@inproceedings{turrisi21_interspeech,
  title = {EasyCall Corpus: A Dysarthric Speech Dataset},
  author = {Rosanna Turrisi and Arianna Braccia and Marco Emanuele and Simone Giulietti and Maura Pugliatti and Mariachiara Sensi and Luciano Fadiga and Leonardo Badino},
  year = {2021},
  booktitle = {Interspeech 2021},
  pages = {41--45},
  doi = {10.21437/Interspeech.2021-549},
  issn = {2958-1796},
}
```

You can use this dataset with our benchmarking toolkit at https://github.com/changelinglab/prism

```
@misc{prism2026,
  title={PRiSM: Benchmarking Phone Realization in Speech Models},
  author={Shikhar Bharadwaj and Chin-Jou Li and Yoonjae Kim and Kwanghee Choi and Eunjung Yeo and Ryan Soh-Eun Shim and Hanyu Zhou and Brendon Boldt and Karen Rosero Jacome and Kalvin Chang and Darsh Agrawal and Keer Xu and Chao-Han Huck Yang and Jian Zhu and Shinji Watanabe and David R. Mortensen},
  year={2026},
  eprint={2601.14046},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2601.14046},
}
```
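
The split metadata in the card can be sanity-checked, and a split loaded, with a few lines of Python. The following is a minimal sketch, not an official loader: the helper names (`total_examples`, `total_bytes`, `split_shares`, `load_easycall`) are hypothetical, the numbers are copied from the `dataset_info` block, and `load_easycall` assumes the Hugging Face `datasets` package is installed and the Hub is reachable.

```python
# Split metadata copied from the card's `dataset_info` block.
SPLITS = {
    "test": {"num_bytes": 148_684_208, "num_examples": 5_213},
    "train": {"num_bytes": 750_495_919, "num_examples": 11_901},
    "validation": {"num_bytes": 288_578_100, "num_examples": 4_272},
}
DATASET_SIZE = 1_187_758_227  # `dataset_size` from the card
REPO_ID = "charleslwang/easycall-dysarthria"


def total_examples(splits=SPLITS):
    """Sum of `num_examples` over all splits."""
    return sum(s["num_examples"] for s in splits.values())


def total_bytes(splits=SPLITS):
    """Sum of `num_bytes` over all splits."""
    return sum(s["num_bytes"] for s in splits.values())


def split_shares(splits=SPLITS):
    """Fraction of all examples that falls into each split."""
    total = total_examples(splits)
    return {name: s["num_examples"] / total for name, s in splits.items()}


def load_easycall(split="train"):
    """Load one split from the Hub (hypothetical helper; needs network access)."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


if __name__ == "__main__":
    print("examples:", total_examples())
    print("bytes match dataset_size:", total_bytes() == DATASET_SIZE)
    for name, share in split_shares().items():
        print(f"{name}: {share:.1%}")
```

The per-split byte counts add up to the stated `dataset_size`. The per-split example counts sum to 21386, which differs from the 16683 recordings quoted in the prose; the card does not explain the difference. After `ds = load_easycall("train")`, calling `ds.unique("dysarthria_severity")` lists the severity labels actually present; their values are not documented in the card, so none are assumed here. Decoding the `audio` column may require an additional audio backend, depending on the installed `datasets` version.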
Provided by:
charleslwang