charleslwang/easycall-dysarthria
Source: Hugging Face. Updated 2026-04-01; indexed 2026-04-12.
Download link: https://hf-mirror.com/datasets/charleslwang/easycall-dysarthria
Resource description:
---
license: cc-by-nc-2.0
arxiv: 2601.14046
dataset_info:
  features:
  - name: audio
    dtype: audio
  - name: filename
    dtype: string
  - name: speaker
    dtype: string
  - name: text
    dtype: string
  - name: dysarthria_severity
    dtype: string
  splits:
  - name: test
    num_bytes: 148684208
    num_examples: 5213
  - name: train
    num_bytes: 750495919
    num_examples: 11901
  - name: validation
    num_bytes: 288578100
    num_examples: 4272
  download_size: 806733066
  dataset_size: 1187758227
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
language:
- it
tags:
- dysarthria
---
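The split statistics in the metadata can be sanity-checked with a few lines of arithmetic. The sketch below copies the numbers from the card, confirms that the per-split byte counts add up to `dataset_size`, and prints each split's share of the examples. The names `SPLITS`, `DATASET_SIZE`, and `summarize` are illustrative, not part of any library.

```python
# Split statistics copied from the dataset card's metadata block.
SPLITS = {
    "train": {"num_bytes": 750_495_919, "num_examples": 11_901},
    "validation": {"num_bytes": 288_578_100, "num_examples": 4_272},
    "test": {"num_bytes": 148_684_208, "num_examples": 5_213},
}
DATASET_SIZE = 1_187_758_227  # dataset_size from the card


def summarize(splits):
    """Return total examples, total bytes, and each split's share of examples."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    shares = {
        name: s["num_examples"] / total_examples for name, s in splits.items()
    }
    return total_examples, total_bytes, shares


total_examples, total_bytes, shares = summarize(SPLITS)

# The per-split byte counts should add up to the declared dataset size.
assert total_bytes == DATASET_SIZE

for name, share in shares.items():
    print(f"{name}: {share:.1%} of examples")
print("total examples:", total_examples)
```

The three splits together hold 21,386 examples, divided roughly 56% / 20% / 24% between train, validation, and test.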
# EasyCall
The EasyCall corpus is a database of command speech recorded from healthy individuals and dysarthric patients.
## Dataset Details
### Dataset Description
The dataset was collected through a collaboration between the Italian Institute of Technology (IIT), the University of Ferrara, and the Sant'Anna Hospital of Ferrara. It aims to provide a new resource for the future development of ASR-based assistive technologies.
In particular, it can be used to develop a voice-controlled contact application for commercial smartphones and to improve dysarthric patients' ability to communicate with their family and caregivers.
- **Curated by:** Rosanna Turrisi, Arianna Braccia, Marco Emanuele, Simone Giulietti, Luciano Fadiga, Mariachiara Sensi, Leonardo Badino.
### License
- CC BY-NC 2.0
### Dataset Sources
- **Repository:** http://neurolab.unife.it/easycallcorpus/
## Dataset Structure
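According to the corpus description, the dataset currently consists of 16683 audio recordings from 21 healthy and 26 dysarthric speakers. Note that the split metadata on this card lists 21386 examples in total (11901 train, 4272 validation, 5213 test), so the hosted version appears to be larger than that description. For each speech-impaired individual, dysarthria has been assessed by a neurologist through the Therapy Outcome Measure.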
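As a minimal sketch of how the per-speaker and per-severity structure could be inspected, the snippet below loads one split with the Hugging Face `datasets` library and tallies utterances by a metadata field. The helper names `load_easycall` and `count_by` are hypothetical, and the example rows are invented placeholders that only mimic the card's schema; the real speaker IDs and severity labels may look different.

```python
from collections import Counter


def load_easycall(split="train"):
    """Load one split from the Hugging Face Hub.

    Requires `pip install datasets` and network access. The import is lazy
    so the counting helper below works without the library installed.
    """
    from datasets import load_dataset

    return load_dataset("charleslwang/easycall-dysarthria", split=split)


def count_by(rows, field):
    """Count utterances per distinct value of `field`, e.g. "speaker"."""
    return Counter(row[field] for row in rows)


# Invented rows that follow the card's schema (audio column omitted).
example_rows = [
    {"filename": "a.wav", "speaker": "spk_01", "text": "placeholder",
     "dysarthria_severity": "label_a"},
    {"filename": "b.wav", "speaker": "spk_01", "text": "placeholder",
     "dysarthria_severity": "label_a"},
    {"filename": "c.wav", "speaker": "spk_02", "text": "placeholder",
     "dysarthria_severity": "label_b"},
]

print(count_by(example_rows, "speaker"))
print(count_by(example_rows, "dysarthria_severity"))
```

With the real data, something like `count_by(load_easycall("train").remove_columns("audio"), "speaker")` should give the same kind of tally; dropping the `audio` column first avoids decoding every recording just to read its metadata.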
The recordings focus on a small vocabulary of basic smartphone commands, such as “open contacts”, “start call”, and “end call”. These commands were selected through a survey of patients, which assessed the commands dysarthric individuals would be most likely to use in a speech-command-based contact application.
In addition, the dataset includes a list of non-commands (i.e., words near/inside commands or phonetically close to commands) that can be leveraged to build a more robust ASR system.
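One way the command / non-command distinction might be used is to partition transcripts before training or evaluating a command recognizer, so that near-miss utterances can serve as a rejection class. The sketch below is an assumption about such a workflow, not the corpus authors' method. `COMMANDS` holds only the three English glosses quoted above as placeholders; the corpus itself is Italian, so the real command strings will differ.

```python
# Placeholder vocabulary: the three English glosses quoted in the card.
# The corpus is Italian, so the actual command strings will differ.
COMMANDS = {"open contacts", "start call", "end call"}


def normalize(text):
    """Lowercase and collapse whitespace so matching ignores formatting."""
    return " ".join(text.lower().split())


def is_command(text, commands=COMMANDS):
    """True only when the whole transcript equals a known command."""
    return normalize(text) in commands


def split_commands(transcripts, commands=COMMANDS):
    """Partition transcripts into (commands, non_commands), keeping order."""
    hits, misses = [], []
    for transcript in transcripts:
        target = hits if is_command(transcript, commands) else misses
        target.append(transcript)
    return hits, misses


# "start" and "call" are words inside a command, so they count as non-commands.
hits, misses = split_commands(["Start  call", "start", "end call", "call"])
print("commands:", hits)
print("non-commands:", misses)
```

Exact matching on the full transcript is deliberately strict: a word that merely appears inside a command is treated as a non-command, which mirrors the "words near/inside commands" category described above.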
## Citation
If you use this dataset, please cite the following publication:
```
@inproceedings{turrisi21_interspeech,
title = {EasyCall Corpus: A Dysarthric Speech Dataset},
author = {Rosanna Turrisi and Arianna Braccia and Marco Emanuele and Simone Giulietti and Maura Pugliatti and Mariachiara Sensi and Luciano Fadiga and Leonardo Badino},
year = {2021},
booktitle = {Interspeech 2021},
pages = {41--45},
doi = {10.21437/Interspeech.2021-549},
issn = {2958-1796},
}
```
You can use this dataset with our benchmarking toolkit, PRiSM, at https://github.com/changelinglab/prism. The toolkit's citation is:
```
@misc{prism2026,
title={PRiSM: Benchmarking Phone Realization in Speech Models},
author={Shikhar Bharadwaj and Chin-Jou Li and Yoonjae Kim and Kwanghee Choi and Eunjung Yeo and Ryan Soh-Eun Shim and Hanyu Zhou and Brendon Boldt and Karen Rosero Jacome and Kalvin Chang and Darsh Agrawal and Keer Xu and Chao-Han Huck Yang and Jian Zhu and Shinji Watanabe and David R. Mortensen},
year={2026},
eprint={2601.14046},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.14046},
}
```
Provided by: charleslwang



