
changelinglab/voxangeles-pr

Source: Hugging Face. Updated 2026-02-27. Indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/changelinglab/voxangeles-pr
Resource description:
---
license: cc-by-4.0
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: id
    dtype: string
  - name: audio
    dtype: audio
  - name: transcript
    dtype: string
  splits:
  - name: test
    num_bytes: 181911955
    num_examples: 5445
  download_size: 181463324
  dataset_size: 181911955
task_categories:
- automatic-speech-recognition
tags:
- phone-recognition
- low-resource-languages
pretty_name: VoxAngeles Corpus
---

# VoxAngeles

> [!NOTE]
> This dataset is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
> If you use this dataset, please remember to cite:
> ```
> @inproceedings{chodroff2024voxangeles, % curated the dataset
>   title={Phonetic Segmentation of the UCLA Phonetics Lab Archive},
>   author={Chodroff, Eleanor and Pa{\v{z}}on, Bla{\v{z}} and Baker, Annie and Moran, Steven},
>   booktitle={Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
>   pages={12724--12733},
>   year={2024}
> }
> @inproceedings{zhu2025zipa, % processed this subset
>   title = "{ZIPA}: A family of efficient models for multilingual phone recognition",
>   author = "Zhu, Jian and Samir, Farhan and Chodroff, Eleanor and Mortensen, David R.",
>   booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
>   month = jul,
>   year = "2025",
>   address = "Vienna, Austria",
>   publisher = "Association for Computational Linguistics",
>   doi = "10.18653/v1/2025.acl-long.961",
> }
> ```

[VoxAngeles Corpus](https://github.com/pacscilab/voxangeles) is a corpus of audited phonetic transcriptions, phone-level alignments and phonetic measurements of the [UCLA Phonetics Lab Archive](http://archive.phonetics.ucla.edu/), containing data from 95 languages.
We only provide paired speech and phonetic transcription here, while `id[:3]` is the ISO-639-3 language code of the utterance.
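
The `id[:3]` convention described above can be used to group utterances by language. The following is a minimal sketch, not part of the original card: the helper names (`language_code`, `count_by_language`, `load_test_ids`) are hypothetical, and it assumes the Hugging Face `datasets` library is installed and that the repository loads with `load_dataset` under the `test` split declared in the metadata.

```python
from collections import Counter
from typing import Iterable


def language_code(example_id: str) -> str:
    """Return the first three characters of an id, which the card
    describes as the ISO 639-3 language code of the utterance."""
    if len(example_id) < 3:
        raise ValueError(f"id too short to carry a language code: {example_id!r}")
    return example_id[:3]


def count_by_language(ids: Iterable[str]) -> Counter:
    """Count how many utterances each language code contributes."""
    return Counter(language_code(i) for i in ids)


def load_test_ids(repo_id: str = "changelinglab/voxangeles-pr"):
    """Fetch the `id` column of the test split (hypothetical helper).

    Assumption: `datasets` is installed. The import is local so the
    helpers above work without it. Reading only the `id` column
    avoids decoding the audio.
    """
    from datasets import load_dataset

    dataset = load_dataset(repo_id, split="test")
    return dataset["id"]


if __name__ == "__main__":
    # This downloads the dataset (download_size in the card metadata
    # is 181463324 bytes), so it only runs when invoked as a script.
    counts = count_by_language(load_test_ids())
    print(f"{len(counts)} language codes, {sum(counts.values())} utterances")
```

Keeping the prefix extraction in its own function makes it easy to swap in a different rule if the id format turns out to differ from what the card describes.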
This dataset is included as `pr-vox` in our phone recognition benchmarking toolkit, [💎PRiSM](https://github.com/changelinglab/prism).

```
@misc{prism2026,
  title={PRiSM: Benchmarking Phone Realization in Speech Models},
  author={Shikhar Bharadwaj and Chin-Jou Li and Yoonjae Kim and Kwanghee Choi and Eunjung Yeo and Ryan Soh-Eun Shim and Hanyu Zhou and Brendon Boldt and Karen Rosero Jacome and Kalvin Chang and Darsh Agrawal and Keer Xu and Chao-Han Huck Yang and Jian Zhu and Shinji Watanabe and David R. Mortensen},
  year={2026},
  eprint={2601.14046},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2601.14046},
}
```
Provider:
changelinglab