changelinglab/thchs30-segment
Source: Hugging Face. Updated 2026-04-12; indexed 2026-05-10.
Download link:
https://hf-mirror.com/datasets/changelinglab/thchs30-segment
Resource description:
---
license: apache-2.0
language:
- zh
pretty_name: THCHS-30 Segment
task_categories:
- automatic-speech-recognition
tags:
- speech
- phone-alignment
- segmentation
- mandarin
size_categories:
- 10K<n<100K
---
# THCHS-30 Segment
Mandarin Chinese read-speech corpus with **phone-level time alignments**.
Suitable for training and evaluating phone recognition and phonetic
segmentation models.
## Sources
- **Audio**: [THCHS-30](https://www.openslr.org/18/) (OpenSLR 18) by
Dong Wang, Xuewei Zhang, Zhiyong Zhang (Tsinghua University, 2015).
- **Phone alignments**:
[`anyspeech/THCHS-30-alignments`](https://huggingface.co/datasets/anyspeech/THCHS-30-alignments).
## Splits
| Split | Utterances |
|-------|------------|
| train | 10,000 |
| val | 893 |
| test | 2,495 |
Splits follow the original OpenSLR 18 directory partition
(`data_thchs30/{train,dev,test}`); `dev` is renamed to `val`.
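As a quick sanity check after downloading, the split sizes in the table can be compared against what was actually loaded. The sketch below is illustrative only: `EXPECTED_SIZES`, `DIR_TO_SPLIT`, and `size_mismatches` are hypothetical names, not part of this dataset or any library.

```python
# Utterance counts per split, copied from the table above.
EXPECTED_SIZES = {"train": 10_000, "val": 893, "test": 2_495}

# OpenSLR 18 directory name -> split name used in this dataset.
DIR_TO_SPLIT = {"train": "train", "dev": "val", "test": "test"}


def size_mismatches(actual_sizes):
    """Return {split: (expected, actual)} for every split whose size differs."""
    return {
        split: (expected, actual_sizes.get(split))
        for split, expected in EXPECTED_SIZES.items()
        if actual_sizes.get(split) != expected
    }


total = sum(EXPECTED_SIZES.values())
print(total)  # 13388, consistent with the 10K<n<100K size category
```

If the dataset loads as a `DatasetDict` from the `datasets` library (an assumption about how this repository is configured), the actual sizes could be gathered with `{name: ds.num_rows for name, ds in dataset_dict.items()}` and passed to `size_mismatches`.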
## Schema
| Column | Type | Description |
|----------------|----------------------|------------------------------------------------------|
| `utt_id` | string | Utterance id, e.g. `A11_0` |
| `audio` | Audio(16 kHz) | Embedded waveform bytes (decoded on access) |
| `text` | string | Hanzi sentence transcript |
| `phones` | sequence[string] | IPA phone tokens with tone diacritics |
| `phone_starts` | sequence[float64] | Phone start times in seconds |
| `phone_ends` | sequence[float64] | Phone end times in seconds |
| `language` | string | `cmn` (ISO 639-3) |
| `speaker_id` | string | Speaker code (utt_id prefix, e.g. `A11`) |
| `duration` | float64 | Utterance duration in seconds |
| `split` | string | `train` / `val` / `test` |
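The `phones`, `phone_starts`, and `phone_ends` columns are parallel lists, so a common first step is to zip them into intervals and convert times to sample indices. The following is a minimal sketch under stated assumptions: the three lists are taken to have equal length, the speaker prefix is assumed to end at the first underscore (as in the `A11_0` example), and `PhoneInterval`, `phone_intervals`, `to_samples`, `speaker_of`, and `example_row` are hypothetical names with made-up values.

```python
from typing import NamedTuple

SAMPLE_RATE = 16_000  # Hz, per the `audio` column type


class PhoneInterval(NamedTuple):
    phone: str
    start: float  # seconds
    end: float    # seconds


def phone_intervals(row):
    """Zip the three parallel alignment columns into a list of intervals."""
    phones, starts, ends = row["phones"], row["phone_starts"], row["phone_ends"]
    if not (len(phones) == len(starts) == len(ends)):
        raise ValueError(f"{row['utt_id']}: alignment columns differ in length")
    intervals = [PhoneInterval(p, s, e) for p, s, e in zip(phones, starts, ends)]
    for iv in intervals:
        if iv.end < iv.start:
            raise ValueError(f"{row['utt_id']}: negative-length interval {iv}")
    return intervals


def to_samples(seconds, sample_rate=SAMPLE_RATE):
    """Convert a time in seconds to the nearest sample index."""
    return round(seconds * sample_rate)


def speaker_of(utt_id):
    """Assumed convention: the speaker code is the text before the first '_'."""
    return utt_id.split("_", 1)[0]


# Made-up row for illustration only; values are not taken from the corpus.
example_row = {
    "utt_id": "A11_0",
    "phones": ["[SIL]", "l", "[SIL]"],
    "phone_starts": [0.0, 0.25, 0.5],
    "phone_ends": [0.25, 0.5, 1.0],
}
intervals = phone_intervals(example_row)
sample_spans = [(to_samples(iv.start), to_samples(iv.end)) for iv in intervals]
print(sample_spans)  # [(0, 4000), (4000, 8000), (8000, 16000)]
```

The resulting sample spans can be used to slice the decoded waveform for each phone. How the waveform is exposed depends on the loader and its version, so that step is left out here.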
## Phone inventory
Phones are IPA with Mandarin tone diacritics, e.g.
`lː`, `y˥˩`, `ʂ˘`, `ɻ̩˥˩`, `a˧˥˘`. Silence and pauses are marked with
`[SIL]` intervals, which are kept in the alignment so boundary models can
learn from them.
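For tasks that do not need silence or tone, the tokens can be post-processed. The sketch below drops `[SIL]` intervals and separates tone letters from the rest of a phone token. It assumes tones are written with the Chao tone letters U+02E5 to U+02E9 (as in `y˥˩`) and leaves every other mark, such as `ː` or `˘`, attached to the base. `split_tone` and `speech_only` are hypothetical helper names.

```python
SILENCE = "[SIL]"
# Chao tone letters U+02E5..U+02E9 (extra-high through extra-low).
TONE_LETTERS = frozenset("\u02e5\u02e6\u02e7\u02e8\u02e9")


def split_tone(phone):
    """Return (phone without tone letters, tone letters in original order)."""
    base = "".join(ch for ch in phone if ch not in TONE_LETTERS)
    tone = "".join(ch for ch in phone if ch in TONE_LETTERS)
    return base, tone


def speech_only(phones, starts, ends):
    """Drop [SIL] intervals, keeping the three lists aligned."""
    kept = [(p, s, e) for p, s, e in zip(phones, starts, ends) if p != SILENCE]
    return [list(col) for col in zip(*kept)] if kept else [[], [], []]


print(split_tone("y\u02e5\u02e9"))  # ('y', '˥˩')
```

Keeping `[SIL]` in the stored alignment and filtering at load time means the same data can serve both boundary models, which need the pauses, and phone recognizers that ignore them.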
## License
Released under the **Apache 2.0** license, matching the original THCHS-30
release.
## Citation
```bibtex
@misc{THCHS30_2015,
title={THCHS-30 : A Free Chinese Speech Corpus},
author={Dong Wang and Xuewei Zhang and Zhiyong Zhang},
year={2015},
url={http://arxiv.org/abs/1512.01882}
}
```
Provider:
changelinglab



