syvai/nst-ftspeech-common-kanade25hz
Hugging Face · Updated 2026-02-12 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/syvai/nst-ftspeech-common-kanade25hz
Description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  - name: audio_tokens
    sequence:
      dtype: int64
  - name: global_embedding
    sequence:
      dtype: float64
  - name: speaker_id
    dtype: string
  - name: duration
    dtype: float64
  - name: source
    dtype: string
  splits:
  - name: train
    num_examples: 1181228
configs:
- config_name: default
  data_files:
  - split: train
    path: data/combined.parquet
license: cc-by-4.0
language:
- da
---
# NST + FTSpeech - Kanade 25Hz Tokenized
Danish speech dataset tokenized with [Kanade](https://huggingface.co/frothywater/kanade-25hz-clean) (25 tokens/sec, codebook size 12800).
## Sources
- **alexandrainst/nst-da**: 236738 samples (315.5h)
- **alexandrainst/ftspeech**: 944490 samples (1391.2h)
- **Total**: 1181228 samples (1706.7h)
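As a rough consistency check on the figures above, the per-source counts and hours can be summed and turned into an average clip length. This is only a sketch: the numbers are copied from the list, and the `SOURCES` name is introduced here for illustration.

```python
# Sample counts and total hours, copied from the Sources list.
SOURCES = {
    "alexandrainst/nst-da": (236738, 315.5),
    "alexandrainst/ftspeech": (944490, 1391.2),
}

# Totals across both sources (hours rounded to one decimal, as in the list).
total_samples = sum(count for count, _ in SOURCES.values())
total_hours = round(sum(hours for _, hours in SOURCES.values()), 1)

# Mean clip length in seconds for each source.
mean_clip_seconds = {
    name: hours * 3600 / count for name, (count, hours) in SOURCES.items()
}

print(total_samples, total_hours)
for name, seconds in mean_clip_seconds.items():
    print(f"{name}: {seconds:.2f} s per sample on average")
```

Both averages come out at roughly 5 seconds per sample, which sits comfortably inside the 1-15 s duration filter described under Processing.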
## Columns
| Column | Description |
|--------|-------------|
| `text` | Original text |
| `audio_tokens` | Kanade content token indices (25 tokens/sec, vocab 12800) |
| `global_embedding` | 128-dim speaker/style embedding |
| `speaker_id` | Speaker identifier |
| `duration` | Audio duration in seconds |
| `source` | Dataset origin (`nst-da` or `ftspeech`) |
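The column descriptions above imply a few checks that can be run on any row once it is available as a plain dictionary. The following is a minimal sketch, not part of the dataset tooling: `check_row` and the example row are hypothetical, and it assumes token indices are zero-based (so valid values are 0 to 12799), which the card does not state explicitly.

```python
EMBEDDING_DIM = 128                      # from the `global_embedding` description
VOCAB_SIZE = 12800                       # from the `audio_tokens` description
KNOWN_SOURCES = {"nst-da", "ftspeech"}   # from the `source` description


def check_row(row):
    """Return a list of problems found in one row; an empty list means none."""
    problems = []
    if not isinstance(row.get("text"), str):
        problems.append("text is not a string")
    # Assumption: indices are zero-based, so valid tokens lie in [0, VOCAB_SIZE).
    if any(not 0 <= token < VOCAB_SIZE for token in row.get("audio_tokens", [])):
        problems.append("audio_tokens contains an out-of-range index")
    if len(row.get("global_embedding", [])) != EMBEDDING_DIM:
        problems.append("global_embedding is not 128-dimensional")
    if row.get("source") not in KNOWN_SOURCES:
        problems.append("source is not 'nst-da' or 'ftspeech'")
    if not row.get("duration", 0.0) > 0:
        problems.append("duration is not positive")
    return problems


# Synthetic row for illustration only; the values are not taken from the dataset.
example_row = {
    "text": "eksempel",
    "audio_tokens": [0, 41, 12799],
    "global_embedding": [0.0] * 128,
    "speaker_id": "spk_0001",
    "duration": 2.0,
    "source": "nst-da",
}
print(check_row(example_row))
```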
## Processing
- Audio filtered to clips of 1-15 s duration
- Resampled to 24 kHz before Kanade encoding
- Token sequences clipped to a length of `int((duration + 0.3) * 25)` tokens to remove artifacts
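The duration filter and the token clipping rule above can be sketched in a few lines of Python. This is an illustration rather than the original preprocessing script: the function and constant names are hypothetical, the 1 s and 15 s bounds are assumed to be inclusive, and resampling and Kanade encoding are left out.

```python
TOKEN_RATE_HZ = 25      # Kanade content tokens per second
PAD_SECONDS = 0.3       # slack added to the duration before clipping
MIN_DURATION_S = 1.0    # assumed inclusive lower bound
MAX_DURATION_S = 15.0   # assumed inclusive upper bound


def keep_clip(duration):
    """Return True if a clip's duration passes the 1-15 s filter."""
    return MIN_DURATION_S <= duration <= MAX_DURATION_S


def max_token_count(duration):
    """Maximum number of tokens kept for a clip of the given duration."""
    return int((duration + PAD_SECONDS) * TOKEN_RATE_HZ)


def clip_tokens(tokens, duration):
    """Drop any tokens beyond the allowed length for this duration."""
    return list(tokens[:max_token_count(duration)])


# A 2-second clip keeps at most int(2.3 * 25) = 57 tokens.
raw_tokens = list(range(100))
print(len(clip_tokens(raw_tokens, 2.0)))
```

Because the limit is truncated with `int`, the 0.3 s of slack adds at most 7 tokens on top of the nominal `duration * 25`.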
Provider:
syvai



