khursanirevo/multiturn_ks_-yIVMNm2ZQU
Source: Hugging Face. Updated 2026-04-09. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/khursanirevo/multiturn_ks_-yIVMNm2ZQU
Resource description:
---
license: cc-by-4.0
task_categories:
- automatic-speech-recognition
- speaker-diarization
language:
- en
- ms
- zh
- ru
- id
- ar
- ja
- ko
multilinguality:
- highly_multilingual
size_categories:
- n<1K
---
# khursanirevo/multiturn_ks_-yIVMNm2ZQU
## Dataset Description
Multiturn dialogue dataset with speaker-separated stereo audio and multi-language transcripts.
### Features
- **Audio**: Stereo audio with speaker separation (speaker 0 = left channel, speaker 1 = right channel)
- **Segments**: Speaker turn-level annotations with timestamps
- **Multi-language**: Transcripts in 9 languages (en, ms, zh-Hans, zh-Hant, ru, id, ar, ja, ko)
- **Chunking**: 30-second chunks with 0.5s overlap
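The chunking scheme can be sketched as a sliding window. This is a minimal illustration, not the dataset's actual preprocessing code: the function name `chunk_boundaries` is hypothetical, and it assumes that "0.5s overlap" means each chunk starts 0.5 s before the previous one ends (a stride of 29.5 s).

```python
def chunk_boundaries(total_duration, chunk_len=30.0, overlap=0.5):
    """Return (start, end) times in seconds for overlapping chunks.

    Assumption: consecutive chunks share `overlap` seconds, so the
    window advances by `chunk_len - overlap` each step.
    """
    if overlap >= chunk_len:
        raise ValueError("overlap must be smaller than chunk_len")
    stride = chunk_len - overlap
    bounds = []
    start = 0.0
    while start < total_duration:
        end = min(start + chunk_len, total_duration)
        bounds.append((start, end))
        if end >= total_duration:
            break  # the last chunk reached the end of the recording
        start += stride
    return bounds


# A 70-second recording yields three chunks; the last one is shorter.
print(chunk_boundaries(70.0))
# [(0.0, 30.0), (29.5, 59.5), (59.0, 70.0)]
```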
### Columns
- `audio`: Playable stereo audio (24kHz)
- `sentence`: Full transcript for the chunk (English)
- `segments`: JSON list of speaker turns with speaker, start, end, text fields
- `total_speakers`: Number of speakers in chunk (typically 2)
- `sentence_ms`, `sentence_en`, etc.: Transcripts in each language
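As an illustration of the `segments` column, the sketch below parses a JSON string with the four documented fields (`speaker`, `start`, `end`, `text`) and sums the speaking time per speaker. The example rows are made up for demonstration. Treating `speaker` as an integer and `start`/`end` as seconds is an assumption, and `speaking_time` is a hypothetical helper, not part of the dataset.

```python
import json
from collections import defaultdict


def speaking_time(segments_json):
    """Sum the duration of all turns per speaker from a `segments` JSON string."""
    segments = json.loads(segments_json)
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


# Made-up example in the documented shape (not real dataset content).
example = json.dumps([
    {"speaker": 0, "start": 0.0, "end": 2.5, "text": "Hello."},
    {"speaker": 1, "start": 2.5, "end": 4.0, "text": "Hi there."},
    {"speaker": 0, "start": 4.0, "end": 5.0, "text": "How are you?"},
])

print(speaking_time(example))
# {0: 3.5, 1: 1.5}
```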
### Usage
```python
import json

from datasets import load_dataset

# Assumes the default "train" split; without `split`, load_dataset
# returns a DatasetDict that cannot be indexed with an integer.
dataset = load_dataset("khursanirevo/multiturn_ks_-yIVMNm2ZQU", split="train")

# Access audio and segments
chunk = dataset[0]
audio = chunk["audio"]  # Stereo audio array
segments = json.loads(chunk["segments"])  # Speaker turns
for seg in segments:
    speaker = seg["speaker"]
    text = seg["text"]
    print(f"Speaker {speaker}: {text}")
```
### Speaker Detection
Speakers are detected using RMS energy analysis:
- Channel 0 (left): Speaker 0
- Channel 1 (right): Speaker 1
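The card does not describe the RMS analysis in detail, so the following is only a rough sketch of the general idea: compute the root-mean-square energy of each channel and treat the louder channel as the active speaker. The helper names, the `(channels, samples)` array layout, and the silence threshold are all assumptions made for illustration.

```python
import numpy as np


def channel_rms(stereo):
    """RMS energy per channel for an array shaped (channels, samples)."""
    stereo = np.asarray(stereo, dtype=np.float64)
    return np.sqrt(np.mean(stereo ** 2, axis=1))


def dominant_speaker(stereo, silence_threshold=1e-3):
    """Return the index of the louder channel, or None if both are near silent.

    Channel 0 (left) maps to speaker 0 and channel 1 (right) to speaker 1.
    The threshold value is an arbitrary illustrative choice.
    """
    rms = channel_rms(stereo)
    if rms.max() < silence_threshold:
        return None
    return int(np.argmax(rms))


# Synthetic one-second example at 24 kHz: tone on the left, silence on the right.
sr = 24000
t = np.arange(sr) / sr
left = 0.5 * np.sin(2 * np.pi * 220 * t)
right = np.zeros_like(t)
stereo = np.stack([left, right])

print(dominant_speaker(stereo))  # 0
```

In practice this would be applied to short windows of the decoded audio rather than to a whole chunk, so that speaker activity can be tracked over time.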
### Languages
Supported languages:
- English (en)
- Malay (ms)
- Chinese Simplified (zh-Hans)
- Chinese Traditional (zh-Hant)
- Russian (ru)
- Indonesian (id)
- Arabic (ar)
- Japanese (ja)
- Korean (ko)
## Dataset Statistics
- Total chunks: 204
- Max chunk duration: 30.0s
- Overlap: 0.5s
- Languages: en, ms
## Source
Created from a YouTube video, with dialogue separation performed by the DialogueSidon model.
## License
CC-BY-4.0
Provider:
khursanirevo



