
i4ds/spc_r_segmented

Source: Hugging Face. Updated 2026-02-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/i4ds/spc_r_segmented
Resource summary:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: duration
    dtype: float64
  - name: audio
    dtype: audio
  - name: text
    dtype: string
language:
- de
task_categories:
- automatic-speech-recognition
source_datasets:
- i4ds/spc_r
license: cc-by-4.0
---

# i4ds/spc_r_segmented

Diarized and segmented speech dataset derived from [i4ds/spc_r](https://huggingface.co/datasets/i4ds/spc_r).

## Description

Each row is a merged speech segment belonging to a single speaker. The source audio and SRT subtitles from `i4ds/spc_r` were processed with the following pipeline:

1. **Diarization** -- [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) assigned speaker labels to each SRT segment based on temporal overlap.
2. **Merging** -- Consecutive SRT segments from the same speaker were merged when the silence gap between them was below a threshold (default 1.0s) and the resulting duration stayed within bounds (default 10--20s).
3. **Slicing** -- The merged time ranges were used to slice the original audio waveform. Each segment is encoded as FLAC.

## Columns

| Column     | Type    | Description                                      |
|------------|---------|--------------------------------------------------|
| `id`       | string  | Unique identifier (`row{NNNNN}_seg{NNN}`)        |
| `duration` | float64 | Segment duration in seconds                      |
| `audio`    | audio   | FLAC audio for the segment                       |
| `text`     | string  | Merged transcript text from the SRT segments     |

## Usage

```python
from datasets import load_dataset

ds = load_dataset("i4ds/spc_r_segmented")
print(ds["train"][0])
```
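
The diarization and merging steps described in the pipeline can be illustrated with a minimal sketch. This is not the code used to build the dataset: the `Segment` class, the `assign_speaker` and `merge_segments` helpers, and their parameter names are hypothetical, and the diarization turns are assumed to be already available as plain `(start, end, speaker)` tuples. The card does not say how the lower duration bound (10s) is enforced, so the sketch applies only the gap threshold and the upper bound.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds
    end: float     # seconds
    speaker: str
    text: str


def assign_speaker(seg_start, seg_end, turns):
    """Pick the speaker whose diarization turns overlap the segment the most.

    `turns` is an iterable of (start, end, speaker) tuples (assumed format).
    Returns None when no turn overlaps the segment.
    """
    overlap = {}
    for t_start, t_end, speaker in turns:
        shared = min(seg_end, t_end) - max(seg_start, t_start)
        if shared > 0:
            overlap[speaker] = overlap.get(speaker, 0.0) + shared
    return max(overlap, key=overlap.get) if overlap else None


def merge_segments(segments, max_gap=1.0, max_duration=20.0):
    """Merge consecutive same-speaker segments.

    Two neighbours are merged when the silence between them is below
    `max_gap` and the merged span does not exceed `max_duration`.
    """
    merged = []
    for seg in segments:
        last = merged[-1] if merged else None
        if (
            last is not None
            and last.speaker == seg.speaker
            and seg.start - last.end < max_gap
            and seg.end - last.start <= max_duration
        ):
            merged[-1] = Segment(
                last.start, seg.end, last.speaker, f"{last.text} {seg.text}"
            )
        else:
            merged.append(seg)
    return merged
```

Under these assumptions, each merged `Segment` carries the time range that the slicing step would cut from the original waveform, and its `end - start` corresponds to the `duration` column.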
Provider:
i4ds