
richiejp/aec-challenge-16k

Source: Hugging Face. Updated: 2026-04-16. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/richiejp/aec-challenge-16k
Description:
---
license: cc-by-4.0
task_categories:
- audio-classification
tags:
- speech
- acoustic-echo-cancellation
- aec-challenge
- icassp-2022
pretty_name: Microsoft AEC Challenge 16kHz (FLAC)
---

# Microsoft AEC Challenge 16kHz

[Microsoft AEC Challenge](https://github.com/microsoft/AEC-Challenge) dataset converted from 16kHz WAV to **FLAC** (lossless compression) and packed into tar shards.

Source: the `datasets/` directory of the microsoft/AEC-Challenge Git LFS repo. Covers all challenge years (2021, ICASSP 2022, ICASSP 2023).

## Structure

### Real recordings

Paired loopback (far-end reference) and microphone recordings from real devices.

- `real/` — 37,578 files, single playback real recordings
- `real_doubled/` — 10,531 files, double playback real recordings

Filenames preserve the GUID-based naming convention: `{GUID}_{scenario}_{signal}.flac`

Scenarios: `farend_singletalk`, `farend_singletalk_with_movement`, `nearend_singletalk`, `doubletalk`, `doubletalk_with_movement`, `sweep`

Signals: `lpb` (loopback/far-end reference), `mic` (microphone recording)

### Synthetic data (10,000 clips)

- `synthetic_echo/` — Echo signal component
- `synthetic_farend/` — Far-end reference signal
- `synthetic_nearend_mic/` — Mixed microphone signal (echo + near-end + noise)
- `synthetic_nearend_speech/` — Clean near-end speech
- `meta.csv` — Synthetic data metadata

### Test sets

- `test_set/` — Original test set (clean + noisy)
- `test_set_icassp2022/` — ICASSP 2022 test set
- `blind_test_set/` — Original blind test set
- `blind_test_set_icassp2022/` — ICASSP 2022 blind test set
- `blind_test_set_icassp2023/` — ICASSP 2023 blind test set
- `blind_test_set_interspeech2021/` — Interspeech 2021 blind test set

## Usage

```python
from huggingface_hub import snapshot_download
import tarfile
from pathlib import Path

# Download
local = snapshot_download("richiejp/aec-challenge-16k", local_dir="/data/aec", repo_type="dataset")

# Extract all shards
for tar_path in sorted(Path(local).rglob("*.tar")):
    with tarfile.open(tar_path) as tf:
        tf.extractall(tar_path.parent)
```

## Source

Original data from Microsoft's AEC Challenge:

- https://github.com/microsoft/AEC-Challenge
- License: CC-BY-4.0 (see original repo for details)
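
The `{GUID}_{scenario}_{signal}.flac` convention described for the real recordings can be used to match each loopback file with its microphone file. The following is a minimal sketch, not part of the dataset or of any library: `parse_name` and `pair_recordings` are hypothetical helper names, and the parser assumes that a GUID may itself contain underscores, so it reads the name from the right. Files that do not follow the convention are skipped.

```python
from collections import defaultdict
from pathlib import Path

# Scenario and signal names as listed in the dataset card.
# Sorted longest first so the most specific scenario suffix is tried first.
SCENARIOS = sorted(
    [
        "farend_singletalk",
        "farend_singletalk_with_movement",
        "nearend_singletalk",
        "doubletalk",
        "doubletalk_with_movement",
        "sweep",
    ],
    key=len,
    reverse=True,
)
SIGNALS = ("lpb", "mic")


def parse_name(filename):
    """Split '{GUID}_{scenario}_{signal}.flac' into (guid, scenario, signal).

    Returns None when the name does not match the convention.
    """
    stem = Path(filename).stem
    head, sep, signal = stem.rpartition("_")
    if not sep or signal not in SIGNALS:
        return None
    for scenario in SCENARIOS:
        suffix = "_" + scenario
        # Require a non-empty GUID in front of the scenario.
        if head.endswith(suffix) and len(head) > len(suffix):
            return head[: -len(suffix)], scenario, signal
    return None


def pair_recordings(paths):
    """Map (guid, scenario) to (lpb_path, mic_path) for complete pairs only."""
    groups = defaultdict(dict)
    for path in paths:
        parsed = parse_name(path)
        if parsed is None:
            continue
        guid, scenario, signal = parsed
        groups[(guid, scenario)][signal] = path
    return {
        key: (found["lpb"], found["mic"])
        for key, found in groups.items()
        if "lpb" in found and "mic" in found
    }
```

After extraction, something like `pair_recordings(Path("/data/aec").rglob("*.flac"))` would collect the pairs, assuming the shards unpack into the download directory used in the usage snippet. Keys without both signals are dropped, which may be relevant for scenarios where only one of the two files is present.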
Provider:
richiejp