Blinorot/ALARM-Corpora

Name: Blinorot/ALARM-Corpora
Creator: Blinorot
Published: 2026-03-11 21:39:53
License: No description provided

Hugging Face, updated 2026-03-11, indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/Blinorot/ALARM-Corpora

Description:

---
dataset_info:
  features:
  - name: audio_description
    dtype: string
  - name: unique_id_1
    dtype: string
  - name: unique_id_2
    dtype: string
  - name: dataset_index
    dtype: int64
  - name: context
    dtype: string
  - name: llm_answer_with_context
    dtype: string
  - name: llm_answer_no_context
    dtype: string
  - name: dataset_name
    dtype: string
  splits:
  - name: raw_train
    num_bytes: 12059336672
    num_examples: 6002682
  - name: raw_validation
    num_bytes: 23524235
    num_examples: 4158
  - name: train
    num_bytes: 10995612354
    num_examples: 5447863
  - name: validation
    num_bytes: 1067155915
    num_examples: 554842
  download_size: 10539229221
  dataset_size: 24145629176
configs:
- config_name: default
  data_files:
  - split: raw_train
    path: data/raw_train-*
  - split: raw_validation
    path: data/raw_validation-*
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
tags:
- audio
- audio-understanding
- LLM
- RLM
size_categories:
- 1M<n<10M
---

# Dataset Card for ALARM-Corpora

## Dataset Summary

This is the dataset used in the [ALARM: Audio-Language Alignment for Reasoning Models](https://arxiv.org/abs/2603.09556) paper. It consists of Audio Captions and Reasoning Language Model responses rephrased to sound like they were provided by an audio-understanding model. For more details regarding the dataset and the instructions for obtaining audio files, please refer to our [GitHub](https://github.com/Blinorot/ALARM).
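The card metadata lists eight columns and four splits, so a quick way to get oriented is to stream a split and check each record against that schema. The following is a minimal sketch, not the authors' tooling: `schema_problems` and `stream_split` are hypothetical helper names, the check assumes every field is non-null, and `stream_split` assumes the Hugging Face `datasets` package is installed and the repository is reachable.

```python
from typing import Any, Dict, Iterator, List

# Column names and dtypes copied from `dataset_info.features` in the card
# (string -> str, int64 -> int).
EXPECTED_FEATURES: Dict[str, type] = {
    "audio_description": str,
    "unique_id_1": str,
    "unique_id_2": str,
    "dataset_index": int,
    "context": str,
    "llm_answer_with_context": str,
    "llm_answer_no_context": str,
    "dataset_name": str,
}

# Split names copied from `dataset_info.splits` in the card.
SPLITS = ("raw_train", "raw_validation", "train", "validation")


def schema_problems(record: Dict[str, Any]) -> List[str]:
    """Return readable mismatches between one record and the card's schema.

    Hypothetical helper; assumes fields are never null.
    """
    problems = []
    for name, expected in EXPECTED_FEATURES.items():
        if name not in record:
            problems.append(f"missing column: {name}")
        elif not isinstance(record[name], expected):
            got = type(record[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {got}")
    for name in sorted(set(record) - set(EXPECTED_FEATURES)):
        problems.append(f"unexpected column: {name}")
    return problems


def stream_split(split: str = "validation") -> Iterator[Dict[str, Any]]:
    """Lazily iterate over one split instead of downloading everything first."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported here so the schema check above works without the package.
    from datasets import load_dataset

    return iter(load_dataset("Blinorot/ALARM-Corpora", split=split, streaming=True))
```

For example, `schema_problems(next(stream_split("raw_validation")))` would report any mismatch for the first record of the smallest split. Streaming is used here because the card reports a download size of roughly 10.5 GB.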
## Dataset Statistics

| Audio Type  | # Elements (M)     | # Hours (K)          | # Unique Prompts (M) |
| ----------- | ------------------ | -------------------- | -------------------- |
| Speech      | 2.91 / 2.60 / 0.29 | 9.88 / 8.83 / 0.98   | 1.40 / 1.27 / 0.16   |
| Sound       | 2.01 / 1.80 / 0.20 | 5.54 / 4.98 / 0.55   | 0.36 / 0.33 / 0.06   |
| Music       | 0.59 / 0.53 / 0.06 | 2.45 / 2.21 / 0.24   | 0.16 / 0.14 / 0.03   |
| Instruction | 0.57 / 0.56 / 0.01 | 1.02 / 1.01 / 0.01   | 0.57 / 0.56 / 0.01   |
| **Total**   | 6.08 / 5.49 / 0.56 | 18.89 / 17.03 / 1.78 | 2.49 / 2.30 / 0.26   |

## Citation

If you use this work, please cite:

```bibtex
@article{grinberg2026alarm,
  title={ALARM: Audio-Language Alignment for Reasoning Models},
  author={Grinberg, Petr and Shahmohammadi, Hassan},
  journal={arXiv preprint arXiv:2603.09556},
  year={2026}
}
```

## License

All captions and metadata preserve the licenses of the original datasets. Our added LLM responses are provided under CC BY-NC 4.0.

Provider:

Blinorot
