WhissleAI/omni-400k-with-meta
Source: Hugging Face. Updated 2025-10-25; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/WhissleAI/omni-400k-with-meta
Description:
The VoiceAssistant-400K dataset contains 470,054 question-answer pairs with audio recordings, intended for voice assistant training and research. The dataset is about 162 GB, with a download size of about 219 GB, and is licensed under Apache 2.0. Its features are split_name (dataset split identifier), index (unique example identifier), round (conversation round number), question (text transcription of the question), question_audio (audio recording of the question, 16 kHz WAV), answer (text response), and answer_snac (SNAC-encoded representation of the answer). The processing pipeline covers audio feature extraction, demographic and emotion analysis, entity and intent annotation, and a batch-processing architecture. The final output schema includes the annotated question and answer texts, the audio duration, and the source index. The dataset can be loaded and processed with the provided Python code, and filtered on metadata such as age group, emotion, and intent.
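The loading and metadata filtering described above might look roughly like the sketch below. It is not the dataset's own provided code. The `train` split name and the metadata keys used in the usage note (`emotion`, `intent`) are assumptions for illustration; the listing does not give the actual column names, so check the dataset card before relying on them. Streaming is used so that the full download is not required up front.

```python
from typing import Any, Dict, Iterable, Iterator

DATASET_ID = "WhissleAI/omni-400k-with-meta"


def stream_examples(split: str = "train") -> Iterator[Dict[str, Any]]:
    """Stream examples from the Hugging Face Hub without a full download.

    The split name "train" is an assumption; it is not stated in the listing.
    """
    # Imported here so the filtering helper below works without the library.
    from datasets import load_dataset

    return iter(load_dataset(DATASET_ID, split=split, streaming=True))


def filter_by_metadata(
    examples: Iterable[Dict[str, Any]], **wanted: Any
) -> Iterator[Dict[str, Any]]:
    """Yield only the examples whose fields equal every requested value.

    The keyword names are hypothetical metadata keys; an example that lacks
    a requested key is skipped.
    """
    for example in examples:
        if all(example.get(key) == value for key, value in wanted.items()):
            yield example
```

A call such as `filter_by_metadata(stream_examples(), emotion="happy", intent="question")` would then lazily yield matching examples, provided the dataset exposes columns with those names and values.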
Provider:
WhissleAI



