
typhoon-audio-preview-data

ModelScope Community · Updated 2025-08-15 · Indexed 2025-05-24
Download link:
https://modelscope.cn/datasets/scb10x/typhoon-audio-preview-data
Resource description:
# Typhoon Audio Preview Data

## Overview

- This dataset is for aligning speech/audio representations with textual representations. It consists of {audio, instruction, response} examples in both Thai and English. This repository provides the {instruction, response} pairs that we generated for Typhoon-Audio training. We do not own the original data sources (e.g., CommonVoice and LibriSpeech); you can download these datasets from the original sources, or contact `{potsawee, kunat}@scb10x.com`.
- Please refer to our technical report for more information about the dataset: https://arxiv.org/abs/2409.10999

## Data Splits

1. **Pretrained**: 1.8M examples consisting of ASR and Audio Captioning data
2. **SFT**: 665K examples consisting of a range of audio tasks

## Attributes

- `path`: path to the local wav file; change the directory to match your machine.
- `instruction`: text instruction (which can be null, i.e., the instruction is in the audio)
- `response`: target answer

## Citation

If you find this work useful, please consider citing:

```
@article{manakul2024enhancing,
  title={Enhancing low-resource language and instruction following capabilities of audio language models},
  author={Manakul, Potsawee and Sun, Guangzhi and Sirichotedumrong, Warit and Tharnpipitchai, Kasima and Pipatanakul, Kunat},
  journal={arXiv preprint arXiv:2409.10999},
  year={2024}
}
```
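The `path` attribute points to a directory on the authors' machine, and `instruction` may be null, so each record usually needs light preparation before use. The following is a minimal sketch of that step, not the authors' own tooling. It assumes each record is a dictionary with the three documented keys; the helper names (`rebase_path`, `prepare_example`), the directory names, and the sample record are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath


def rebase_path(path: str, old_root: str, new_root: str) -> str:
    """Swap the leading directory of `path` from old_root to new_root.

    Paths that do not start with old_root are returned unchanged.
    """
    try:
        relative = PurePosixPath(path).relative_to(old_root)
    except ValueError:
        return path
    return str(PurePosixPath(new_root) / relative)


def prepare_example(record: dict, old_root: str, new_root: str) -> dict:
    """Return a copy of the record with `path` pointing at the local audio directory."""
    return {
        "path": rebase_path(record["path"], old_root, new_root),
        # A null instruction means the instruction is spoken inside the audio itself.
        "instruction": record.get("instruction"),
        "response": record["response"],
    }


# Hypothetical record and directories, for illustration only.
sample = {
    "path": "/data/audio/clip_0001.wav",
    "instruction": None,
    "response": "hello",
}
prepared = prepare_example(sample, "/data/audio", "/home/me/typhoon_audio")
print(prepared["path"])
```

Leaving `instruction` as `None` rather than substituting an empty string lets downstream code tell text-instruction examples apart from those whose instruction is carried in the audio.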

Provider: maas
Created: 2025-05-23