typhoon-audio-preview-data

Name: typhoon-audio-preview-data
Creator: maas
Published: 2025-08-15 16:32:42
License: No description available

ModelScope community. Updated 2025-08-15; indexed 2025-05-24.

Download link:

https://modelscope.cn/datasets/scb10x/typhoon-audio-preview-data


Description:

# Typhoon Audio Preview Data

## Overview

- This dataset is for aligning speech/audio representations with textual representations. It consists of {audio, instruction, response} examples in both Thai and English. This repository provides {instruction, response} pairs that we generated for Typhoon-Audio training. We do not own the original data sources (e.g., CommonVoice, LibriSpeech, etc), and you can download these datasets from the original sources, or contact `{potsawee, kunat}@scb10x.com`
- Please refer to our technical report for more information about the dataset: https://arxiv.org/abs/2409.10999

## Data Splits

1. **Pretrained**: 1.8M examples consisting of ASR and Audio Captioning data
2. **SFT**: 665K examples consisting of a range of audio tasks

## Attributes

- `path`: path to the local wav file -- please change the directory on your machine.
- `instruction`: text instruction (which can be null, i.e., the instruction is in the audio)
- `response`: target answer

## Citation

If you find this work useful, please consider citing:

```
@article{manakul2024enhancing,
  title={Enhancing low-resource language and instruction following capabilities of audio language models},
  author={Manakul, Potsawee and Sun, Guangzhi and Sirichotedumrong, Warit and Tharnpipitchai, Kasima and Pipatanakul, Kunat},
  journal={arXiv preprint arXiv:2409.10999},
  year={2024}
}
```
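The note on the `path` attribute (change the directory to match your machine) and the nullable `instruction` field can be handled with a small preprocessing step. The following is a minimal sketch, not the authors' tooling: the helper names `rebase_path` and `prepare`, the extra `instruction_in_audio` flag, the directory prefixes, and the two sample records are all made up for illustration, and the actual layout of `path` values in the dataset may differ.

```python
from typing import Optional, TypedDict


class Example(TypedDict):
    # Field names follow the Attributes list of the dataset card.
    path: str
    instruction: Optional[str]
    response: str


def rebase_path(path: str, old_prefix: str, new_prefix: str) -> str:
    """Swap a leading directory prefix; leave the path alone if it does not match."""
    if not path.startswith(old_prefix):
        return path
    return new_prefix + path[len(old_prefix):]


def prepare(example: Example, old_prefix: str, new_prefix: str) -> dict:
    """Point `path` at the local audio directory and flag spoken instructions."""
    return {
        "path": rebase_path(example["path"], old_prefix, new_prefix),
        "instruction": example["instruction"],
        "response": example["response"],
        # A null instruction means the instruction is contained in the audio.
        "instruction_in_audio": example["instruction"] is None,
    }


# Hypothetical records, invented only to exercise the helpers above.
records: list[Example] = [
    {
        "path": "/data/audio/clip_0001.wav",
        "instruction": "Transcribe this audio.",
        "response": "hello world",
    },
    {
        "path": "/data/audio/clip_0002.wav",
        "instruction": None,
        "response": "The answer to the spoken question.",
    },
]

prepared = [prepare(r, "/data/audio/", "/home/me/typhoon/") for r in records]
```

Paths that do not start with the expected prefix are returned unchanged, so a mismatch shows up as a missing file at load time rather than as a silently rewritten path.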


Provider:

maas

Created:

2025-05-23
