S2S-Arena
Source: ModelScope community (updated 2025-12-05, indexed 2025-03-15)
Download link: https://modelscope.cn/datasets/FreedomIntelligence/S2S-Arena
# S2S-Arena Dataset
This repository hosts the **S2S-Arena** dataset. It covers four practical domains with 21 tasks, includes 154 instructions of varying difficulty levels, and features a mix of samples from TTS synthesis, human recordings, and existing audio datasets.
[Project Page](https://huggingface.co/spaces/FreedomIntelligence/S2S-Arena)
## Introduction
### GitHub Repository
For more information and access to the dataset, please visit the GitHub repository:
[S2S-Arena on GitHub](https://github.com/FreedomIntelligence/S2S-Arena)
### Related Publication
For detailed insights into the dataset’s construction, methodology, and applications, please refer to the accompanying academic publication: [S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information](https://huggingface.co/papers/2503.05085)
## Data Description
The dataset includes labeled audio files, textual emotion annotations, language translations, and task-specific metadata, supporting fine-grained analysis and application in machine learning. Each entry follows this format:
```json
{
"id": "emotion_audio_0",
"input_path": "./emotion/audio_0.wav",
"text": "[emotion: happy]Kids are talking by the door",
"task": "Emotion recognition and expression",
"task_description": "Can the model recognize emotions and provide appropriate responses based on different emotions?",
"text_cn": "孩子们在门旁说话",
"language": "English",
"category": "Social Companionship",
"level": "L3"
}
```
1. `id`: Unique identifier for each sample
2. `input_path`: Path to the audio file
3. `text`: English text with emotion annotation
4. `task`: Primary task associated with the data
5. `task_description`: Task description for model interpretability
6. `text_cn`: Chinese translation of the English text
7. `language`: Language of the input
8. `category`: Interaction context category
9. `level`: Difficulty or complexity level of the sample
Some samples also include a `noise` attribute, which indicates that noise has been added to the sample and specifies the type of noise.
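
As an illustration of how an entry in this format might be consumed, here is a minimal Python sketch. It is not part of the official tooling. The helper names (`split_annotation`, `missing_fields`, `noise_type`) are hypothetical, and the assumption that annotations always appear as a single leading `[key: value]` tag is inferred only from the sample above; other entries may differ.

```python
import json
import re

# Sample entry copied from the format shown above.
SAMPLE = json.loads("""
{
  "id": "emotion_audio_0",
  "input_path": "./emotion/audio_0.wav",
  "text": "[emotion: happy]Kids are talking by the door",
  "task": "Emotion recognition and expression",
  "task_description": "Can the model recognize emotions and provide appropriate responses based on different emotions?",
  "text_cn": "孩子们在门旁说话",
  "language": "English",
  "category": "Social Companionship",
  "level": "L3"
}
""")

# The nine fields documented in the list above.
EXPECTED_FIELDS = (
    "id", "input_path", "text", "task", "task_description",
    "text_cn", "language", "category", "level",
)

# Assumption: an annotation is one leading "[key: value]" tag, as in the sample.
TAG_PATTERN = re.compile(r"^\[(\w+):\s*([^\]]+)\]\s*")


def split_annotation(text):
    """Return (key, value, transcript); key and value are None if no tag is found."""
    match = TAG_PATTERN.match(text)
    if match is None:
        return None, None, text
    return match.group(1), match.group(2).strip(), text[match.end():]


def missing_fields(entry):
    """List the documented fields that are absent from an entry."""
    return [name for name in EXPECTED_FIELDS if name not in entry]


def noise_type(entry):
    """Return the optional `noise` attribute, or None when the sample has none."""
    return entry.get("noise")


key, value, transcript = split_annotation(SAMPLE["text"])
print(key, value, transcript)
print(missing_fields(SAMPLE), noise_type(SAMPLE))
```

Splitting the tag from the transcript keeps the paralinguistic label and the spoken content separate, which can be convenient when grouping samples by `category` or `level`.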
## BibTeX
```
@misc{jiang2025s2sarenaevaluatingspeech2speechprotocols,
title={S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information},
author={Feng Jiang and Zhiyu Lin and Fan Bu and Yuhao Du and Benyou Wang and Haizhou Li},
year={2025},
eprint={2503.05085},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.05085},
}
```
Provider: maas
Created: 2025-03-11



