
final-audio-3

ModelScope community. Updated 2025-10-22; indexed 2025-02-22.
Download link:
https://modelscope.cn/datasets/arvin1988/final-audio-3
Description:

# StepEval-Audio-360

## Introduction

We release StepEval-Audio-360 as a new benchmark. It consists of 137 multi-turn Chinese prompts sourced from real users and is designed to evaluate the quality of generated responses across the following dimensions: linguistic competence, emotional intelligence, logical reasoning, creative generation, commonsense knowledge, complex instruction adherence, role-playing capabilities, gaming interactions, real-world task handling, memory retention, safety compliance, educational tutoring, and comprehensive understanding.

Each multi-turn dialogue follows this format:

```
{
    "conversation_id": 7,
    "category": "Language Ability",
    "conversation": [
        {
            "role": "user",
            "text": "用四川话介绍一下。",
            "audio_filename": "conversation_id_7_turn_1.wav"
        },
        {
            "role": "assistant",
            "text": "(四川话)四川,简称“川”或“蜀”,省会是成都,位于中国西南部,地处长江上游,以山地、丘陵和平原为主,地形多样。四川是中国重要的经济、文化、科教和交通中心之一,也是西南地区重要的经济中心和全国重要的增长极。四川历史悠久"
        },
        {
            "role": "user",
            "text": "莎士比亚。",
            "audio_filename": "conversation_id_7_turn_2.wav"
        }
    ]
}
```

## User Manual

* Download the dataset

```
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/datasets/stepfun-ai/StepEval-Audio-360
cd StepEval-Audio-360
git lfs pull
```

* Decompress audio data

```
mkdir audios
tar -xvf audios.tar.gz -C audios
```

* How to use

```
from datasets import load_dataset

dataset = load_dataset("stepfun-ai/StepEval-Audio-360")
dataset = dataset["test"]

for item in dataset:
    conversation_id = item["conversation_id"]
    category = item["category"]
    conversation = item["conversation"]

    # parse multi-turn dialogue data
    for turn in conversation:
        role = turn["role"]
        text = turn["text"]
        # filename of the decompressed audio file; None when the turn has no audio
        audio_filename = turn["audio_filename"]
        if audio_filename is not None:
            print(role, text, audio_filename)
        else:
            print(role, text)
```
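
The loop in the usage snippet prints each turn's audio filename but does not map it to a file on disk. The following is a minimal sketch of one way to do that, together with a per-category tally. It runs on an abridged copy of the sample record rather than the downloaded dataset, so it needs no network access. The helper names `turns_with_audio` and `count_by_category` are hypothetical, and the assumption that the archive extracts its `.wav` files directly into `audios/` is not confirmed by the source; adjust `audio_root` if the layout differs.

```python
from collections import Counter
from pathlib import Path

# Sample record in the documented format (assistant text abridged here).
SAMPLE = {
    "conversation_id": 7,
    "category": "Language Ability",
    "conversation": [
        {"role": "user", "text": "用四川话介绍一下。",
         "audio_filename": "conversation_id_7_turn_1.wav"},
        {"role": "assistant", "text": "(四川话)四川,简称“川”或“蜀”"},
        {"role": "user", "text": "莎士比亚。",
         "audio_filename": "conversation_id_7_turn_2.wav"},
    ],
}


def turns_with_audio(item, audio_root="audios"):
    """Return (position, role, path) for every turn that references audio.

    `position` is the 1-based index of the turn within the conversation list.
    Assumption: the .wav files sit directly under `audio_root`.
    """
    rows = []
    for position, turn in enumerate(item["conversation"], start=1):
        # .get() covers both a missing key (raw JSON) and an explicit None.
        name = turn.get("audio_filename")
        if name is None:
            continue
        rows.append((position, turn["role"], Path(audio_root) / name))
    return rows


def count_by_category(items):
    """Count conversations per category label."""
    return Counter(item["category"] for item in items)


if __name__ == "__main__":
    for position, role, path in turns_with_audio(SAMPLE):
        print(position, role, path)
    print(count_by_category([SAMPLE]))
```

With the real dataset, the same helpers can be applied to each `item` yielded by the `dataset["test"]` loop; checking `path.exists()` before opening a file is a cheap way to confirm the extraction layout.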
Provider: maas
Created: 2025-02-16