EQ-bench_es

Name: EQ-bench_es
Creator: maas
Published: 2025-12-05 16:38:32
License: No description provided

ModelScope community. Updated 2025-12-05; indexed 2025-06-21.

Download link:

https://modelscope.cn/datasets/BSC-LT/EQ-bench_es

Resource description:

# Dataset Card for EQ Bench Dataset (Spanish Version)

This dataset card documents the Spanish adaptation of the EQ-Bench benchmark. The original dataset was designed to evaluate emotional reasoning in language models through dialogue-based prompts.

## Dataset Details

### Dataset Description

EQ-Bench (Spanish Version) is a translated and linguistically adapted version of the original EQ-Bench dataset. Its design responds to the need to adapt the emotion detection capabilities of multilingual models, recognizing that the expression and perception of emotions vary significantly across languages.

Key adaptations include:

1) The conversion of adjectival emotion labels into nominal forms to resolve gender agreement ambiguity.
2) The unification of semantically equivalent labels that appeared in different grammatical forms in the original dataset (e.g., *pride/proud* → *orgullo*).
3) The replacement of Anglo-Saxon proper names with culturally appropriate Spanish ones to maintain linguistic coherence.

- **Curated by:** Barcelona Supercomputing Center (BSC)
- **Funded by:** [AINA](https://projecteaina.cat/); [ILENIA](https://proyectoilenia.es/)
- **Language(s) (NLP):** es (Spanish)
- **License:** CC BY 4.0

### Dataset Sources

- **Repository:** [EQ-Bench](https://huggingface.co/datasets/pbevan11/EQ-Bench)
- **Paper:** Paech, S. J. (2023). *EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models*. [arXiv:2312.06281](https://arxiv.org/abs/2312.06281)

## Uses

### Direct Use

This dataset can be used to:

- Evaluate emotional reasoning in Spanish-language LLMs
- Study multilingual performance variation in emotion understanding
- Fine-tune or test classification models for emotion recognition from dialogue

### Out-of-Scope Use

This dataset is not intended for:

- Training general-purpose sentiment analysis systems without considering the emotion-specific context of dialogues.
- Applications such as real-time mental health diagnostics or therapeutic interventions.
- Generating outputs in legal or clinical contexts without human oversight.
- Use in non-Spanish contexts, as the dataset has been culturally and linguistically localized.

## Dataset Structure

Each entry in the dataset follows this structure:

```json
{
  "prompt": "...",               // The full prompt for the model, including the dialogue, emotion options, and formatting instructions
  "reference_answer": {
    "emotion1": "...",           // First emotion label
    "emotion2": "...",           // Second emotion label
    "emotion3": "...",           // Third emotion label
    "emotion4": "...",           // Fourth emotion label
    "emotion1_score": int,       // Annotated scores
    "emotion2_score": int,
    "emotion3_score": int,
    "emotion4_score": int
  },
  "reference_answer_fullscale": {
    "emotion1": "...",           // Same emotion labels
    "emotion2": "...",
    "emotion3": "...",
    "emotion4": "...",
    "emotion1_score": int,       // Resolution scores
    "emotion2_score": int,
    "emotion3_score": int,
    "emotion4_score": int
  }
}
```

## Dataset Creation

### Curation Rationale

The design of this dataset responds to the need to adapt the emotion detection capabilities of multilingual models, recognizing that the expression and perception of emotions vary significantly across languages.

### Source Data

Original EQ-Bench dataset: https://huggingface.co/datasets/pbevan11/EQ-Bench

#### Who are the source data producers?

All credits go to the creator of the original EQ-Bench dataset, Samuel J. Paech.

## Citation

**APA:**

Paech, S. J. (2023). EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models. arXiv. https://arxiv.org/abs/2312.06281

## More information

This work/research has been promoted and financed by the Government of Catalonia through the [Aina project](https://projecteaina.cat/).
This work is funded by the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of the [project ILENIA](https://proyectoilenia.es/) with reference 2022/TL22/00215337.

## Contact point

Language Technologies Unit (langtech@bsc.es) at the Barcelona Supercomputing Center (BSC).
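
As an illustration of the entry layout described under Dataset Structure, the following is a minimal Python sketch that checks an entry against that layout and pairs each emotion label with its score. It assumes the entry has already been parsed into nested dictionaries with integer scores, as the schema suggests; the actual files may store these fields differently. The function names, the example entry, and the `mean_absolute_difference` helper are hypothetical and are not part of the dataset or of the official EQ-Bench scoring procedure.

```python
from typing import Dict

# Field names taken from the documented entry structure.
EMOTION_KEYS = [f"emotion{i}" for i in range(1, 5)]
SCORE_KEYS = [f"emotion{i}_score" for i in range(1, 5)]
ANSWER_BLOCKS = ("reference_answer", "reference_answer_fullscale")


def validate_entry(entry: dict) -> None:
    """Raise ValueError if the entry does not match the documented layout."""
    if not isinstance(entry.get("prompt"), str):
        raise ValueError("'prompt' must be a string")
    for block_name in ANSWER_BLOCKS:
        block = entry.get(block_name)
        if not isinstance(block, dict):
            raise ValueError(f"'{block_name}' must be an object")
        for key in EMOTION_KEYS:
            if not isinstance(block.get(key), str):
                raise ValueError(f"{block_name}.{key} must be a string")
        for key in SCORE_KEYS:
            if not isinstance(block.get(key), int):
                raise ValueError(f"{block_name}.{key} must be an integer")


def reference_scores(entry: dict, block_name: str = "reference_answer") -> Dict[str, int]:
    """Map each emotion label in the chosen block to its score."""
    block = entry[block_name]
    return {block[f"emotion{i}"]: block[f"emotion{i}_score"] for i in range(1, 5)}


def mean_absolute_difference(predicted: Dict[str, int], reference: Dict[str, int]) -> float:
    """Hypothetical comparison: average |predicted - reference| over the reference labels.

    This is an illustrative measure only, not the official EQ-Bench metric.
    """
    total = sum(abs(predicted[label] - score) for label, score in reference.items())
    return total / len(reference)


# Invented example entry; labels and scores are placeholders, not real data.
example_entry = {
    "prompt": "...",
    "reference_answer": {
        "emotion1": "orgullo", "emotion2": "alegría",
        "emotion3": "miedo", "emotion4": "ira",
        "emotion1_score": 7, "emotion2_score": 5,
        "emotion3_score": 0, "emotion4_score": 2,
    },
    "reference_answer_fullscale": {
        "emotion1": "orgullo", "emotion2": "alegría",
        "emotion3": "miedo", "emotion4": "ira",
        "emotion1_score": 8, "emotion2_score": 6,
        "emotion3_score": 0, "emotion4_score": 3,
    },
}

validate_entry(example_entry)
reference = reference_scores(example_entry)
# Invented model output for the same four labels.
predicted = {"orgullo": 6, "alegría": 5, "miedo": 1, "ira": 4}
print(mean_absolute_difference(predicted, reference))
```

Keying the scores by emotion label rather than by position keeps the comparison independent of the order in which a model lists the four emotions.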

Provider:

maas

Created:

2025-06-14
