PITTI/speechmap-assessments-v3

Name: PITTI/speechmap-assessments-v3
Creator: PITTI
Published: 2025-11-14 13:22:37
License: no description provided

Source: Hugging Face. Updated 2025-11-14, indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/PITTI/speechmap-assessments-v3


Description:

---
language:
- en
- zh
- fi
---

# Speechmap collection

Datasets in [this collection](https://huggingface.co/collections/PITTI/speechmap-68626bb47b34b7608d184c4c) are derived from xlr8harder's [Speechmap / llm-compliance](https://github.com/xlr8harder/llm-compliance) project. The data has been indexed slightly differently: some columns have been added and others removed. Refer to the original GitHub repo for the full dataset.

The collection includes:

- 2.4k questions: [speechmap-questions](https://huggingface.co/datasets/PITTI/speechmap-questions)
- 369k responses: [speechmap-responses](https://huggingface.co/datasets/PITTI/speechmap-responses-v3)
- 2.07m LLM-judge assessments: [speechmap-assessments](https://huggingface.co/datasets/PITTI/speechmap-assessments-v3), combining the original LLM assessments from the llm-compliance repo (gpt-4o-2024-11-20), assessments by mistral-small-3.1-24b-instruct-2503 (local, 8-bit), mistral-small-3.2-24b-instruct-2506 (local, 8-bit), Qwen3-Next-80B-A3B-Instruct (local, 8-bit), gemma-3-27b-it (Google API), deepseek-chat-v3.2 (DeepSeek API), and manual annotations.

| Judge Model | Total Assessments | llm-compliance | pitti_compliance | validated [1] |
|:---|---:|---:|---:|---:|
| openai/gpt-4o-2024-11-20 | 368925 | 368925 | 0 | 0 |
| pitti/pap | 3532 | 3532 | 3532 | 0 |
| mistral-small-3.2-24b-instruct-2506-q8 | 368925 | 368925 [2] | 368925 | 323071 |
| gemma-3-27b-it | 368925 | 368925 [2] | 368925 | 318039 |
| Qwen3-Next-80B-A3B-Instruct-8bit | 368925 | 368925 [2] | 368925 | 318097 |
| deepseek-chat-v3.2 | 368925 | 368925 [2] | 368925 | 317762 |
| mistral-small-3.1-24b-instruct-2503 | 217125 | 217125 | 0 | 0 |

[1] Either all LLM judges were unanimous or the classification was validated by a human. The last column of the dataset provides the source of validation.

[2] These responses were classified under the pitti_compliance classification and mapped back into the Speechmap 'compliance' classification as follows:

```python
PITTI_SPEECHMAP_MAPPING = {
    "COMPLETE": "COMPLETE",
    "REFUSAL": "DENIAL",
    "REBUTTAL": "DENIAL",
    "EVASIVE": "EVASIVE",
    "BLOCKED": "ERROR",
    "ERROR": "ERROR"
}
```

## If you have used [speechmap-assessments v1](https://huggingface.co/datasets/PITTI/speechmap-assessments) in a project

Note that some uuids have been updated. For v1, the download of some responses had failed, so the data was scraped from the [Speechmap website](https://speechmap.ai/). After v1, the data was sourced directly from the [llm-compliance repo](https://github.com/xlr8harder/llm-compliance) and new uuids were allocated. The mapping between the old and new uuids is available [here](https://huggingface.co/datasets/PITTI/speechmap-assessments-v2/blob/main/old_assessments_uuid_mapping.json).

## Original project

Make sure to check out the [Speechmap website](https://speechmap.ai/), where you can browse the original dataset in great detail.

## License

Note that data in the original llm-compliance repo covers model outputs that may be subject to individual LLM licenses. Annotations and classifications by LLM judges are published under the same licenses as the original LLMs. Manual annotations and classifications are published under the CC-BY 4.0 license.
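
## Usage sketch

To make footnotes [1] and [2] concrete, here is a minimal Python sketch that applies `PITTI_SPEECHMAP_MAPPING` to a handful of judge labels and checks whether the judges agree. The helper names `to_speechmap_label` and `is_unanimous`, the judge keys, and the example labels are hypothetical and are not taken from the dataset. The README also does not say whether unanimity was evaluated on pitti_compliance labels or on the mapped Speechmap labels, so the sketch shows both and should not be read as the author's validation procedure.

```python
# Mapping copied from footnote [2] of this README.
PITTI_SPEECHMAP_MAPPING = {
    "COMPLETE": "COMPLETE",
    "REFUSAL": "DENIAL",
    "REBUTTAL": "DENIAL",
    "EVASIVE": "EVASIVE",
    "BLOCKED": "ERROR",
    "ERROR": "ERROR",
}


def to_speechmap_label(pitti_label: str) -> str:
    """Map a pitti_compliance label to the Speechmap 'compliance' label.

    Raises KeyError for labels that are not part of the mapping.
    """
    return PITTI_SPEECHMAP_MAPPING[pitti_label]


def is_unanimous(labels: list[str]) -> bool:
    """Return True when the list is non-empty and every label is identical."""
    return len(labels) > 0 and len(set(labels)) == 1


# Hypothetical judge outputs for a single response (illustrative values only).
example_labels = {
    "judge_a": "REFUSAL",
    "judge_b": "REFUSAL",
    "judge_c": "REBUTTAL",
}

pitti_labels = list(example_labels.values())
mapped_labels = [to_speechmap_label(label) for label in pitti_labels]

print("unanimous under pitti_compliance:", is_unanimous(pitti_labels))
print("unanimous after mapping:", is_unanimous(mapped_labels))
```

Because both `REFUSAL` and `REBUTTAL` collapse to `DENIAL`, judges can disagree under pitti_compliance yet agree once their labels are mapped to the Speechmap classification, which is worth keeping in mind when comparing the two label columns.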

