talon-community/PanDomain-V1.5
Source: Hugging Face. Updated: 2025-04-29. Indexed: 2025-11-29.
Download link:
https://hf-mirror.com/datasets/talon-community/PanDomain-V1.5
Description:
---
license: gpl-3.0
---
# **PanDomain-V1.5**
**PanDomain-V1.5** builds on **PanDomain-V1.3** with updates across science, math, code, and writing. This version focuses on improving science coverage, especially medicine, and extends the math dataset with new tasks that have no reasoning blocks. For coding, we've added 1k rows of UI-related samples.
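To inspect the data locally, the following is a minimal sketch that assumes the Hugging Face `datasets` library is installed and that the repository loads with its default configuration. The helper name `load_pandomain` and the `use_mirror` flag are hypothetical, introduced only for illustration; the card itself does not document split or column names, so the sketch prints whatever the Hub returns.

```python
import os

# Repository id as given on this card.
REPO_ID = "talon-community/PanDomain-V1.5"


def load_pandomain(use_mirror: bool = False):
    """Load every available split of the dataset (hypothetical helper)."""
    if use_mirror:
        # Assumption: the mirror is reached through the standard HF_ENDPOINT
        # override. It only takes effect if set before `datasets` /
        # `huggingface_hub` are first imported in this process.
        os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

    # Imported lazily so the endpoint override above is seen.
    from datasets import load_dataset

    return load_dataset(REPO_ID)


if __name__ == "__main__":
    dataset = load_pandomain()
    # Shows the split names, column names, and row counts actually present.
    print(dataset)
```

Printing the returned object first is a deliberate choice: it reveals the real split and column names before any code depends on them.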
---
### **Dataset Breakdown**
| **Domain** | **Size** | **Details** |
|--------------|-------------|--------------------------------------------|
| **Code** | 7.5k | Includes general code tasks with 1k rows added for UI design samples |
| **Safety** | 4k | Includes 1k for handling ambiguous, vague, and broad prompts, and 1k for handling "How are you?" questions, plus the original 2k safety tasks |
| **Writing** | 6k | Includes longform and structured writing tasks |
| **Math** | 8k | Added some math tasks without reasoning blocks |
| **Science** | 5k | Added 2.5k rows focused on medicine and core science fields |
---
**Total Dataset Size: 31.5k**
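As a quick arithmetic check, the sketch below tallies the rounded per-domain figures from the table and computes each domain's share. The variable names are illustrative, and the row counts are the table's rounded values rather than exact counts. Summed this way, the listed domains come to 30.5k, about 1k below the stated total of 31.5k; the card does not explain the gap, which may stem from rounding or from rows not itemized in the table.

```python
# Rounded per-domain sizes copied from the table (in rows).
DOMAIN_SIZES = {
    "Code": 7_500,
    "Safety": 4_000,
    "Writing": 6_000,
    "Math": 8_000,
    "Science": 5_000,
}

# Total as stated on the card (in rows).
STATED_TOTAL = 31_500

# Safety row breakdown from the table: ambiguous prompts, greetings, original.
SAFETY_PARTS = (1_000, 1_000, 2_000)


def tally(sizes):
    """Return the summed size and each domain's fraction of that sum."""
    total = sum(sizes.values())
    shares = {domain: rows / total for domain, rows in sizes.items()}
    return total, shares


total, shares = tally(DOMAIN_SIZES)

for domain, rows in sorted(DOMAIN_SIZES.items(), key=lambda kv: -kv[1]):
    print(f"{domain:<8} {rows:>6} rows  {shares[domain]:.1%}")

print(f"Sum of listed domains: {total}")
print(f"Stated total:          {STATED_TOTAL}")
print(f"Difference:            {STATED_TOTAL - total}")
```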
---
Provider:
talon-community



