
Korean Human Preference Dataset

Source: Snowflake | Updated: 2025-12-10 | Indexed: 2025-12-11
Download link:
https://app.snowflake.com/marketplace/listing/GZTHZCDT1L
Description:
**Overview**
The Human Preference Dataset is a curated Korean-language dataset designed for preference modeling, reward model training, and alignment research. It contains human-annotated preference pairs and evaluations across major knowledge domains.

**Dataset Composition**
A total of **20,000 Korean-language preference samples**, distributed across three major domain categories:

- **Law (Civil, Criminal, Public)** - *6,000 samples*
  Includes human preference judgments on legal reasoning, statutory interpretation, case analysis, and domain-specific scenarios.
- **STEM (Science, Mathematics, Engineering, Technology)** - *8,000 samples*
  Covers problem-solving steps, explanations, reasoning quality, and correctness assessments.
- **Economics, Politics, Social Sciences** - *6,000 samples*
  Provides preference evaluations on analytical responses, argument quality, policy reasoning, and social context understanding.

**Key Features**

- Fully human-annotated Korean preference pairs
- Designed for reward modeling and alignment tasks
- Suitable for use in supervised fine-tuning, RM training, or offline RLHF pipelines
- Cleaned, deduplicated, and formatted for direct model training

**Format**
Provided in standard JSON structure typically used for preference datasets.
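The listing does not document the JSON schema, so the following is only a minimal sketch of how such preference records might be loaded and sanity-checked before reward model training. The field names `domain`, `prompt`, `chosen`, and `rejected`, and the sample records themselves, are assumptions chosen for illustration and are not taken from the dataset.

```python
import json
from collections import Counter

# Hypothetical records: the field names and values below are assumptions,
# not the dataset's documented schema.
SAMPLE_JSON = """
[
  {"domain": "law", "prompt": "Q1", "chosen": "A", "rejected": "B"},
  {"domain": "stem", "prompt": "Q2", "chosen": "C", "rejected": "D"},
  {"domain": "stem", "prompt": "Q3", "chosen": "E", "rejected": "E"}
]
"""

REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def load_preference_pairs(text):
    """Parse a JSON array and keep records usable as preference pairs."""
    pairs = []
    for record in json.loads(text):
        # Skip records where a required field is missing or empty.
        if not all(record.get(field) for field in REQUIRED_FIELDS):
            continue
        # Identical responses carry no preference signal, so drop them.
        if record["chosen"] == record["rejected"]:
            continue
        pairs.append(record)
    return pairs


def count_by_domain(pairs):
    """Count retained pairs per domain label."""
    return Counter(pair.get("domain", "unknown") for pair in pairs)


pairs = load_preference_pairs(SAMPLE_JSON)
domain_counts = count_by_domain(pairs)
print(len(pairs), dict(domain_counts))
```

Run against the real files, the per-domain counts could be compared with the stated split of 6,000 / 8,000 / 6,000 samples, once the actual field names are confirmed from the downloaded data.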
Provider:
Flitto
Created:
2025-12-08