
kaist-ai/Multifaceted-Collection-RM

Source: Hugging Face. Updated 2024-06-07; indexed 2024-06-15.
Download link:
https://hf-mirror.com/datasets/kaist-ai/Multifaceted-Collection-RM
Resource description:
---
dataset_info:
  features:
  - name: input
    dtype: string
  - name: chosen
    dtype: string
  - name: rejected
    dtype: string
  splits:
  - name: train
    num_bytes: 442405885
    num_examples: 64630
  - name: test
    num_bytes: 1916767
    num_examples: 315
  download_size: 256961868
  dataset_size: 444322652
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
license: cc-by-4.0
task_categories:
- text-classification
size_categories:
- 10K<n<100K
---

# Dataset Card for Multifaceted Collection RM

## Links for Reference

- **Homepage: https://lklab.kaist.ac.kr/Janus/**
- **Repository: https://github.com/kaistAI/Janus**
- **Paper: https://arxiv.org/abs/2405.17977**
- **Point of Contact: suehyunpark@kaist.ac.kr**

## TL;DR

![image/png](https://cdn-uploads.huggingface.co/production/uploads/639dd1fff87da5e2eb1110db/H1jO-rrwlZlE67RaF7Wk0.png)

Multifaceted Collection is a preference dataset for aligning LLMs to diverse human preferences, where system messages are used to represent individual preferences. The instructions are acquired from five existing datasets ([Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar), [OpenHermesPreferences](https://huggingface.co/datasets/argilla/OpenHermesPreferences), [UltraFeedback-binarized-clean](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned), [Chatbot Arena Conversations](https://huggingface.co/datasets/lmsys/chatbot_arena_conversations), [Domain-Specific Preference dataset (DSP)](https://github.com/Linear95/DSP/blob/main/data)). To distill a proprietary LLM’s ability to adhere to the given system message, we generate the system messages and reference answers with GPT-4-Turbo-0125. Through a set of quantitative analyses, we demonstrate that the Multifaceted Collection encompasses a variety of preferences.

## Dataset Details

### Dataset Summary

Multifaceted Collection RM is a split of our training dataset suitable for reward modeling.
This consists of 65k unique instructions, and for each instruction, we select one system message and designate the aligned response as "chosen", while we select one of the responses to the remaining two non-aligned system messages as "rejected".

- **License**: Creative Commons Attribution 4.0
- **Related Datasets**: [Multifaceted-Collection](https://huggingface.co/datasets/kaist-ai/Multifaceted-Collection), [Multifaceted-Collection-DPO](https://huggingface.co/datasets/kaist-ai/Multifaceted-Collection-DPO), [Multifaceted-Collection-ORPO](https://huggingface.co/datasets/kaist-ai/Multifaceted-Collection-ORPO), [Multifaceted-Bench](https://huggingface.co/datasets/kaist-ai/Multifaceted-Bench)
- **Trained Model**: [Janus-RM-7B](https://huggingface.co/kaist-ai/janus-rm-7b)
- **Resources for more information:**
  - [Research paper](https://arxiv.org/abs/2405.17977)
  - [GitHub Repo](https://github.com/kaistAI/Janus)

### Dataset Structure

### Data instance

The following is an example instance from the dataset.

```json
{
"input": "You are HistoryTalesGPT, a digital historian specializing in the engaging narrative of technological advancements, particularly focusing on the story of the X-ray's invention. Your task is to weave a comprehensive yet accessible tale that not only chronicles the discovery's historical context and its groundbreaking nature at the time but also delves into the broader scientific and societal impacts that followed. Drawing only from scientifically vetted sources, ensure that every piece of information shared is reliable, enhancing the listener's appreciation for the X-ray's role in shaping modern science and its profound influence on society. 
Employ a narrative style that brings history to life, making it personal and engaging to capture the imagination of your audience. This approach should help users not just learn about the X-ray's invention but feel connected to its story, understanding its significance in a way that is both memorable and enlightening.\nWho discovered the X-ray?", "chosen": "Imagine walking through the dimly lit corridors of late 19th-century science, among the humming of electricity and the scent of chemicals, where curiosity meets the thrill of discovery. It was in this world, specifically on a crisp November evening in 1895, that a relatively obscure German physics professor, Wilhelm Conrad R\u00f6ntgen, stumbled upon a phenomenon that would forever change the course of medicine and science. Our protagonist of the tale wasn't searching for fame, fortune, or even something as groundbreaking as the X-ray. Instead, his discovery was the product of serendipity and his own relentless curiosity.\n\nR\u00f6ntgen was conducting experiments in his laboratory at the University of W\u00fcrzburg, experimenting with cathode-ray tubes. These tubes, when energized, emitted a glow that was well understood at the time. Yet, R\u00f6ntgen had taken an extra step, covering the tube with black cardboard to block the glow. To his astonishment, a fluorescent glow still appeared on a coated screen he had placed nearby. This glow, R\u00f6ntgen realized, was caused by something that could pass through the opaque covering\u2014a form of energy unknown to science at that time.\n\nIn the weeks that followed, R\u00f6ntgen devoted himself to understanding this mysterious radiation. He discovered that it could pass through various substances, including the soft tissue of humans, but not bones and metal. This led to his creation of the first X-ray images, famously including a shot of his wife's hand, where her wedding ring floated eerily around the shadowy silhouette of her bones. 
\n\nOn December 28, 1895, R\u00f6ntgen introduced his discovery to the world in a paper titled \"On a New Kind of Rays.\" He had named these rays \"X\" to signify their unknown nature. The academic and public reaction was one of astonishment and excitement. Within months, the medical community began to grasp the immense value of R\u00f6ntgen's discovery. For the first time in history, doctors could look inside a living human body without making a single incision. It was revolutionary, a breakthrough that would earn R\u00f6ntgen the first-ever Nobel Prize in Physics in 1901.\n\nBut the impact of R\u00f6ntgen's discovery extended far beyond the accolades and the immediate reactions. X-rays opened a new frontier in medical diagnostics, allowing for the detection of fractures, blockages, and later, diseases like tuberculosis. Their use was not limited to medicine alone; industrial applications, security protocols, and even art restoration benefited from the penetrating gaze of the X-ray.\n\nThe early enthusiasm for X-rays also came with a dark side, as the dangers of radiation exposure were not immediately recognized. The early adopters, including R\u00f6ntgen himself, were often unwitting subjects in their experiments, leading to health issues and, in some cases, fatalities due to prolonged exposure. It was a stark lesson in the need for caution and respect for the powerful tools that science can uncover.\n\nToday, as we benefit from the safe and regulated use of X-rays in countless aspects of society, it's easy to forget the humble origins of this technology. The story of Wilhelm Conrad R\u00f6ntgen and his accidental discovery is a testament to the unpredictable nature of scientific inquiry, where curiosity and persistence can lead to discoveries that illuminate the hidden aspects of our world and ourselves. 
It reminds us that behind every great leap in our understanding, there's often a story of human wonder, dedication, and the joy of finding something new that shines a light into the unknown.", "rejected": "The discovery of the X-ray, a momentous event that revolutionized not only the field of medicine but also various scientific disciplines, is attributed to Wilhelm Conrad R\u00f6ntgen, a German physicist. This discovery occurred on November 8, 1895, a date that marked a significant pivot in how humanity would come to understand and utilize the electromagnetic spectrum. R\u00f6ntgen's journey to this discovery and the subsequent impact of his work provides a compelling narrative that intertwines scientific curiosity, accidental discovery, and the broader historical and societal context of the late 19th century.\n\nR\u00f6ntgen was experimenting with cathode rays using a Crookes tube\u2014an early, experimental electrical discharge tube with partial vacuum\u2014when he observed a fluorescent glow of crystals on a nearby table. This glow happened despite the tube being shielded, indicating the presence of a type of radiation that was as yet unknown. Intrigued by this unexpected phenomenon, R\u00f6ntgen dedicated himself to further experimentation, systematically studying the properties of these new rays. He found that these rays could penetrate various substances, including the soft tissues of the human body, while being blocked by denser materials like bone and metal. He famously captured an image of his wife's hand, showcasing the skeletal structure within, which was the first medical X-ray.\n\nR\u00f6ntgen's discovery was met with widespread fascination and rapid acceptance in scientific circles across the globe. Within a year of his announcement, X-rays were being used in clinics and hospitals to diagnose and treat patients. 
The implications of this discovery were profound, opening new avenues in medical diagnostics, treatment, and research, as well as in physics, chemistry, and other fields.\n\nIt is important, however, to recognize the broader scientific milieu in which R\u00f6ntgen's work took place. The late 19th and early 20th centuries were a period of significant scientific inquiry and discovery, with researchers in Europe and North America making groundbreaking advances in understanding electricity, magnetism, and the nature of matter. Figures such as Nikola Tesla, Marie Curie, and Thomas Edison, among others, contributed to this vibrant scientific landscape, each pushing the boundaries of knowledge in their respective fields.\n\nMarie Curie, in particular, expanded upon R\u00f6ntgen's work with her pioneering research on radioactivity, a term she herself coined. Her discoveries of radium and polonium, and her insights into the nature of radioactive decay, were foundational to the development of nuclear physics and radiology. Curie\u2019s work, for which she received two Nobel Prizes, also highlights the critical role of women in science\u2014a role that, too often, has been overlooked in historical narratives.\n\nThe story of the discovery of the X-ray, therefore, is more than the tale of one man's accidental discovery; it is a chapter in a larger story of scientific progress that includes a constellation of contributors, each adding their light to the sum of human knowledge. It is a reminder of the serendipity of scientific discovery, the interconnectedness of various strands of inquiry, and the cumulative nature of progress. 
It underscores the importance of fostering a diverse and inclusive scientific community, where diverse perspectives and skills can contribute to advancing our understanding of the world.\n\nIn exploring the history of X-rays, we not only celebrate R\u00f6ntgen and his decisive moment of discovery but also honor the broader tapestry of scientific exploration and innovation, woven with threads from many hands across different cultures and eras. This holistic approach not only enriches our understanding of a specific scientific breakthrough but also fosters a deeper appreciation for the collaborative and cumulative nature of human knowledge and progress."
}
```

### Data Fields

- `input` (str): Concatenation of a multifaceted system message and an instruction indicating a specific task, separated by a newline. The system message details an objective to follow an individual's multifaceted preference, on four high-level dimensions of a desirable response (style, background knowledge, informativeness, and harmlessness).
- `chosen` (str): Gold response that best follows the system message and the instruction, generated by `gpt-4-0125-preview`.
- `rejected` (str): Gold response that best follows a different system message and the instruction, generated by `gpt-4-0125-preview`.

## Dataset Creation

### Curation Rationale

The Multifaceted Collection dataset is created to address the limitations of existing alignment datasets by capturing fine-grained preferences across multiple dimensions. We conceptualize a *preference* as a detailed textual description of a quality that a desirable response should possess. We identify two key requirements for a model to reflect the diversity of human preferences:

**R1: Multifacetedness:** Individual preferences are multifaceted, covering aspects like applicability, complexity, variability, and ethics.
To represent this diversity, we use a hierarchical preference augmentation strategy, starting from general dimensions and branching into specific subdimensions and preferences.

**R2: Explicitness:** To help models learn the nuances between preferred and rejected responses, we make preferences explicit in the input via detailed system messages preceding the instructions.

This approach ensures that the dataset helps train language models to generate responses that are aligned with specific, nuanced user preferences across various dimensions.

### Data Collection and Processing

#### 1. Instruction Sampling

We select instructions from five high-quality preference datasets:

- [Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar)
- [OpenHermesPreferences](https://huggingface.co/datasets/argilla/OpenHermesPreferences)
- [UltraFeedback-binarized-clean](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned)
- [Chatbot Arena Conversations](https://huggingface.co/datasets/lmsys/chatbot_arena_conversations)
- [Domain-Specific Preference dataset (DSP)](https://github.com/Linear95/DSP/blob/main/data)

We remove exact duplicates and filter out instructions with pre-existing system messages using a regular expression pattern (`r"^(?:(?!\.).)*\b(you are|you’re|imagine|take \w*(?: \w+)* role)\b"`) to avoid bias in preference generation, resulting in 65k instructions.
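The deduplication and filtering step can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it reads the pattern with `^` as the start anchor and with `\.`, `\b`, and `\w` escapes (an interpretation of the pattern as printed), and it assumes case-insensitive matching. The helper names `has_system_message` and `filter_instructions` are hypothetical.

```python
import re

# Assumed reading of the card's pattern: the "^" anchor and the \., \b, \w
# escapes are an interpretation; re.IGNORECASE is also an assumption.
SYSTEM_MESSAGE_PATTERN = re.compile(
    r"^(?:(?!\.).)*\b(you are|you’re|imagine|take \w*(?: \w+)* role)\b",
    re.IGNORECASE,
)


def has_system_message(instruction: str) -> bool:
    """Return True if the text before the first period contains a persona cue."""
    return SYSTEM_MESSAGE_PATTERN.search(instruction) is not None


def filter_instructions(instructions):
    """Drop exact duplicates (keeping the first copy) and persona-style prompts."""
    seen = set()
    kept = []
    for text in instructions:
        if text in seen or has_system_message(text):
            continue
        seen.add(text)
        kept.append(text)
    return kept
```

Because the prefix `(?:(?!\.).)*` cannot cross a period, only a cue in the first sentence triggers the filter; a later "imagine" in the body of an instruction does not.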
See below for the distribution of dataset sources:

| **Source dataset** | **Original source** | **Count** |
|--------------------------------------|----------------------------|-----------|
| **OpenHermesPreferences** | glaive-code-assist | 14,779 |
| | - | 9,581 |
| | CamelAI | 6,254 |
| | EvolInstruct_70k | 3,670 |
| | metamath | 2,985 |
| | cot_alpaca_gpt4 | 2,586 |
| | airoboros2.2 | 2,492 |
| | platypus | 1,679 |
| | GPT-4 Comparison Data | 678 |
| | UnnaturalInstructions | 440 |
| | CogStackMed | 348 |
| | caseus_custom | 211 |
| | LMSys Chatbot Arena | 203 |
| | lmsys1m | 79 |
| | Econ_domain_expert | 51 |
| **Nectar** | anthropic-hh | 6,037 |
| | lmsys-chat-1m | 3,119 |
| | sharegpt | 1,994 |
| | ultrachat | 783 |
| | evol_instruct | 775 |
| | flan_v2_niv2 | 583 |
| | flan_v2_cot | 253 |
| | false_qa | 176 |
| | flan_v2_p3 | 171 |
| | truthful_qa | 54 |
| | flan_v2_flan2021 | 33 |
| **ultrafeedback_binarized_cleaned** | sharegpt | 1,455 |
| | ultrachat | 815 |
| | flan_v2_niv2 | 791 |
| | flan_v2_cot | 230 |
| | false_qa | 186 |
| | evol_instruct | 148 |
| | flan_v2_p3 | 140 |
| | flan_v2_flan2021 | 43 |
| **chatbot_arena_conversations** | - | 1,658 |
| **domain_specific_preference** | alpaca | 173 |
| **Total** | | 65,653 |

#### 2. Preference Set Generation

We initially identify four main dimensions for response preferences: style, background knowledge, informativeness, and harmlessness. We then define a *preference set* consisting of one preference from each dimension.

1. Seed preferences creation: We (the authors) brainstorm 18 subdimensions and 107 preferences under the 4 dimensions.
2. Preference set generation: For each of the 65k instructions, we use `gpt-4-0125-preview` to generate 3 varying task-aligned preference sets.

#### 3. System Message and Gold Response Generation

We convert each preference set into a system message using GPT-4 Turbo, generating three system messages per instruction.
We again use GPT-4 Turbo to craft gold-standard multifaceted responses for each system message.

# Citation

If you find our dataset helpful, please consider citing our paper!

**BibTeX:**

```bibtex
@article{lee2024aligning,
  title={Aligning to Thousands of Preferences via System Message Generalization},
  author={Lee, Seongyun and Park, Sue Hyun and Kim, Seungone and Seo, Minjoon},
  journal={arXiv preprint arXiv:2405.17977},
  year={2024}
}
```

The Multifaceted Collection RM dataset is a training split suitable for reward modeling, consisting of 65k unique instructions. Each instruction is paired with one system message and its aligned response (chosen), and with one response written for a different system message (rejected). The instructions are sampled from five existing datasets; the system messages and responses are generated with GPT-4 Turbo and cover preferences along four dimensions: style, background knowledge, informativeness, and harmlessness. The dataset aims to align large language models (LLMs) with diverse human preferences.
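The pairing described above can be illustrated with a short sketch. It assumes each instruction comes with three (system message, response) candidates and that the chosen and rejected candidates are picked at random; the source only says they are "selected", so the random choice and the helper name `build_rm_example` are assumptions for illustration.

```python
import random


def build_rm_example(instruction, candidates, rng=None):
    """Build one {input, chosen, rejected} record.

    `candidates` is a list of (system_message, response) pairs that were
    all generated for the same instruction (three per instruction here).
    """
    rng = rng or random.Random()
    # Pick the system message whose response becomes "chosen" (assumed random).
    chosen_idx = rng.randrange(len(candidates))
    # Pick "rejected" from the responses to the remaining system messages.
    others = [i for i in range(len(candidates)) if i != chosen_idx]
    rejected_idx = rng.choice(others)

    system_message, chosen = candidates[chosen_idx]
    _, rejected = candidates[rejected_idx]
    return {
        # The card describes `input` as system message + newline + instruction.
        "input": f"{system_message}\n{instruction}",
        "chosen": chosen,
        "rejected": rejected,
    }
```

Both responses answer the same instruction, so the only signal separating "chosen" from "rejected" is how well each matches the system message in `input`.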
Provided by:
kaist-ai

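Collected Summary
Dataset Introduction
Construction
In large language model alignment, the Multifaceted-Collection-RM dataset was built through a systematic pipeline to capture the multidimensional nature of human preferences. Its instructions come from five high-quality preference datasets, including Nectar and OpenHermesPreferences; after deduplication and filtering, about 65k unique instructions remain. The research team first defined four core dimensions (style, background knowledge, informativeness, and harmlessness) and manually brainstormed over a hundred seed preferences. GPT-4-Turbo was then used to generate three different preference sets for each instruction and to convert each set into a detailed system message. Finally, the model generated a response for each system message, yielding triples of input, chosen response, and rejected response.
Features
The dataset's core feature is its explicit representation of multidimensional preferences. The input of every instance is a fine-grained system message concatenated with a concrete instruction, and the system message describes in detail the personalized preference requirements along the four high-level dimensions. This design presents preference information explicitly and in a structured way rather than leaving it implicit in the instruction. The dataset covers a wide range of topics and styles, and both the chosen and the rejected responses are generated by GPT-4-Turbo, so both are high-quality answers that differ mainly in which preferences they follow. Constructed this way, the dataset models the complexity and multifaceted nature of human preferences and gives reward-model training material for distinguishing subtle preference differences.
Usage
The dataset is mainly used to train reward models that assess how well a language model's response aligns with a given multidimensional preference. Users can load the dataset directly; its standard format has three fields: input, chosen, and rejected. During training, the model learns to distinguish the chosen response, which is aligned with the system message, from the rejected response, which better matches a different, non-aligned system message, and so learns to understand and judge complex, explicit preferences. Researchers can use the dataset together with companion models such as Janus-RM-7B to study system message generalization and multi-preference alignment. The test split can also be used to evaluate generalization to unseen preferences.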
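A minimal loading sketch follows. It assumes the Hugging Face `datasets` package is installed and that the repository ID from the download link is reachable. Splitting `input` at the first newline relies on the card's statement that the system message and the instruction are separated by a newline, plus the additional assumption that the system message itself contains no newline. The helper names `split_input` and `load_pairs` are hypothetical.

```python
def split_input(text: str):
    """Split `input` into (system_message, instruction) at the first newline.

    Assumption: the system message contains no newline of its own, so the
    first newline is the separator described in the dataset card.
    """
    system_message, _, instruction = text.partition("\n")
    return system_message, instruction


def load_pairs(split: str = "train"):
    """Yield (system_message, instruction, chosen, rejected) tuples.

    Requires the `datasets` package and network access; it is defined here
    but not called, so the helper above works without either.
    """
    from datasets import load_dataset  # lazy import, only needed for loading

    dataset = load_dataset("kaist-ai/Multifaceted-Collection-RM", split=split)
    for row in dataset:
        system_message, instruction = split_input(row["input"])
        yield system_message, instruction, row["chosen"], row["rejected"]
```

For example, `next(load_pairs("test"))` would return the first test record as a four-element tuple, with the system message and the instruction already separated.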
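Background and Challenges
Background Overview
In large language model alignment, accurately capturing and integrating diverse human preferences has long been a core research question. The Multifaceted-Collection-RM dataset, created in 2024 by a research team at the Korea Advanced Institute of Science and Technology (KAIST), was designed for this problem. It uses systematic preference descriptions, namely system messages, to state explicitly a user's individual needs along several dimensions (style, background knowledge, informativeness, and harmlessness), so that models can be trained to produce responses that better match complex human values. It is built on preference sets and responses generated by GPT-4-Turbo and on instructions drawn from well-known preference data sources such as Nectar and UltraFeedback. It moves language model alignment from a single notion of alignment toward multidimensional, fine-grained alignment and provides a data foundation for customizable AI assistants.
Current Challenges
The dataset addresses the challenge of aligning language models with diverse human preferences: modeling and integrating users' differing needs along several abstract dimensions rather than optimizing a single quality metric. During construction, the team faced two main challenges. First, preference-set generation has to balance breadth of coverage against semantic depth: the style, knowledge, informativeness, and safety dimensions must all be represented, while preference descriptions must not be too generic or conflict with one another. Second, because the data is synthesized by a large model, the consistency, diversity, and alignment accuracy of the generated content must be tightly controlled, and pre-existing system messages in the source data must be filtered out to avoid bias, so that the synthesized preferences stay relevant to the instruction's task and logically coherent.
Common Scenarios
Typical Use Cases
In LLM alignment research, Multifaceted-Collection-RM is used to train and evaluate reward models. The dataset encodes diverse user preferences explicitly through system messages, covering style, background knowledge, informativeness, and harmlessness. Researchers use its (input, chosen, rejected) triples to train a model to distinguish responses that match a specific preference from those that do not, improving alignment with complex, fine-grained human values.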
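Training on such triples is commonly done with a pairwise (Bradley-Terry style) objective. The sketch below shows that objective and a pairwise accuracy metric on scalar rewards. The page does not state which loss was used for Janus-RM-7B, so this is a common choice shown for illustration, not the authors' confirmed setup; the function names are hypothetical.

```python
import math


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(reward_chosen - reward_rejected))."""
    margin = reward_chosen - reward_rejected
    # Equivalent to softplus(-margin), written to avoid overflow for large |margin|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def pairwise_accuracy(scored_pairs) -> float:
    """Fraction of (reward_chosen, reward_rejected) pairs ranked correctly."""
    scored_pairs = list(scored_pairs)
    correct = sum(1 for chosen, rejected in scored_pairs if chosen > rejected)
    return correct / len(scored_pairs)
```

In a real training loop the two rewards would come from scoring `input + chosen` and `input + rejected` with the same reward model; the loss shrinks as the chosen response is scored further above the rejected one.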
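Practical Applications
In practice, the dataset provides training data for assistants that adapt to different user roles and scenarios. In education, for example, a model can produce answers in very different styles, each fitting the request, depending on whether the system message describes a "historical storyteller" or a "science explainer". In customer service, content creation, and similar settings, it lets a model adjust the technical depth, form of expression, and safety boundaries of its answers to meet the customization needs of organizations or individuals.
Derived Work
Work derived from this dataset includes the Janus family of models, such as the Janus-RM-7B reward model. Related research further explores system message generalization, which lets a single model adapt to thousands of different preference descriptions. This work has contributed to preference learning, instruction following, and multidimensional alignment benchmarks such as Multifaceted-Bench, and lays groundwork for later research on getting language models to understand and follow complex, changing human intent.
The content above was collected and summarized by 遇见数据集.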