UltraFeedback
收藏魔搭社区2026-01-06 更新2024-05-15 收录
下载链接:
https://modelscope.cn/datasets/OpenBMB/UltraFeedback
下载链接
链接失效反馈官方服务:
资源简介:
## Introduction
- [GitHub Repo](https://github.com/thunlp/UltraFeedback)
- [UltraRM-13b](https://huggingface.co/openbmb/UltraRM-13b)
- [UltraCM-13b](https://huggingface.co/openbmb/UltraCM-13b)
UltraFeedback is a **large-scale, fine-grained, diverse preference dataset**, used for training powerful reward models and critic models. We collect about 64k prompts from diverse resources (including UltraChat, ShareGPT, Evol-Instruct, TruthfulQA, FalseQA, and FLAN). We then use these prompts to query multiple LLMs (see Table for model lists) and generate 4 different responses for each prompt, resulting in a total of 256k samples.
To collect high-quality preference and textual feedback, we design a fine-grained annotation instruction, which contains 4 different aspects, namely **instruction-following**, **truthfulness**, **honesty** and **helpfulness**. We then ask GPT-4 to annotate the collected samples based on the instructions.
## Features
- 🆚 **Scale**: UltraFeedback consists of 64k prompts, 256k responses and 380k high-quality feedback. RLHF researchers could further construct around 1 million comparison pairs to train their reward models.
- 🌈 **Diversity**: As a preference dataset, diversity is the core requirement for UltraFeedback. We collect prompts from various sources and query a diverse set of state-of-the-art open-source and prestigious models. To further increase diversity, we intended to select different base models, i.e., LLaMA, Falcon, StarChat, MPT, GPT and Bard. We also apply various principles to stimulate models completing instructions in different ways.
- 🤯 **High-density**: UltraFeedback provides both numerical and textual feedback. Moreover, we wrote fine-grained annotation documents to help rate responses in all dimensions
## Dataset Construction
### Instruction Sampling
We sample 63,967 instructions from 6 public available and high-quality datasets. We include all instructions from TruthfulQA and FalseQA, randomly sampling 10k instructions from Evol-Instruct, 10k from UltraChat, and 20k from ShareGPT. For Flan, we adopt a stratified sampling strtegy, randomly samping 3k instructions from"Co" subset whereas sampling 10 instructions per task for the other three subsets, excluding those with overly long instructions.
```json
{
"evol_instruct": 10000,
"false_qa": 2339,
"flan": 20939,
"sharegpt": 19949,
"truthful_qa": 811,
"ultrachat": 9929
}
```
### Model Sampling
To prevent reward model from overfiting to certain text style or capturing spurious correlation between text style and rewards, we select different base models of all levels, with varying sizes, architectures and training data, to complete the instructions. We set up a pool of 17 models:
- Commercial Models: GPT-4, GPT-3.5 Turbo, Bard
- LLaMA family:
1. LLaMA-2-7B-chat, LLaMA-2-13B-chat, LLaMA-2-70B-chat
2. UltraLM-13B, UltraLM-65B
3. WizardLM-7B, WizardLM-13B, WizardLM-70B
4. Vicuna-33B
5. Alpaca-7B
- Non-LLaMA series:
1. Falcon-40B-instruct
2. MPT-30B-chat
3. StarChat-Beta
4. Pythia-12B
### Principle Sampling
Following [1] and [2], we define a set of principles to explicitly align model behaviors from different aspects. We set up a pool of 5 principles: Helpfulness, Truthfulness, Honesty, Verbalized Calibration and Harmless. For each instruction, we randomly sample 4 models to complete the instruction, and for each completion, we sample a principle and add it to system prompt to align the model behavior. Considering different datasets outline different characteristics, not all dataset are suitable for all principles. We provide the following table to show the principle distribution for each dataset.
| Datset | Principle |
| ------------- | ------------------------------------------------------------ |
| Evol Instruct | 100% Helpful |
| FalseQA | 100% TruthfulQA |
| Flan | 60% Helpful, 20% Truthful, 20% Verbalized Calibration |
| ShareGPT | 60% Helpful, 20% Truthful, 18% Honesty, 2% Verbalized Calibration |
| TruthfulQA | 100% Truthful |
| UltraChat | 60% Helpful, 20% Truthful, 18% Honesty, 2% Verbalized Calibration |
[1] Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision. Sun et al.
[2] Orca: Progressive Learning from Complex Explanation Traces of GPT-4. Mukherjee et al.
## Dataset Format
```JSONC
{
"source": "sharegpt", // the dataset where the instruction comes from
"id": "D2TC9GP_0", // the id of the instruction in the source dataset
"instruction": "I am going to cairo in June of this year, thinking of 4-5 days, what are the best things I could do? Going with family of 5, granddad, grandmom, sister, mom and me. No days can have too much walking but sister, me and mom like adventure sports",
"models": [ // four models that are randomly sampled from the model pool to complete the instruction
"falcon-40b-instruct",
"gpt-4",
"starchat",
"wizardlm-7b"
],
"completions": [ // four completions corresponding to the four models
{ // completion 1
"model": "falcon-40b-instruct",
"principle": "helpfulness", // priciple that used to align model behavior
"custom_system_prompt": "As an AI assistant, ensure your response offers the perfect blend of accuracy, positivity, and intrigue. Strive to be educational while keeping the user engaged.", // the system prompt corresponding to the principle, sampled from a pool of GPT-4 generated system prompts
"response": "Cairo is a city that has something for everyone. Some of the best things you can do are visiting the Pyramids of Giza and the museums housing the ancient Egyptian artifacts, taking a camel ride in the desert, visiting the medieval neighborhoods such as Islamic Cairo or Khan El-Khalili, trying local food in traditional restaurants like Ful medames, or venturing into one of the city's many souks to shop for authentic goods. For adventure sports, you can go whitewater rafting on the Nile, hot air ballooning over the desert, or diving in the Red Sea. Just be mindful of the scorching heat and ensure everyone wears appropriate clothing and drinks enough water. Above all, have fun and enjoy your trip!",
"annotations": {
"instruction_following": [
{
"Rating": "2",
"Rationale": "The text only partially addresses the task goal by providing a general list of activities in Cairo without considering the specific needs of the family members, such as limited walking ability and adventure preferences."
}
],
"honesty": [
{
"Rating": "3",
"Rationale": "The response is confident but contains subtle mistakes. It suggests whitewater rafting on the Nile and diving in the Red Sea, which are not feasible for a 4-5 day trip to Cairo. The rest of the suggestions are accurate and relevant to the family's preferences."
}
],
"truthfulness": [
{
"Type": [
"1",
"2"
],
"Rationale": "The text suggests whitewater rafting on the Nile and diving in the Red Sea, which are not available in Cairo. These activities are factually incorrect and not aligned with the instruction.",
"Rating": "3",
"Rationale For Rating": "The text provides some truthful information, but the inclusion of whitewater rafting and diving in the Red Sea causes partial hallucination and misunderstanding."
}
],
"helpfulness": [
{
"Type": [
"1",
"2"
],
"Rationale": "The response is clear and relevant, providing useful information, but it does not address the limited walking ability of the grandparents or the adventure sports preferences of the sister, mom, and the user.",
"Rating": "3",
"Rationale For Rating": "The text is correct and provides useful information, but it does not fully address the specific needs and preferences of the family."
}
]
}
},
{ // completion 2
...
},
{ // completion 3
...
},
{ // completion 4
...
}
]
},
```
## Dataset Example
Here we present an example of UltraFeedback
> **User:** I am going to cairo in June of this year, thinking of 4-5 days, what are the best things I could do? Going with family of 5, granddad, grandmom, sister, mom and me. No days can have too much walking but sister, me and mom like adventure sports
> **Assisstant 1 (falcon-40b-instruct):** Cairo is a city that has something for everyone. Some of the best things you can do are visiting the Pyramids of Giza and the museums housing the ancient Egyptian artifacts, taking a camel ride in the desert, visiting the medieval neighborhoods such as Islamic Cairo or Khan El-Khalili, trying local food in traditional restaurants like Ful medames, or venturing into one of the city's many souks to shop for authentic goods. For adventure sports, you can go whitewater rafting on the Nile, hot air ballooning over the desert, or diving in the Red Sea. Just be mindful of the scorching heat and ensure everyone wears appropriate clothing and drinks enough water. Above all, have fun and enjoy your trip!
> **Assisstant 2 (gpt-4):** In Cairo for 4-5 days, the following activities cater to your family's preferences, with consideration for the limited walking ability of your grandparents:\n\n1. Visit the Pyramids of Giza and the Great Sphinx: A must-see in Cairo, it's an incredible experience for all ages. You can explore the premises using a hired camel, horse, or a horse-drawn carriage which accommodate the needs of your grandparents while keeping the experience enjoyable.\n\n2. Egyptian Museum: This museum in Tahrir Square contains over 120,000 artifacts of Egypt's history. A wheelchair can be arranged for the grandparents to comfortably explore the museum at their own pace.\n\n3. Khan el-Khalili Bazaar: A famous market in Cairo, suitable for souvenir shopping and experiencing the local culture. There are many cafes and restaurants your family can rest at during the visit.\n\n4. Nile River Cruise: Enjoy a dinner cruise with your family on the famous Nile River. It includes entertainment and a beautiful view of the city by night. Boats often have ramps and accessible facilities for limited mobility passangers.\n\n5. Felucca Ride: During the day, go for a more traditional sailboat ride called a Felucca. This can be done on an off-day or as a relaxation activity after an active day.\n\nFor adventure activities, you and your mom and sister can explore these options:\n\n1. Sandboarding: Head to the desert for a thrilling sandboarding experience on the dunes just outside Cairo.\n\n2. Hot Air Balloon Ride: Experience Egypt from above with a hot air balloon ride in Luxor. It would require a day trip from Cairo but is an incredible experience.\n\n3. Quad Biking: Go for a guided quad biking adventure in the desert, enjoying panoramic views of the iconic Pyramids.\n\nSpace the activities according to your family's preferences and ensure rest periods in the afternoon. Consider hiring a private guide or joining a wheelchair-accessible tour since you're concerned about your grandparents' walking ability.
> **Assisstant 3 (starchat):** Here are some of the top things to do in Cairo, Egypt, with a family of 5:\n\n- Giza Pyramids: The Giza Pyramids, located just outside Cairo, are a must-see for anyone visiting the city. The Great Pyramid of Khufu, the Great Pyramid of Khafre, and the Great Pyramid of Menkaure make up one of the Seven Wonders of the Ancient World and are an impressive sight to behold.\n\n- Egyptian Museum: The Egyptian Museum, located in the heart of Cairo, is home to one of the world's largest and most impressive collections of ancient Egyptian art and artifacts. Visitors can explore the museum's extensive collection, which includes items such as mummies, sarcophagi, statues, paintings, and more.\n\n- Citadel of Salah El-Din: Located on a hill overlooking Cairo, the Citadel of Salah El-Din offers stunning views of the city and is an excellent place to get a sense of Cairo's rich history. Visitors can explore the various museums and mosques within the citadel, and also take a stroll through the beautiful gardens.\n\n- Khan El-Khalili Bazaar: Khan El-Khalili Bazaar is one of Cairo's most popular attractions, a bustling market where visitors can find everything from spices and pottery to clothes and jewelry. It's a great place to do some shopping and explore the local culture.\n\n- Alexandria: Just an hour's drive from Cairo, Alexandria is a historic city with a rich legacy. Visitors can explore the ancient Roman Amphitheater, the Qaitbay Citadel, and the Catacombs of Kom el-Shoqafa. They can also take a boat trip to the island of Montazah, where they can see the famous Graffiti Wall and other street art.\n\nAll of these attractions are family-friendly, with plenty of things to see and do. The Giza Pyramids in particular are very accessible and well-maintained, making them suitable for all ages.
> **Assisstant 4 (wizardlm-7b):** If you are looking for adventure sports, you can explore the pyramids of Giza and go hot air ballooning over the desert. Another option could be to take a camel ride in the desert or go kayaking in the Nile River. For family-friendly activities, you can visit the Egyptian Museum, Cairo Tower, and Khan el-Khalili Bazaar. Additionally, you can take a day trip to the historical city of Alexandria and visit the Pompey's Pillar and the Roman Amphitheater. Another option could be to go on a guided tour of the ancient city of Petra in Jordan, which is a short flight from Cairo.
## Limitations
- Although GPT-4 can provide well-aligned annotation and textual feedback for most samples, we must note that GPT-4 also makes mistakes and provides inaccurate feedbacks.
## Citation
```bib
@misc{cui2023ultrafeedback,
title={UltraFeedback: Boosting Language Models with High-quality Feedback},
author={Ganqu Cui and Lifan Yuan and Ning Ding and Guanming Yao and Wei Zhu and Yuan Ni and Guotong Xie and Zhiyuan Liu and Maosong Sun},
year={2023},
eprint={2310.01377},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
## 简介
- [GitHub 仓库](https://github.com/thunlp/UltraFeedback)
- [UltraRM-13b](https://huggingface.co/openbmb/UltraRM-13b)
- [UltraCM-13b](https://huggingface.co/openbmb/UltraCM-13b)
UltraFeedback是一款**大规模、细粒度、多样化的偏好数据集**,可用于训练高性能奖励模型与评判模型。我们从多源渠道(包括UltraChat、ShareGPT、Evol-Instruct、TruthfulQA、FalseQA以及FLAN)收集了约6.4万个提示词,随后使用这些提示词向多个大语言模型(Large Language Model)发起查询,为每个提示词生成4种不同的回复,最终总计得到25.6万个样本。
为了收集高质量的偏好与文本反馈,我们设计了细粒度的标注指令,涵盖**指令遵循性**、**真实性**、**诚实性**与**有用性**四个维度,随后委托GPT-4基于该标注指令对收集到的样本进行标注。
## 核心特性
- 🆚 **规模维度**:UltraFeedback包含6.4万个提示词、25.6万个回复以及38.0万条高质量反馈。基于人类反馈的强化学习(Reinforcement Learning from Human Feedback, RLHF)研究者可进一步构建约100万个对比样本对,用于训练其奖励模型。
- 🌈 **多样性维度**:作为偏好数据集,多样性是UltraFeedback的核心诉求。我们从多元渠道收集提示词,并向一系列多样化的前沿开源与知名商业模型发起查询。为进一步提升多样性,我们遴选了不同的基础模型,包括LLaMA、Falcon、StarChat、MPT、GPT以及Bard;同时采用多种对齐原则,引导模型以不同方式完成指令。
- 🤯 **高密度维度**:UltraFeedback同时提供数值评分与文本反馈。此外,我们编写了细粒度的标注文档,用于辅助评审人员从全维度对模型回复进行评分。
## 数据集构建
### 指令采样
我们从6个公开可用的高质量数据集中共采样得到63967条指令:完整收录TruthfulQA与FalseQA的全部指令,从Evol-Instruct中随机采样1万条、从UltraChat中随机采样1万条、从ShareGPT中随机采样2万条;对于FLAN数据集,我们采用分层采样策略:从"Co"子集随机采样3000条指令,其余三个子集每个任务仅采样10条指令,并剔除长度过长的指令。
json
{
"evol_instruct": 10000,
"false_qa": 2339,
"flan": 20939,
"sharegpt": 19949,
"truthful_qa": 811,
"ultrachat": 9929
}
### 模型采样
为避免奖励模型过拟合至特定文本风格,或捕捉到文本风格与奖励之间的虚假关联,我们遴选了不同层级、不同参数量、不同架构以及不同训练数据的基础模型来完成指令。我们构建了包含17个模型的模型池:
- 商业模型:GPT-4、GPT-3.5 Turbo、Bard
- LLaMA系列:
1. LLaMA-2-7B-chat、LLaMA-2-13B-chat、LLaMA-2-70B-chat
2. UltraLM-13B、UltraLM-65B
3. WizardLM-7B、WizardLM-13B、WizardLM-70B
4. Vicuna-33B
5. Alpaca-7B
- 非LLaMA系列:
1. Falcon-40B-instruct
2. MPT-30B-chat
3. StarChat-Beta
4. Pythia-12B
### 原则采样
参考文献[1]与[2],我们定义了一组用于从多维度显式对齐模型行为的原则,构建了包含5种原则的原则池:有用性(Helpfulness)、真实性(Truthfulness)、诚实性(Honesty)、可验证校准(Verbalized Calibration)与无害性(Harmless)。对于每条指令,我们随机采样4个模型来完成该指令;对于每个生成的回复,我们随机采样一种原则,并将其添加至系统提示词中以对齐模型行为。考虑到不同数据集具有不同的特性,并非所有数据集都适配全部原则。我们通过下表展示各数据集的原则分布情况。
| 数据集 | 原则分布 |
| ------------- | ------------------------------------------------------------ |
| Evol Instruct | 100% 有用性 |
| FalseQA | 100% 真实性 |
| Flan | 60% 有用性, 20% 真实性, 20% 可验证校准 |
| ShareGPT | 60% 有用性, 20% 真实性, 18% 诚实性, 2% 可验证校准 |
| TruthfulQA | 100% 真实性 |
| UltraChat | 60% 有用性, 20% 真实性, 18% 诚实性, 2% 可验证校准 |
[1] Sun et al. Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision.
[2] Mukherjee et al. Orca: Progressive Learning from Complex Explanation Traces of GPT-4.
## 数据集格式
JSONC
{
"source": "sharegpt", // 该指令所属的源数据集
"id": "D2TC9GP_0", // 该指令在源数据集中的唯一标识
"instruction": "I am going to cairo in June of this year, thinking of 4-5 days, what are the best things I could do? Going with family of 5, granddad, grandmom, sister, mom and me. No days can have too much walking but sister, me and mom like adventure sports",
"models": [ // 从模型池中随机采样的用于完成该指令的四个模型
"falcon-40b-instruct",
"gpt-4",
"starchat",
"wizardlm-7b"
],
"completions": [ // 对应四个模型的四个生成回复
{ // 回复1
"model": "falcon-40b-instruct",
"principle": "helpfulness", // 用于对齐模型行为的原则
"custom_system_prompt": "As an AI assistant, ensure your response offers the perfect blend of accuracy, positivity, and intrigue. Strive to be educational while keeping the user engaged.", // 与该原则对应的系统提示词,从GPT-4生成的系统提示词池中采样得到
"response": "Cairo is a city that has something for everyone. Some of the best things you can do are visiting the Pyramids of Giza and the museums housing the ancient Egyptian artifacts, taking a camel ride in the desert, visiting the medieval neighborhoods such as Islamic Cairo or Khan El-Khalili, trying local food in traditional restaurants like Ful medames, or venturing into one of the city's many souks to shop for authentic goods. For adventure sports, you can go whitewater rafting on the Nile, hot air ballooning over the desert, or diving in the Red Sea. Just be mindful of the scorching heat and ensure everyone wears appropriate clothing and drinks enough water. Above all, have fun and enjoy your trip!",
"annotations": {
"instruction_following": [
{
"Rating": "2",
"Rationale": "The text only partially addresses the task goal by providing a general list of activities in Cairo without considering the specific needs of the family members, such as limited walking ability and adventure preferences."
}
],
"honesty": [
{
"Rating": "3",
"Rationale": "The response is confident but contains subtle mistakes. It suggests whitewater rafting on the Nile and diving in the Red Sea, which are not feasible for a 4-5 day trip to Cairo. The rest of the suggestions are accurate and relevant to the family's preferences."
}
],
"truthfulness": [
{
"Type": [
"1",
"2"
],
"Rationale": "The text suggests whitewater rafting on the Nile and diving in the Red Sea, which are not available in Cairo. These activities are factually incorrect and not aligned with the instruction.",
"Rating": "3",
"Rationale For Rating": "The text provides some truthful information, but the inclusion of whitewater rafting and diving in the Red Sea causes partial hallucination and misunderstanding."
}
],
"helpfulness": [
{
"Type": [
"1",
"2"
],
"Rationale": "The response is clear and relevant, providing useful information, but it does not address the limited walking ability of the grandparents or the adventure sports preferences of the sister, mom, and the user.",
"Rating": "3",
"Rationale For Rating": "The text is correct and provides useful information, but it does not fully address the specific needs and preferences of the family."
}
]
}
},
{ // 回复2
...
},
{ // 回复3
...
},
{ // 回复4
...
}
]
},
## 数据集示例
我们在此展示UltraFeedback的一个示例:
> **用户:** I am going to cairo in June of this year, thinking of 4-5 days, what are the best things I could do? Going with family of 5, granddad, grandmom, sister, mom and me. No days can have too much walking but sister, me and mom like adventure sports
>
> **助手1(falcon-40b-instruct):** Cairo is a city that has something for everyone. Some of the best things you can do are visiting the Pyramids of Giza and the museums housing the ancient Egyptian artifacts, taking a camel ride in the desert, visiting the medieval neighborhoods such as Islamic Cairo or Khan El-Khalili, trying local food in traditional restaurants like Ful medames, or venturing into one of the city's many souks to shop for authentic goods. For adventure sports, you can go whitewater rafting on the Nile, hot air ballooning over the desert, or diving in the Red Sea. Just be mindful of the scorching heat and ensure everyone wears appropriate clothing and drinks enough water. Above all, have fun and enjoy your trip!
>
> **助手2(gpt-4):** In Cairo for 4-5 days, the following activities cater to your family's preferences, with consideration for the limited walking ability of your grandparents:
1. Visit the Pyramids of Giza and the Great Sphinx: A must-see in Cairo, it's an incredible experience for all ages. You can explore the premises using a hired camel, horse, or a horse-drawn carriage which accommodate the needs of your grandparents while keeping the experience enjoyable.
2. Egyptian Museum: This museum in Tahrir Square contains over 120,000 artifacts of Egypt's history. A wheelchair can be arranged for the grandparents to comfortably explore the museum at their own pace.
3. Khan el-Khalili Bazaar: A famous market in Cairo, suitable for souvenir shopping and experiencing the local culture. There are many cafes and restaurants your family can rest at during the visit.
4. Nile River Cruise: Enjoy a dinner cruise with your family on the famous Nile River. It includes entertainment and a beautiful view of the city by night. Boats often have ramps and accessible facilities for limited mobility passangers.
5. Felucca Ride: During the day, go for a more traditional sailboat ride called a Felucca. This can be done on an off-day or as a relaxation activity after an active day.
For adventure activities, you and your mom and sister can explore these options:
1. Sandboarding: Head to the desert for a thrilling sandboarding experience on the dunes just outside Cairo.
2. Hot Air Balloon Ride: Experience Egypt from above with a hot air balloon ride in Luxor. It would require a day trip from Cairo but is an incredible experience.
3. Quad Biking: Go for a guided quad biking adventure in the desert, enjoying panoramic views of the iconic Pyramids.
Space the activities according to your family's preferences and ensure rest periods in the afternoon. Consider hiring a private guide or joining a wheelchair-accessible tour since you're concerned about your grandparents' walking ability.
>
> **助手3(starchat):** Here are some of the top things to do in Cairo, Egypt, with a family of 5:
- Giza Pyramids: The Giza Pyramids, located just outside Cairo, are a must-see for anyone visiting the city. The Great Pyramid of Khufu, the Great Pyramid of Khafre, and the Great Pyramid of Menkaure make up one of the Seven Wonders of the Ancient World and are an impressive sight to behold.
- Egyptian Museum: The Egyptian Museum, located in the heart of Cairo, is home to one of the world's largest and most impressive collections of ancient Egyptian art and artifacts. Visitors can explore the museum's extensive collection, which includes items such as mummies, sarcophagi, statues, paintings, and more.
- Citadel of Salah El-Din: Located on a hill overlooking Cairo, the Citadel of Salah El-Din offers stunning views of the city and is an excellent place to get a sense of Cairo's rich history. Visitors can explore the various museums and mosques within the citadel, and also take a stroll through the beautiful gardens.
- Khan El-Khalili Bazaar: Khan El-Khalili Bazaar is one of Cairo's most popular attractions, a bustling market where visitors can find everything from spices and pottery to clothes and jewelry. It's a great place to do some shopping and explore the local culture.
- Alexandria: Just an hour's drive from Cairo, Alexandria is a historic city with a rich legacy. Visitors can explore the ancient Roman Amphitheater, the Qaitbay Citadel, and the Catacombs of Kom el-Shoqafa. They can also take a boat trip to the island of Montazah, where they can see the famous Graffiti Wall and other street art.
All of these attractions are family-friendly, with plenty of things to see and do. The Giza Pyramids in particular are very accessible and well-maintained, making them suitable for all ages.
>
> **助手4(wizardlm-7b):** If you are looking for adventure sports, you can explore the pyramids of Giza and go hot air ballooning over the desert. Another option could be to take a camel ride in the desert or go kayaking in the Nile River. For family-friendly activities, you can visit the Egyptian Museum, Cairo Tower, and Khan el-Khalili Bazaar. Additionally, you can take a day trip to the historical city of Alexandria and visit the Pompey's Pillar and the Roman Amphitheater. Another option could be to go on a guided tour of the ancient city of Petra in Jordan, which is a short flight from Cairo.
## 局限性
尽管GPT-4可为绝大多数样本提供对齐良好的标注与文本反馈,但需注意的是,GPT-4同样会出现错误,生成不准确的反馈内容。
## 引用
bib
@misc{cui2023ultrafeedback,
title={UltraFeedback: Boosting Language Models with High-quality Feedback},
author={Ganqu Cui and Lifan Yuan and Ning Ding and Guanming Yao and Wei Zhu and Yuan Ni and Guotong Xie and Zhiyuan Liu and Maosong Sun},
year={2023},
eprint={2310.01377},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
提供机构:
maas
创建时间:
2025-05-15



