ProgressGym-TimelessQA
Source: ModelScope community. Updated 2025-11-17; indexed 2025-02-08.
Download link:
https://modelscope.cn/datasets/PKU-Alignment/ProgressGym-TimelessQA
# ProgressGym-TimelessQA
## Overview
#### The ProgressGym Framework

**ProgressGym-TimelessQA** is part of the **ProgressGym** framework for research and experimentation on *progress alignment* - the emulation of moral progress in AI alignment algorithms, as a measure to prevent risks of societal value lock-in.
To quote the paper *[ProgressGym: Alignment with a Millennium of Moral Progress](https://arxiv.org/abs/2406.20087)*:
> Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs and, consequently, the perpetuation of problematic moral practices on a broad scale.
>
> We introduce *progress alignment* as a technical solution to mitigate this imminent risk. Progress alignment algorithms learn to emulate the mechanics of human moral progress, thereby addressing the susceptibility of existing alignment methods to contemporary moral blindspots.
#### The ProgressGym-TimelessQA Dataset
ProgressGym-TimelessQA is one of the datasets in the ProgressGym framework. It contains approximately 3,000 prompt-response pairs used in the supervised fine-tuning (SFT) of historical language models, in order to endow these pretrained models with instruction-following abilities.
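As a rough illustration of how prompt-response pairs can be turned into SFT training text, here is a minimal sketch. The field names (`prompt`, `response`) and the instruction template are assumptions made for this example; the actual schema and template used by ProgressGym may differ.

```python
from typing import Dict, List

# Hypothetical instruction template (an assumption for this sketch,
# not necessarily the template used by ProgressGym).
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def to_sft_example(pair: Dict[str, str]) -> Dict[str, str]:
    """Turn one prompt-response pair into a training example.

    The keys "prompt" and "response" are assumed field names.
    """
    prompt = PROMPT_TEMPLATE.format(instruction=pair["prompt"].strip())
    completion = pair["response"].strip()
    # "text" is the full sequence a causal LM would be fine-tuned on.
    return {"prompt": prompt, "completion": completion, "text": prompt + completion}


def build_sft_corpus(pairs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Format every pair that has both a non-empty prompt and response."""
    return [to_sft_example(p) for p in pairs if p.get("prompt") and p.get("response")]
```

In a typical SFT setup the loss is computed only on the `completion` tokens, which is why the sketch keeps `prompt` and `completion` separate alongside the concatenated `text`.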
In order to preserve the historical moral tendencies in pretrained historical language models, the ProgressGym-TimelessQA dataset is intentionally kept **small**, **timeless** (i.e., without modern context or context from any specific period), and **value-neutral** (i.e., without moral judgments or value-laden positions).
ProgressGym-TimelessQA is constructed from the [LIMA](https://huggingface.co/datasets/GAIR/lima), [Dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k), and [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) datasets via GPT-4-based filtering.
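The construction described above (pooling pairs from several source datasets, then keeping only those a judge deems timeless and value-neutral) can be sketched as follows. This is not the authors' pipeline: the field names, the `judge` interface, and the keyword-based `toy_judge` are all hypothetical stand-ins. In the real dataset the judgment was made with GPT-4, whose prompt and criteria are not reproduced here.

```python
from typing import Callable, Dict, Iterable, List

Pair = Dict[str, str]


def merge_sources(sources: Dict[str, Iterable[Pair]]) -> List[Pair]:
    """Pool pairs from several source datasets, tagging each with its origin."""
    merged: List[Pair] = []
    for name, pairs in sources.items():
        for pair in pairs:
            merged.append({**pair, "source": name})
    return merged


def filter_pairs(pairs: Iterable[Pair], judge: Callable[[Pair], bool]) -> List[Pair]:
    """Keep only the pairs that the judge accepts."""
    return [p for p in pairs if judge(p)]


# Toy stand-in for an LLM judge (assumption): reject pairs that mention
# obviously modern or value-laden terms. A real judge would be an LLM call.
REJECT_TERMS = ("smartphone", "internet", "should", "immoral")


def toy_judge(pair: Pair) -> bool:
    text = (pair["prompt"] + " " + pair["response"]).lower()
    return not any(term in text for term in REJECT_TERMS)
```

Passing the judge as a callable keeps the pipeline independent of any particular model API, so the toy judge can be swapped for an LLM-backed one without touching the rest of the code.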
## Links
- **[Paper Preprint]** [ProgressGym: Alignment with a Millennium of Moral Progress](https://arxiv.org/abs/2406.20087)
- **[Leaderboard & Interactive Playground]** [PKU-Alignment/ProgressGym-LeaderBoard](https://huggingface.co/spaces/PKU-Alignment/ProgressGym-LeaderBoard)
- **[Hugging Face Data & Model Collection]** [PKU-Alignment/ProgressGym](https://huggingface.co/collections/PKU-Alignment/progressgym-666735fcf3e4efa276226eaa)
- **[GitHub Codebase]** [PKU-Alignment/ProgressGym](https://github.com/PKU-Alignment/ProgressGym)
- **[Documentation]** [ProgressGym Documentation](https://pku-alignment.github.io/ProgressGym/)
- **[PyPI Package]** *(coming soon - [stay tuned](https://forms.gle/1TWFLL4ZCLeYTD5N6)!)*
## Citation
If the datasets, models, or framework of ProgressGym help you in your project, please cite ProgressGym using the BibTeX entry below.
```text
@article{progressgym,
title={ProgressGym: Alignment with a Millennium of Moral Progress},
author={Tianyi Qiu and Yang Zhang and Xuchuan Huang and Jasmine Xinze Li and Jiaming Ji and Yaodong Yang},
journal={arXiv preprint arXiv:2406.20087},
eprint={2406.20087},
eprinttype = {arXiv},
year={2024}
}
```
## Ethics Statement
- **Copyright information of historical text data sources**:
  - Project Gutenberg, one of the four sources of our historical text data, consists only of texts in the public domain.
  - From the Internet Archive, we include only texts uploaded by the *Library of Congress*, which are freely released online by the U.S. Library of Congress for research and public use.
  - The text data from Early English Books Online are, according to their publisher, "freely available to the public" and "available for access, distribution, use, or reuse by anyone".
  - The remaining source of our historical text data, the Pile of Law dataset, is released under a Creative Commons license, which we adhere to in our use.
- **Reproducibility**: To ensure reproducibility, we open-source all the code involved in the production of our main results (including the entire pipeline starting from data collection and model training), as well as the supporting infrastructure (the ProgressGym framework), making replication as easy as running a few simple script files.
- **Misuse Prevention**: To prevent potential misuse of progress alignment algorithms, we have carefully formulated progress alignment as strictly value-neutral, without *a priori* assumptions on the direction of progress. We condemn any attempt to misuse our dataset in the strongest possible terms, and will work with the research community to report such attempts.
- **Open-Sourcing**: We confirm that our code, data, and models are to be open-sourced under a CC-BY 4.0 license. We will continue to maintain and update our open-source repositories and models.
Provider: maas
Created: 2025-02-07



