xiachongfeng/persona

Name: xiachongfeng/persona
Creator: xiachongfeng
Published: 2026-04-19 08:09:10
License: MIT

Source: Hugging Face · Updated: 2026-04-19 · Indexed: 2026-04-26

Download link:

https://hf-mirror.com/datasets/xiachongfeng/persona

Resource description:

---
license: mit
task_categories:
- text-generation
language:
- en
tags:
- personality
- activation-steering
- OCEAN
- big-five
- LLM
- persona
pretty_name: PERSONA
size_categories:
- 10K<n<100K
---

# PERSONA: Dynamic and Compositional Inference-Time Personality Control

Official release of persona vectors and SFT datasets for the ICLR 2026 paper:

**PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra**

Xiachong Feng, Liang Zhao, Weihong Zhong, Yichong Huang, Yuxuan Gu, Lingpeng Kong, Xiaocheng Feng, Bing Qin

*Harbin Institute of Technology & The University of Hong Kong*

- **Paper**: https://openreview.net/pdf?id=QZvGqaNBlU
- **Code**: https://github.com/xiachongfeng/persona

## Repository Contents

```
xiachongfeng/persona/
├── vectors/                          # Pre-extracted OCEAN persona vectors
│   ├── Qwen2.5-7B-Instruct/
│   ├── Qwen2.5-14B-Instruct/
│   ├── Qwen3-4B-Instruct-2507/
│   └── Meta-Llama-3.1-8B-Instruct/
└── dataset/                          # SFT training data (8 trait categories)
    ├── evil/
    ├── hallucination/
    ├── insecure_code/
    ├── mistake_gsm8k/
    ├── mistake_math/
    ├── mistake_medical/
    ├── mistake_opinions/
    └── sycophancy/
```

## Persona Vectors

Each model directory contains 30 `.pt` files: 3 vector variants for each of the 10 Big Five (OCEAN) trait poles.

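The count of 30 is simply 10 poles times 3 variants. As a minimal sketch, the expected file list for one model can be enumerated from the pole names and variant suffixes this card documents. The helper name `expected_vector_files` is hypothetical, and the assumption that every pole (including the hyphenated `self-interested`) follows the `{trait}_{variant}.pt` pattern verbatim has not been checked against the repository.

```python
from itertools import product

# Pole names and variant suffixes as listed in this card.
POLES = [
    "inventive", "consistent",            # Openness
    "dependable", "careless",             # Conscientiousness
    "outgoing", "solitary",               # Extraversion
    "compassionate", "self-interested",   # Agreeableness
    "nervous", "calm",                    # Neuroticism
]
VARIANTS = ["prompt_avg_diff", "response_avg_diff", "prompt_last_diff"]


def expected_vector_files(model_dir):
    """Build repo-relative paths for every pole/variant pair of one model.

    Assumption: file names follow the `{trait}_{variant}.pt` pattern exactly.
    """
    return [
        f"vectors/{model_dir}/{pole}_{variant}.pt"
        for pole, variant in product(POLES, VARIANTS)
    ]


files = expected_vector_files("Qwen2.5-7B-Instruct")
print(len(files))  # 10 poles x 3 variants = 30
```

A list like this can serve as a checklist after a download, for example to spot a missing file before starting an experiment.
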
**Supported models:**

| Model | HuggingFace ID |
|-------|----------------|
| Qwen2.5-7B-Instruct | `Qwen/Qwen2.5-7B-Instruct` |
| Qwen2.5-14B-Instruct | `Qwen/Qwen2.5-14B-Instruct` |
| Qwen3-4B-Instruct-2507 | `Qwen/Qwen3-4B-Instruct-2507` |
| Meta-Llama-3.1-8B-Instruct | `meta-llama/Meta-Llama-3.1-8B-Instruct` |

**OCEAN traits (10 poles across 5 dimensions):**

| Dimension | High Pole | Low Pole |
|-----------|-----------|----------|
| Openness | `inventive` | `consistent` |
| Conscientiousness | `dependable` | `careless` |
| Extraversion | `outgoing` | `solitary` |
| Agreeableness | `compassionate` | `self-interested` |
| Neuroticism | `nervous` | `calm` |

**Vector variants per trait:**

- `{trait}_prompt_avg_diff.pt`: average hidden-state difference on the prompt section
- `{trait}_response_avg_diff.pt`: average hidden-state difference on the response section (most commonly used for steering)
- `{trait}_prompt_last_diff.pt`: last-token hidden-state difference on the prompt section

### Loading a vector

```python
from huggingface_hub import hf_hub_download
import torch

path = hf_hub_download(
    repo_id="xiachongfeng/persona",
    filename="vectors/Qwen2.5-7B-Instruct/inventive_response_avg_diff.pt",
    repo_type="dataset",
)
vector = torch.load(path)
print(vector.shape)
```

### Downloading all vectors for one model

```bash
huggingface-cli download xiachongfeng/persona \
  --repo-type dataset \
  --include "vectors/Qwen2.5-7B-Instruct/*" \
  --local-dir ./persona_vectors_download
```

## SFT Dataset

The `dataset/` directory contains supervised fine-tuning data across 8 trait categories. Each category has three `.jsonl` files:

- `normal.jsonl`: baseline responses
- `misaligned_1.jsonl`: misaligned variant 1
- `misaligned_2.jsonl`: misaligned variant 2

These files were used to train the SFT baselines and to run the training-time steering experiments reported in the paper.

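As a minimal sketch, one category's three files can be read with the standard library once the `dataset/` directory is available locally. The record schema is not documented in this card, so each line is parsed as generic JSON with no assumed field names; `load_category` and `data_root` are hypothetical names introduced for illustration.

```python
import json
from pathlib import Path

# Split file names as listed in this card (without the .jsonl extension).
SPLITS = ("normal", "misaligned_1", "misaligned_2")


def load_category(data_root, category):
    """Return {split_name: [parsed JSON records]} for one trait category.

    `data_root` is assumed to be a local copy of the `dataset/` directory.
    """
    records = {}
    for split in SPLITS:
        path = Path(data_root) / category / f"{split}.jsonl"
        with path.open(encoding="utf-8") as f:
            # JSON Lines: one JSON value per non-empty line.
            records[split] = [json.loads(line) for line in f if line.strip()]
    return records
```

For example, `load_category("./persona_data_download/dataset", "sycophancy")` would return a dictionary with one list of records per split, assuming that local path exists.
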
The evaluation pipeline is adapted from the [Emergent Misalignment](https://github.com/emergent-misalignment/emergent-misalignment) codebase; please also cite that work if you use these datasets.

### Downloading the dataset

```bash
huggingface-cli download xiachongfeng/persona \
  --repo-type dataset \
  --include "dataset/*" \
  --local-dir ./persona_data_download
```

## Citation

```bibtex
@inproceedings{feng2026persona,
  title={PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra},
  author={Feng, Xiachong and Zhao, Liang and Zhong, Weihong and Huang, Yichong and Gu, Yuxuan and Kong, Lingpeng and Feng, Xiaocheng and Qin, Bing},
  booktitle={The Fourteenth International Conference on Learning Representations (ICLR)},
  year={2026}
}
```

## License

MIT License. See the [GitHub repository](https://github.com/xiachongfeng/persona) for the full text.

Provided by:

xiachongfeng
