milanow/PersonaMem-v2
Source: Hugging Face · Updated 2026-04-21 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/milanow/PersonaMem-v2
Resource description:
---
license: mit
task_categories:
- question-answering
- text-generation
language:
- en
tags:
- personalization
- memory
- long-context
configs:
- config_name: benchmark
  data_files:
  - split: benchmark_multimodal
    path: benchmark/multimodal/benchmark.csv
  - split: train_multimodal
    path: benchmark/multimodal/train.csv
  - split: val_multimodal
    path: benchmark/multimodal/val.csv
  - split: benchmark_text
    path: benchmark/text/benchmark.csv
  - split: train_text
    path: benchmark/text/train.csv
  - split: val_text
    path: benchmark/text/val.csv
pretty_name: 'PersonaMem v2, Implicit Persona, LLM Personalization'
---
# PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory
#### 🚨 The paper is now released. View the full paper [here](https://www.arxiv.org/pdf/2512.06688) and codebase [here](https://github.com/bowen-upenn/PersonaMem-v2).
#### 🙌 The dataset has been downloaded over 12,000 times. Thank you everybody for finding our work helpful!
---
**Personalization is becoming the next milestone of artificial super-intelligence.** AI cannot always satisfy every user, especially on tasks with subjective goals, but personalization offers a path toward **pluralistic alignment**. **PersonaMem-v2** is the new state-of-the-art LLM-personalization dataset focusing on **implicit personas in LLMs**, where user–chatbot conversations implicitly indicate user preferences. For example, a user might **unintentionally** reveal a seasonal allergy in their email content **while only asking a chatbot to refine the wording of that email**. Our goal is to mimic realistic user personas and users' long-form conversation histories with chatbots, in order to study how well AI systems can memorize and infer these implicit signals and understand the users they are interacting with, and therefore **provide personalized responses** over time to enhance user experience.
- For questions, please reach out to Bowen Jiang (Lauren) at bwjiang@seas.upenn.edu, or submit an issue on [GitHub](https://github.com/bowen-upenn/PersonaMem-v2).
### ⚠️ Warning
**We strive to provide the best personalization benchmark for the open-source community.** However, all data are LLM-generated, and we cannot manually verify every sample. Although we have applied multiple quality and safety filters, the dataset may still contain stereotypical biases or other harmful information. All sensitive or private user information is pseudonymized and was generated without real-world inputs, but it may still coincidentally and unintentionally match real-world data.
## ✨ Key Features
### 🌍 1,000 Comprehensive and High-Quality Personas
- We cover almost all geographical locations, as well as racial and gender identities.
- Each persona contains an **unfixed set of attributes**, but always includes:
  - Short persona descriptions from [PersonaHub](https://huggingface.co/datasets/proj-persona/PersonaHub)
  - Comprehensive demographic information
  - Work and education backgrounds
  - Personal character traits
  - Conversation styles with the chatbot
  - Sensitive user information leaked in the contexts
  - **Mental and physical health and medical backgrounds**
  - **Stereotypical, anti-stereotypical, and neutral user preferences**
- Within each persona, all user preferences are carefully curated to remove **repetition** and **contradictions**
### 💬 Wide Coverage of Chatbot Use Cases
- Implicit user preferences are revealed across realistic daily tasks that a user might bring to a chatbot, such as:
  - Writing improvement on emails and chat messages
  - **Multilingual** translations
  - **Multimodal** photo sharing and questioning
  - Knowledge queries (e.g., *An AI beginner might ask ChatGPT 'What is layer norm in transformers?' and return to similar topics multiple times.*)
  - Therapy consultations
  - Medical and health concerns
### 🔒 Privacy-Aware Design
- Users can request the chatbot **not to remember** certain preferences
- All target answers avoid **sensitive or private user information**, even if the user has leaked such information in the history
### 🗂️ Other Features
- We distinguish between preferences that belong to the users **themselves** and those referring to **others**, such as questions from public math/coding benchmarks irrelevant to the current user's preferences.
- **Preference updates over time**
- All conversation sections are **multi-turn** (2–5 turns)
- All user preferences and queries are marked with their **categories**.
- In total, we have **1,000** personas, **26,100** user preferences, conversation snippets, and Q&A pairs, **325** conversation topics, and **335** user query topics.
### 🎯 Powered by GPT-5
- Every persona, preference, conversation snippet, and Q&A is generated by [**GPT-5**](https://platform.openai.com/docs/models/gpt-5-chat-latest) without the auto-router, in addition to a series of quality filtering processes to improve the data quality.
---
## 📂 Dataset Contents
### Core
- **`benchmark/text/benchmark.csv`**
Contains **5000 user queries** as our benchmark data. Each entry is associated with a link to its corresponding chat history file, the target answer, the user preference, related conversation snippets, and additional metadata. **To run benchmark evaluation with long context, append the user_query to the end of its chat history.** Each context window holds up to 32k or 128k tokens and is text-only.
- **`benchmark/text/train.csv`**
Contains **18500 user queries** as our training data. It has the same data structure as `benchmark/text/benchmark.csv` but shares no `persona_id` with it.
- **`benchmark/text/val.csv`**
Contains **2600 user queries** as our validation data. It has the same data structure as `benchmark/text/benchmark.csv` but shares no `persona_id` with it.
- **`benchmark/multimodal/benchmark.csv`**
Contains **5000 user queries** as our benchmark data. It has the same data structure as `benchmark/text/benchmark.csv`, but its chat histories contain both images and text.
- **`benchmark/multimodal/train.csv`**
Contains **19200 user queries** as our training data. It has the same data structure as `benchmark/multimodal/benchmark.csv` but shares no `persona_id` with it.
- **`benchmark/multimodal/val.csv`**
Contains **2110 user queries** as our validation data. It has the same data structure as `benchmark/multimodal/benchmark.csv` but shares no `persona_id` with it.
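Two points from the list above can be sketched in a few lines: appending `user_query` to a chat history before evaluation, and checking that two splits share no `persona_id`. This is a minimal sketch under stated assumptions, not the authors' evaluation code. The helper names `append_query` and `persona_ids` are hypothetical, the chat history is assumed to be a list of role/content messages, and the toy CSV text only borrows the two column names mentioned in this card (the full column layout is in `column_descriptions.md`).

```python
import csv
import io


def append_query(chat_history, user_query):
    """Return a new message list with the query added as the final user turn."""
    return list(chat_history) + [{"role": "user", "content": user_query}]


def persona_ids(csv_text):
    """Collect the distinct persona_id values found in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["persona_id"] for row in reader}


# Toy stand-ins for a chat history file and two split files (not real data).
history = [
    {"role": "user", "content": "Can you help improve the language in this email?"},
    {"role": "assistant", "content": "Subject: Spring STEM Workshop Location Discussion"},
]
query = "What are some good ways to keep the air in my home fresh in the spring?"
prompt = append_query(history, query)

train_csv = "persona_id,user_query\n1,q1\n2,q2\n"
benchmark_csv = "persona_id,user_query\n3,q3\n"
shared = persona_ids(train_csv) & persona_ids(benchmark_csv)

print(len(prompt), prompt[-1]["role"], shared)
```

An empty `shared` set is what the no-overlap statement predicts for any pair of splits within the same modality.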
### Supplementary
- **`data/raw_data/*.json`**
Contains **1,000 persona files**, each including the full persona, user preferences, the user–chatbot conversation snippets related to each preference, Q&A pairs, and other metadata.
- **`data/chat_history_32k/*.json`**
Contains **1,000 chat history files**, each paired with a corresponding persona file under `raw_data/`, mimicking the user–chatbot conversation history of that user, up to 32k tokens. All histories are text-only.
- **`data/chat_history_multimodal_32k/*.json`**
Contains **1,000 chat history files**, each paired with a corresponding persona file under `raw_data/`, mimicking the user–chatbot conversation history of that user, up to 32k tokens. Histories might contain base64-encoded images.
- **`data/chat_history_128k/*.json`**
Contains **1,000 chat history files**, each paired with a corresponding persona file under `raw_data/`, mimicking the user–chatbot conversation history of that user, up to 128k tokens. All histories are text-only.
- **`data/chat_history_multimodal_128k/*.json`**
Contains **1,000 chat history files**, each paired with a corresponding persona file under `raw_data/`, mimicking the user–chatbot conversation history of that user, up to 128k tokens. Histories might contain base64-encoded images.
- **`combined_irrelevant_data.json`**
Contains **1,545** user–chatbot conversation snippets **unrelated to any persona**. Each snippet can be multi-turn, focusing on mathematical and coding problems from public benchmarks, including but not limited to [GSM8K](https://huggingface.co/datasets/openai/gsm8k), [Omni-Math](https://huggingface.co/datasets/KbsdJames/Omni-MATH), and [BigCode-Bench](https://huggingface.co/datasets/bigcode/bigcodebench). We also simulated multi-round debugging interactions which can be used to fill up context windows.
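The filler snippets described above can pad a history toward a target length. The sketch below shows one hypothetical way to do that and is not the authors' pipeline. `count_tokens` and `pad_history` are illustrative names, the whitespace word count is a crude stand-in for a real tokenizer, and the assumption that each snippet is a list of role/content messages is not confirmed by this card.

```python
def count_tokens(messages):
    """Crude size estimate: whitespace-separated words (a real tokenizer would differ)."""
    return sum(len(m["content"].split()) for m in messages)


def pad_history(history, filler_snippets, budget):
    """Append whole filler snippets while the estimated size stays within budget."""
    padded = list(history)
    used = count_tokens(padded)
    for snippet in filler_snippets:
        cost = count_tokens(snippet)
        if used + cost > budget:
            break  # stop at the first snippet that would exceed the budget
        padded.extend(snippet)
        used += cost
    return padded


# Toy data only; real snippets would come from combined_irrelevant_data.json.
history = [{"role": "user", "content": "Please refine this email."}]
fillers = [
    [
        {"role": "user", "content": "Solve 2 + 2."},
        {"role": "assistant", "content": "It is 4."},
    ],
    [
        {"role": "user", "content": "Fix this loop bug."},
        {"role": "assistant", "content": "Use range(n) instead."},
    ],
]
padded = pad_history(history, fillers, budget=12)
print(len(padded), count_tokens(padded))
```

Snippets are added whole so that a multi-turn exchange is never cut in the middle; swapping in a model-specific tokenizer only requires replacing `count_tokens`.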
## 🧩 Dataset Structure Example
- Each row of **`benchmark/text/benchmark.csv`** or **`benchmark/multimodal/benchmark.csv`** contains a comprehensive set of columns to support usage and exploration, described in detail in [column_descriptions.md](column_descriptions.md).
- Each persona JSON file under **`data/raw_data/*.json`** follows the structure shown below. We use the key name `preference` to denote either a user preference or a persona trait.
```json
{
"short_persona": "A science whiz who loves to incorporate physics concepts into unique toy car races",
"name": "Amara Nwosu",
"age": 28,
"gender": "Female",
"sexual_orientation": "Heterosexual",
"race_ethnicity": "Nigerian (Igbo) descent, Sub-Saharan African",
"nationality": "Nigerian-American",
"languages_spoken": [
"English (native fluency)",
"Igbo (conversational)",
"French (basic)"
],
......
[Unfixed sets of demographic attributes, such as name, gender, racial identities, occupation, education, etc]
......
[Unfixed sets of persona attributes]
......
"speaking_style_to_chatbot": {......},
"stereotypical_preferences": ["Loves spicy jollof rice and suya", ......],
"anti_stereotypical_preferences": ["Enjoys minimalist and Scandinavian-inspired home design", ......],
"neutral_preferences": ["Loves homemade pizza nights", ......],
"therapy_background": ["Difficulty setting boundaries with extended family members seeking financial support.", ......],
"health_and_medical_conditions": ["History of iron-deficiency anemia managed through diet and supplements", ......],
"sensitive_information": {
"physical_address": ......,
"llm_api_key": ......,
......
},
"matched_images": ["data/photobook_images/person_suitcase/COCO_train2014_000000538682.jpg", ......],
"preference_updates": {"Keeps up with Nollywood movie releases": "Does not keep up with Nollywood movie releases", ......},
"conversations": {
"personal_email": [
{
"preference": "Seasonal allergic rhinitis triggered by pollen",
"pref_type": "health_and_medical_conditions",
"who": "self",
"conversations": [
{
"role": "user",
"content": "Can you help improve the language in this email?\n\nSubject: Workshop Scheduling for Spring STEM Program \n \nHello Dr. Ramirez, \n \nI hope you're doing well. As we finalize the April schedule for the outdoor physics racetrack workshop, I wanted to suggest considering an alternative indoor space this year. The spring pollen levels around that time tend to get quite high in the sculpture garden, and it could make the session less comfortable for both me and some of the students. \n \nAn indoor setup would still allow us to incorporate the same educational activities — I can adjust the track layout so it maintains that exciting curve leading into a sharp acceleration zone. Plus, we can use the floor-to-ceiling windows to bring in plenty of natural light without the sneezing competition becoming a side event. \n \nLet me know what you think, and I’d be happy to work with the facilities team to adapt the plan. \n \nWarm regards, \nAmara Nwosu \nSTEM Education Program Designer \nSmithsonian Science Education Center"
},
{
"role": "assistant",
"content": "Subject: Spring STEM Workshop Location Discussion \n \nDear Dr. Ramirez, \n \nI hope this message finds you well. As we finalize the April schedule for our outdoor physics racetrack workshop, I’d like to propose exploring an indoor venue for this year’s session. The sculpture garden can be beautiful in spring, but the elevated pollen levels at that time may cause discomfort for both myself and some students. \n \nAn indoor setting would allow us to deliver the same engaging activities. I can adapt the racetrack design to preserve the signature curve into the acceleration zone, and the floor-to-ceiling windows will still bring in abundant natural light—without the unintended “sneezing competition” on the sidelines. \n \nPlease let me know your thoughts. I’d be glad to coordinate with the facilities team to make any necessary adjustments. \n \nWarm regards, \nAmara Nwosu \nSTEM Education Program Designer \nSmithsonian Science Education Center"
}
],
"updated": false,
"topic_preference": "health_and_medical",
"user_query": "What are some good ways to keep the air in my home fresh in the spring?",
"topic_query": "Health",
"correct_answer": "Since pollen makes your sinuses rebel like cars on a track full of speed bumps, try keeping windows shut during peak bloom hours and using a HEPA-filter air purifier to trap those microscopic troublemakers. You might even enjoy tracking daily pollen counts online like a science race scoreboard—low counts mean you can safely air things out. Adding a shallow dish of water near a sunny window can also subtly humidify and freshen the indoor atmosphere without letting the pollen in.",
"incorrect_answers": [
"If you’re sensitive to loud noises at night, think of your home like a racetrack where you control the acoustic drag—keep windows closed during busy traffic hours and use blackout curtains for sound insulation along with fresh air filtration. A quiet, steady-speed ceiling fan can circulate air without inviting in the street noise. For a spring twist, add a potted mint plant inside for a gentle, non-intrusive scent.",
"If you love filling your home with floral scents in spring, you could keep windows open in the mornings when city air tends to be cooler and calmer, then close them to preserve that fragrant air. A small oscillating fan can help waft those aromas gently through your apartment like a calm breeze over a garden path. For an extra sensory touch, place a few fresh-cut flowers in water near a sunbeam to play with both scent and light.",
"To keep your home’s air feeling fresh in spring, open windows during cooler, less polluted hours, then close them to maintain the clean air you’ve drawn in. Using an air purifier and adding houseplants can help freshen things further. A gentle fan or open interior doors will keep that freshness moving through your rooms."
]
},
......
],
"professional_email": [......],
"translation": [......],
"chat_message": [......],
"trouble_consult": [......],
"social_media_post": [......],
"knowledge_query": [......],
"multimodal": [......],
},
}
```
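The nested `conversations` object above can be walked to collect one Q&A item per preference entry. The sketch below is illustrative only: `iter_qa` and `to_multiple_choice` are hypothetical helpers, the toy persona is abbreviated to the keys it needs, and presenting `correct_answer` with `incorrect_answers` as a shuffled multiple-choice item is an assumption about usage rather than the documented evaluation protocol.

```python
import random


def iter_qa(persona):
    """Yield (task_type, entry) pairs for every preference entry in a persona dict."""
    for task_type, entries in persona.get("conversations", {}).items():
        for entry in entries:
            yield task_type, entry


def to_multiple_choice(entry, rng):
    """Shuffle correct and incorrect answers; return the options and the gold index."""
    options = [entry["correct_answer"]] + list(entry["incorrect_answers"])
    rng.shuffle(options)
    return options, options.index(entry["correct_answer"])


# Abbreviated toy persona that reuses key names from the structure above.
persona = {
    "name": "Amara Nwosu",
    "conversations": {
        "personal_email": [
            {
                "preference": "Seasonal allergic rhinitis triggered by pollen",
                "who": "self",
                "updated": False,
                "user_query": "What are some good ways to keep the air in my home fresh in the spring?",
                "correct_answer": "answer tailored to the pollen allergy",
                "incorrect_answers": ["distractor A", "distractor B", "distractor C"],
            }
        ]
    },
}

rng = random.Random(0)  # fixed seed so the shuffle is reproducible
for task_type, entry in iter_qa(persona):
    options, gold = to_multiple_choice(entry, rng)
    print(task_type, entry["user_query"], gold)
```

Keeping the gold index alongside the shuffled options avoids any positional bias from always listing the correct answer first.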
**Citation**
If you find our work helpful, please consider citing it. Thank you!
@article{jiang2025personamem,
title={PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory},
author={Jiang, Bowen and Yuan, Yuan and Shen, Maohao and Hao, Zhuoqun and Xu, Zhangchen and Chen, Zichen and Liu, Ziyi and Vijjini, Anvesh Rao and He, Jiashu and Yu, Hanchao and others},
journal={arXiv preprint arXiv:2512.06688},
year={2025}
}
@article{jiang2025know,
title={Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale},
author={Jiang, Bowen and Hao, Zhuoqun and Cho, Young-Min and Li, Bryan and Yuan, Yuan and Chen, Sihao and Ungar, Lyle and Taylor, Camillo J and Roth, Dan},
journal={arXiv preprint arXiv:2504.14225},
year={2025}
}
Link to [PersonaMem-v1](https://huggingface.co/datasets/bowen-upenn/PersonaMem).
Provided by:
milanow



