---
license: apache-2.0
---
# Dataset Card
<!-- Provide a quick summary of the dataset. -->
HR-MultiWOZ is a fully labeled dataset of 550 conversations spanning 10 HR domains, intended for evaluating LLM agents. It is the first labeled open-source conversation dataset in the HR domain for NLP research.
Please refer to [HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent](https://arxiv.org/pdf/2402.01018.pdf) for details about the dataset construction.
### Dataset Description
- **Language(s) (NLP):** English
- **License:** MIT
### Dataset Sources
<!-- Provide the basic links for the dataset. -->
- **Repository:** [xwjzds/extractive_qa_question_answering_hr](https://huggingface.co/datasets/xwjzds/extractive_qa_question_answering_hr)
- **Paper:** [HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent](https://arxiv.org/pdf/2402.01018.pdf)
- **Leaderboard:** [github repo](https://github.com/amazon-science/hr-multiwoz-tod-llm-agent)
### Direct Use
<!-- This section describes suitable use cases for the dataset. -->
This dataset is designed to evaluate the transfer-learning ability of task-oriented dialogue (TOD) systems and SGD algorithms.
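As a rough illustration of such an evaluation, the sketch below scores predicted dialogue states against gold states. The function names `slot_accuracy` and `turn_exact_match`, the case-insensitive string matching, and the `pred` values are assumptions made for this example; they are not the metrics or results reported in the paper.

```python
def _normalize(state):
    """Lower-case and strip slot values so trivial differences are ignored."""
    return {slot: str(value).strip().lower() for slot, value in state.items()}

def slot_accuracy(gold_states, pred_states):
    """Fraction of gold slots whose predicted value matches (hypothetical metric)."""
    if len(gold_states) != len(pred_states):
        raise ValueError("gold and predicted states must have the same length")
    total = correct = 0
    for gold, pred in zip(gold_states, pred_states):
        gold_n, pred_n = _normalize(gold), _normalize(pred)
        for slot, value in gold_n.items():
            total += 1
            correct += pred_n.get(slot) == value
    return correct / total if total else 0.0

def turn_exact_match(gold_states, pred_states):
    """Fraction of turns whose whole predicted state equals the gold state."""
    if len(gold_states) != len(pred_states):
        raise ValueError("gold and predicted states must have the same length")
    if not gold_states:
        return 0.0
    hits = sum(_normalize(g) == _normalize(p) for g, p in zip(gold_states, pred_states))
    return hits / len(gold_states)

# Gold states taken from the example record; predictions are made up.
gold = [
    {"training_topic": "machine learning",
     "special_requirements": "no special accommodations are needed"},
    {"number_of_participants": "Three", "training_duration": "two days"},
]
pred = [
    {"training_topic": "Machine Learning", "special_requirements": "none"},
    {"number_of_participants": "three", "training_duration": "two days"},
]

print(slot_accuracy(gold, pred))     # 3 of 4 slots match
print(turn_exact_match(gold, pred))  # 1 of 2 turns match fully
```

Exact string matching is strict for free-text slots such as `special_requirements`; a softer comparison (for example token overlap) may suit some use cases better.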
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the dataset will not work well for. -->
This dataset is not intended for use in training.
## Dataset Structure
### Data Instances
A typical data entry consists of the area of the conversation, the turn IDs, the utterances, and the dialogue state. Below is an example from the dataset:
```python
utterance = ["HR Assistant: No problem, tell me a bit about what kind of training you're looking for. What will you guys be learning about and do you need any special accommodations?", "Employee: We're hoping to improve our machine learning research skills. But no special accommodations are needed.", 'HR Assistant: Great, how many people will be attending this training? And approximately how long do you think it should last? ', "Employee: Three people will be participating and we'd like it to run for two days.", 'HR Assistant: Okay, thanks. Do you have a budget in mind for this? It would also help to know what your goals are for your team after the training.', 'Employee: Our budget is $5000. We want to come away with better research skills.', 'HR Assistant: No problem, when works best for your schedule? Also, do you have a preference for how the training is delivered, like online or in-person?', "Employee: We're hoping for some online seminars on June 15th, 2023.", "HR Assistant: Sounds good. One last thing - would you say your team's current skill level is beginner, intermediate or advanced? And can you provide the contact info for the person coordinating it all?", 'Employee: Their level is intermediate. The coordinator is Henry Kim and you can reach him at hkim@researchlab.com or 555-1234.']
state = [{'training_topic': 'machine learning', 'special_requirements': 'no special accommodations are needed'}, {'number_of_participants': 'Three', 'training_duration': 'two days'}, {'budget_constraints': '$5000', 'desired_training_outcomes': 'better research skills'}, {'preferred_training_dates': 'June 15th, 2023', 'preferred_training_format': 'online seminars'}, {'current_skill_level': 'intermediate', 'contact_information': 'hkim@researchlab.com or 555-1234'}]
area = ['training_request']
```
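In the example, `state` has one entry for every two utterances, which suggests that each state dict labels one HR Assistant question together with the Employee reply that follows it. A minimal sketch of that pairing, assuming strict alternation between the two speakers, is shown here; `pair_exchanges` is a hypothetical helper and the utterances are shortened for readability.

```python
# Shortened copy of the example record; real records hold full utterances.
utterance = [
    "HR Assistant: What kind of training are you looking for?",
    "Employee: We're hoping to improve our machine learning research skills.",
    "HR Assistant: How many people will be attending?",
    "Employee: Three people will be participating.",
]
state = [
    {"training_topic": "machine learning"},
    {"number_of_participants": "Three"},
]

def pair_exchanges(utterance, state):
    """Group utterances into (assistant, employee, state) exchanges.

    Assumes utterances strictly alternate HR Assistant / Employee and that
    state[i] labels the i-th Employee reply, as in the example record.
    """
    if len(utterance) != 2 * len(state):
        raise ValueError("expected two utterances per state entry")
    assistant_turns = utterance[0::2]  # even positions: HR Assistant
    employee_turns = utterance[1::2]   # odd positions: Employee
    return list(zip(assistant_turns, employee_turns, state))

exchanges = pair_exchanges(utterance, state)
for assistant, employee, slots in exchanges:
    print(employee, "->", slots)
```

Records that do not follow the alternating pattern raise a `ValueError` rather than being silently misaligned.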
### Data Fields
The dataset comprises the following fields:
- `dialogue_id`: an int identifying the dialogue
- `service`: a list containing the relevant area(s) of the dialogue
- `turn_id`: a list of ints giving the ID of each turn
- `speaker`: a list of ints, where 0 represents the HR Assistant and 1 represents the Employee
- `utterance`: a list of strings holding the utterances of the conversation
- `state`: a list of dicts, one per Employee reply, mapping each entity (slot) to the answer given in that turn
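The field descriptions above can be turned into a simple structural check. The sketch below is illustrative only: `validate_record` is a hypothetical helper, the sample record is hand-written rather than copied from the dataset, and the rule that `turn_id`, `speaker`, and `utterance` have equal lengths is an assumption.

```python
def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    if not isinstance(record.get("dialogue_id"), int):
        problems.append("dialogue_id should be an int")
    if not isinstance(record.get("service"), list):
        problems.append("service should be a list")
    n_utterances = len(record.get("utterance", []))
    # Assumption: the per-turn lists are aligned with the utterances.
    for name in ("turn_id", "speaker"):
        if len(record.get(name, [])) != n_utterances:
            problems.append(f"{name} length differs from utterance length")
    if any(s not in (0, 1) for s in record.get("speaker", [])):
        problems.append("speaker values should be 0 (HR Assistant) or 1 (Employee)")
    if not all(isinstance(s, dict) for s in record.get("state", [])):
        problems.append("state entries should be dicts")
    return problems

# Hand-written record for illustration; the values are not taken from the dataset.
record = {
    "dialogue_id": 0,
    "service": ["training_request"],
    "turn_id": [0, 1],
    "speaker": [0, 1],
    "utterance": [
        "HR Assistant: What kind of training are you looking for?",
        "Employee: We'd like to improve our machine learning research skills.",
    ],
    "state": [{"training_topic": "machine learning"}],
}

print(validate_record(record))
```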
## Dataset Creation
Please refer to [HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent](https://arxiv.org/pdf/2402.01018.pdf) for details about the dataset construction.
#### Who are the source data producers?
<!-- This section describes the people or systems who originally created the data. It should also include self-reported demographic or identity information for the source data creators if this information is available. -->
Not Amazon
#### Personal and Sensitive Information
<!-- State whether the dataset contains data that might be considered personal, sensitive, or private (e.g., data that reveals addresses, uniquely identifiable names or aliases, racial or ethnic origins, sexual orientations, religious beliefs, political opinions, financial or health data, etc.). If efforts were made to anonymize the data, describe the anonymization process. -->
None
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
This dataset is in English and contains synthetic problems.
## Citation
If you find this work useful, please cite the paper as follows:
```
@inproceedings{xu-etal-2024-hr,
title = "{HR}-{M}ulti{WOZ}: A Task Oriented Dialogue ({TOD}) Dataset for {HR} {LLM} Agent",
author = "Xu, Weijie and
Huang, Zicheng and
Hu, Wenxiang and
Fang, Xi and
Cherukuri, Rajesh and
Nayyar, Naumaan and
Malandri, Lorenzo and
Sengamedu, Srinivasan",
editor = "Hruschka, Estevam and
Lake, Thom and
Otani, Naoki and
Mitchell, Tom",
booktitle = "Proceedings of the First Workshop on Natural Language Processing for Human Resources (NLP4HR 2024)",
month = mar,
year = "2024",
address = "St. Julian{'}s, Malta",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.nlp4hr-1.5",
pages = "59--72",
abstract = "Recent advancements in Large Language Models (LLMs) have been reshaping Natural Language Processing (NLP) task in several domains. Their use in the field of Human Resources (HR) has still room for expansions and could be beneficial for several time consuming tasks. Examples such as time-off submissions, medical claims filing, and access requests are noteworthy, but they are by no means the sole instances. However the aforementioned developments must grapple with the pivotal challenge of constructing a high-quality training dataset. On one hand, most conversation datasets are solving problems for customers not employees. On the other hand, gathering conversations with HR could raise privacy concerns. To solve it, we introduce HR-Multiwoz, a fully-labeled dataset of 550 conversations spanning 10 HR domains. Our work has the following contributions:(1) It is the first labeled open-sourced conversation dataset in the HR domain for NLP research. (2) It provides a detailed recipe for the data generation procedure along with data analysis and human evaluations. The data generation pipeline is transferrable and can be easily adapted for labeled conversation data generation in other domains. (3) The proposed data-collection pipeline is mostly based on LLMs with minimal human involvement for annotation, which is time and cost-efficient.",
}
```