
xwjzds/hr_multiwoz_tod_sgd

Hugging Face | Updated 2024-03-19 | Indexed 2024-06-15
Download link:
https://hf-mirror.com/datasets/xwjzds/hr_multiwoz_tod_sgd
Resource overview:
---
license: apache-2.0
---

# Dataset Card

HR-Multiwoz is a fully-labeled dataset of 550 conversations spanning 10 HR domains to evaluate LLM agents. It is the first labeled open-sourced conversation dataset in the HR domain for NLP research.

Please refer to [HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent](https://arxiv.org/pdf/2402.01018.pdf) for details about the dataset construction.

### Dataset Description

- **Language(s) (NLP):** English
- **License:** MIT

### Dataset Sources

- **Repository:** [xwjzds/extractive_qa_question_answering_hr](https://huggingface.co/datasets/xwjzds/extractive_qa_question_answering_hr)
- **Paper:** [HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent](https://arxiv.org/pdf/2402.01018.pdf)
- **Leaderboard:** [github repo](https://github.com/amazon-science/hr-multiwoz-tod-llm-agent)

### Direct Use

This dataset has been designed to evaluate the transfer learning ability of TOD systems and the SGD algorithm.

### Out-of-Scope Use

This dataset is not intended for use in training.

## Dataset Structure

### Data Instances

A typical data entry in the dataset consists of the area of the conversation, turn_id, utterance, and state. Below is an example from the dataset:

```python
# Alternating HR Assistant / Employee turns of one dialogue.
utterance = [
    "HR Assistant: No problem, tell me a bit about what kind of training you're looking for. What will you guys be learning about and do you need any special accommodations?",
    "Employee: We're hoping to improve our machine learning research skills. But no special accommodations are needed.",
    "HR Assistant: Great, how many people will be attending this training? And approximately how long do you think it should last?",
    "Employee: Three people will be participating and we'd like it to run for two days.",
    "HR Assistant: Okay, thanks. Do you have a budget in mind for this? It would also help to know what your goals are for your team after the training.",
    "Employee: Our budget is $5000. We want to come away with better research skills.",
    "HR Assistant: No problem, when works best for your schedule? Also, do you have a preference for how the training is delivered, like online or in-person?",
    "Employee: We're hoping for some online seminars on June 15th, 2023.",
    "HR Assistant: Sounds good. One last thing - would you say your team's current skill level is beginner, intermediate or advanced? And can you provide the contact info for the person coordinating it all?",
    "Employee: Their level is intermediate. The coordinator is Henry Kim and you can reach him at hkim@researchlab.com or 555-1234.",
]

# One dict per Employee turn: the entities and answers given in that turn.
state = [
    {"training_topic": "machine learning", "special_requirements": "no special accommodations are needed"},
    {"number_of_participants": "Three", "training_duration": "two days"},
    {"budget_constraints": "$5000", "desired_training_outcomes": "better research skills"},
    {"preferred_training_dates": "June 15th, 2023", "preferred_training_format": "online seminars"},
    {"current_skill_level": "intermediate", "contact_information": "hkim@researchlab.com or 555-1234"},
]

area = ["training_request"]
```

### Data Fields

The dataset comprises the following fields:

- `dialogue_id`: an int that is the ID of the dialogue
- `service`: a list containing the relevant area of the dialogue
- `turn_id`: a list of ints giving the turn IDs
- `speaker`: a list of ints where 0 represents the HR Assistant and 1 represents the Employee
- `utterance`: a list of strings representing the conversation
- `state`: the state of entities and answers in each turn after the Employee speaks

## Dataset Creation

Please refer to [HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent](https://arxiv.org/pdf/2402.01018.pdf) for details about the dataset construction.

#### Who are the source data producers?

Not Amazon

#### Personal and Sensitive Information

None

## Bias, Risks, and Limitations

This dataset is in English and contains synthetic problems.

## Citation

If you find this work useful in your method, you can cite the paper as below:

```
@inproceedings{xu-etal-2024-hr,
    title = "{HR}-{M}ulti{WOZ}: A Task Oriented Dialogue ({TOD}) Dataset for {HR} {LLM} Agent",
    author = "Xu, Weijie and Huang, Zicheng and Hu, Wenxiang and Fang, Xi and Cherukuri, Rajesh and Nayyar, Naumaan and Malandri, Lorenzo and Sengamedu, Srinivasan",
    editor = "Hruschka, Estevam and Lake, Thom and Otani, Naoki and Mitchell, Tom",
    booktitle = "Proceedings of the First Workshop on Natural Language Processing for Human Resources (NLP4HR 2024)",
    month = mar,
    year = "2024",
    address = "St. Julian{'}s, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.nlp4hr-1.5",
    pages = "59--72",
    abstract = "Recent advancements in Large Language Models (LLMs) have been reshaping Natural Language Processing (NLP) task in several domains. Their use in the field of Human Resources (HR) has still room for expansions and could be beneficial for several time consuming tasks. Examples such as time-off submissions, medical claims filing, and access requests are noteworthy, but they are by no means the sole instances. However the aforementioned developments must grapple with the pivotal challenge of constructing a high-quality training dataset. On one hand, most conversation datasets are solving problems for customers not employees. On the other hand, gathering conversations with HR could raise privacy concerns. To solve it, we introduce HR-Multiwoz, a fully-labeled dataset of 550 conversations spanning 10 HR domains. Our work has the following contributions:(1) It is the first labeled open-sourced conversation dataset in the HR domain for NLP research. (2) It provides a detailed recipe for the data generation procedure along with data analysis and human evaluations. The data generation pipeline is transferrable and can be easily adapted for labeled conversation data generation in other domains. (3) The proposed data-collection pipeline is mostly based on LLMs with minimal human involvement for annotation, which is time and cost-efficient.",
}
```

Provided by:
xwjzds
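Summary of original information

Dataset overview

Dataset description

  • Name: HR-Multiwoz
  • Language: English
  • License: MIT
  • Description: HR-Multiwoz is a fully labeled dataset of 550 conversations spanning 10 HR domains, intended for evaluating LLM agents. It is the first labeled open-source conversation dataset in the HR domain for NLP research.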

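Data sources

  • Repository: https://huggingface.co/datasets/xwjzds/extractive_qa_question_answering_hr
  • Paper: https://arxiv.org/pdf/2402.01018.pdf
  • Leaderboard: https://github.com/amazon-science/hr-multiwoz-tod-llm-agent

Dataset uses

  • Direct use: The dataset is designed to evaluate the transfer learning ability of TOD systems and the SGD algorithm.
  • Out-of-scope use: The dataset is not intended for training.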

Dataset structure

Data instances

A typical data entry comprises the conversation area, turn_id, utterance, and state. The dataset card above gives a complete example: a training_request dialogue of ten alternating HR Assistant and Employee utterances with five state entries, one after each Employee turn.

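Data fields

The dataset contains the following fields:

  • dialogue_id: the ID of the dialogue, an integer.
  • service: a list containing the area relevant to the dialogue.
  • turn_id: a list of integers giving the turn IDs.
  • speaker: a list of integers identifying the speaker, where 0 is the HR Assistant and 1 is the Employee.
  • utterance: a list of strings representing the conversation.
  • state: the state of entities and answers in each turn after the Employee speaks.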
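As a minimal sketch of how these fields fit together, the snippet below pairs each Employee utterance with the state recorded after it and accumulates the slots across turns. The record is hypothetical: its values are abridged from the example on the dataset card, the `turn_id` and `speaker` values are assumed, and the helper names `cumulative_states` and `employee_turns` are illustrative rather than part of the dataset or any library. It also assumes `state` holds one dict per Employee turn, as in the card's example.

```python
# Hypothetical record shaped like the fields listed above (values abridged
# from the dataset card example; turn_id and speaker values are assumed).
record = {
    "dialogue_id": 0,
    "service": ["training_request"],
    "turn_id": [0, 1, 2, 3],
    "speaker": [0, 1, 0, 1],  # 0 = HR Assistant, 1 = Employee
    "utterance": [
        "HR Assistant: What kind of training are you looking for?",
        "Employee: We're hoping to improve our machine learning research skills.",
        "HR Assistant: How many people will attend, and for how long?",
        "Employee: Three people, and we'd like it to run for two days.",
    ],
    "state": [
        {"training_topic": "machine learning"},
        {"number_of_participants": "Three", "training_duration": "two days"},
    ],
}


def cumulative_states(record):
    """Merge the per-turn state dicts into a running state, one per Employee turn."""
    merged, running = [], {}
    for turn_state in record["state"]:
        running = {**running, **turn_state}  # new dict each turn, later slots win
        merged.append(running)
    return merged


def employee_turns(record):
    """Pair each Employee utterance (speaker == 1) with the cumulative state after it."""
    utterances = [
        text
        for speaker, text in zip(record["speaker"], record["utterance"])
        if speaker == 1
    ]
    return list(zip(utterances, cumulative_states(record)))


pairs = employee_turns(record)
for text, belief in pairs:
    print(text, "->", belief)
```

Accumulating the per-turn dicts is one design choice among several; an evaluation that scores each turn in isolation would use `record["state"]` directly.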

Dataset creation

For details of the dataset construction, see the paper HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent (https://arxiv.org/pdf/2402.01018.pdf).


Collected summary
Dataset introduction
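Construction
HR-Multiwoz was built by simulating dialogue scenarios in the human resources domain, producing 550 fully labeled conversations that span 10 HR domains. Construction relied on a data-generation pipeline based on large language models (LLMs), which labels data efficiently and at low cost while keeping human involvement minimal and maintaining annotation quality.
Characteristics
As the first open-source conversation dataset in the HR domain, HR-Multiwoz is characterized by fully labeled dialogues and diverse dialogue scenarios. The dataset is in English and covers HR sub-domains such as recruitment, training, and performance review, making it a useful resource for NLP research in HR. Its structured design also makes dialogue state tracking feasible, which supports model evaluation.
Usage
Researchers can use HR-Multiwoz as an evaluation tool for task-oriented dialogue (TOD) systems and the SGD algorithm. The dataset can be loaded directly from its Hugging Face repository, and its fields (dialogue ID, service area, turn ID, speaker, utterance, and state) support data analysis and model evaluation. In line with the dataset card, it should not be used for training; it is intended mainly for model evaluation and performance analysis.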
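To make the evaluation use concrete, here is a minimal sketch of two metrics commonly used in dialogue state tracking: slot accuracy and a per-turn joint goal accuracy. The source does not specify an evaluation protocol, so the metric choice, the lowercase and whitespace normalization, and the function names are all assumptions; the gold and predicted states are toy values modeled on the dataset card example.

```python
def normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences still match."""
    return " ".join(str(value).lower().split())


def slot_accuracy(gold_states, pred_states):
    """Fraction of gold slots whose predicted value matches after normalization."""
    total = correct = 0
    for gold, pred in zip(gold_states, pred_states):
        for slot, value in gold.items():
            total += 1
            if slot in pred and normalize(pred[slot]) == normalize(value):
                correct += 1
    return correct / total if total else 0.0


def joint_goal_accuracy(gold_states, pred_states):
    """Fraction of turns whose whole predicted state equals the gold state."""
    if not gold_states:
        return 0.0
    matches = sum(
        {k: normalize(v) for k, v in gold.items()}
        == {k: normalize(v) for k, v in pred.items()}
        for gold, pred in zip(gold_states, pred_states)
    )
    return matches / len(gold_states)


# Toy gold states (from the card example) and hypothetical model predictions.
gold = [
    {"training_topic": "machine learning"},
    {"number_of_participants": "Three", "training_duration": "two days"},
]
pred = [
    {"training_topic": "Machine Learning"},
    {"number_of_participants": "three", "training_duration": "2 days"},
]

slot_acc = slot_accuracy(gold, pred)   # 2 of 3 gold slots match
jga = joint_goal_accuracy(gold, pred)  # 1 of 2 turns match exactly
print(f"slot accuracy = {slot_acc:.3f}, joint goal accuracy = {jga:.3f}")
```

Exact string matching is strict for free-text slots such as `training_duration` ("two days" versus "2 days"), so a real evaluation might prefer a fuzzier comparison; that choice is left open here.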
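Background and challenges
Background overview
HR-Multiwoz was created in 2024 by Weijie Xu and colleagues. It is the first fully labeled conversation dataset in the human resources (HR) domain for natural language processing (NLP) research. It contains 550 conversations covering 10 HR domains and is intended for evaluating large language model (LLM) agents, in particular the transfer learning ability of task-oriented dialogue (TOD) systems and the SGD algorithm. The dataset offers a benchmark for NLP applications in HR and for related research.
Current challenges
The main challenges in building HR-Multiwoz were ensuring data quality and annotation consistency, especially when handling sensitive HR information; balancing the representation of different HR domains to suit varied application scenarios; and designing an efficient data-generation process that reduces labor cost and time. The dataset also faces challenges in application, such as how to accurately assess the performance of LLM agents in real-world HR scenarios and how to handle potential bias and risk in the data.
Common scenarios
Classic use cases
In natural language processing, HR-Multiwoz is used to evaluate the transfer learning ability of task-oriented dialogue (TOD) systems and the SGD algorithm. By simulating HR conversations, it provides 550 fully labeled dialogues covering 10 HR sub-domains, giving researchers and developers a consistent way to evaluate and improve their dialogue systems.
Academic problems addressed
HR-Multiwoz addresses the lack of high-quality conversation datasets in the HR domain by providing the first open-source, fully labeled one for academic research. It supports research on applying LLM agents in HR and offers a detailed reference for the data-generation process used to build and evaluate dialogue systems, which in turn supports the use of NLP techniques in HR management tasks.
Derived related work
The release of HR-Multiwoz encourages related research, including dialogue data generation, dialogue state tracking, and multi-turn dialogue management. Building on the dataset, researchers can develop more accurate dialogue models and explore applications of LLMs in specialized fields such as HR management.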
The content above was collected and summarized by 遇见数据集.