huawei-noah/CHARP

Name: huawei-noah/CHARP
Creator: huawei-noah
Published: 2025-07-24 11:25:25
License: No description available

Hugging Face. Updated: 2025-07-24. Indexed: 2024-06-12.

Download link:

https://hf-mirror.com/datasets/huawei-noah/CHARP

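Overview:

`CHARP` is a diagnostic test bed designed to evaluate whether information-seeking dialogue systems effectively attend to and use the conversation history. The dataset is built by modifying examples from the `FaithDial` validation set, which maximizes domain alignment with `FaithDial` and minimizes annotation cost. It contains two subsets, `eCHARP` and `hCHARP`, whose examples respectively do not require and do require reasoning over the conversation history. The dataset has 2,160 examples in total, 1,080 per subset. The data fields are `row_idx`, `history`, `knowledge`, and `response`.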

Provided by:

huawei-noah

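Summary of Original Information

Dataset Overview

CHARP is a diagnostic test bed that evaluates whether information-seeking dialogue systems effectively attend to and use the conversation history. It is built by modifying examples from the FaithDial validation set, ensuring maximal domain alignment with FaithDial while minimizing annotation cost. CHARP contains two subsets that differ only in the last seeker utterance: a self-contained easy version (eCHARP) and a hard version (hCHARP) that requires reasoning over the conversation history and the provided knowledge.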

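Data Splits

We created two versions of CHARP: hCHARP for examples that require reasoning over the conversation history, and eCHARP for examples that do not. We annotated 42% of the FaithDial validation set (after excluding examples without conversation history). CHARP contains 2,160 examples, split evenly between eCHARP and hCHARP:

eCHARP: 1,080 samples
hCHARP: 1,080 samples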

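Data Fields

eCHARP and hCHARP share the same data format:

row_idx: int. Sample index, identical to the index in the FaithDial validation set.
history: List[string]. The conversation history.
knowledge: string. The source knowledge on which the bot should ground its response.
response: string. The expected model response.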
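The field list above can be turned into a small sanity check. The following is a minimal sketch, not part of the dataset's tooling: `validate_record` and `REQUIRED_FIELDS` are hypothetical names, and the coercion of `row_idx` to `int` is an assumption made because the field is documented as `int` while the sample instances on this page show it as a string.

```python
from typing import Any, Dict

# Field names taken from the data-field list; the checker itself is illustrative.
REQUIRED_FIELDS = ("row_idx", "history", "knowledge", "response")


def validate_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Check one CHARP-style record and return a normalized copy."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    history = record["history"]
    if not isinstance(history, list) or not all(isinstance(u, str) for u in history):
        raise TypeError("history must be a list of strings")
    if not history:
        # Assumption: examples without conversation history were excluded.
        raise ValueError("history must contain at least one utterance")

    for name in ("knowledge", "response"):
        if not isinstance(record[name], str):
            raise TypeError(f"{name} must be a string")

    return {
        # Assumption: accept both 1293 and "1293" and store an int.
        "row_idx": int(record["row_idx"]),
        "history": list(history),
        "knowledge": record["knowledge"],
        "response": record["response"],
    }


if __name__ == "__main__":
    sample = {
        "row_idx": "1293",
        "history": ["Do you know any famous basketball courts?"],
        "knowledge": "The Philippine Arena is popular due to the size of the basketball court.",
        "response": "I heard that the Philippine Arena is popular because of its size.",
    }
    print(validate_record(sample)["row_idx"])
```

Normalizing `row_idx` up front keeps later joins against FaithDial indices from silently failing on a string-versus-integer mismatch.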

Data Instances

An example from eCHARP:

```json
{
  "row_idx": "1293",
  "history": [
    "I love watching and playing basketball.",
    "I see. Have you ever tried to describe basketball? I would say it is a low contact sport where the game is held in a rectangular court.",
    "Yeah I never though of that, can you repeat what you told me again so I can take notes?",
    "Yes I can, basketball is a sport with limited contact. It is held on a rectangular like court.",
    "What would you describe the sport is played like?",
    "The objective for basketball is shooting the ball into the hoops. The hoops are high and placed with a backboard on each side of the court.",
    "Oh yea, thats pretty simple. Do you know any famous basketball courts?"
  ],
  "knowledge": "Supreme Court in the USA is very famous to have well-known judges, while the Philippine Arena is popular due to the size of the basketball court.",
  "response": "Ah yeah, I heard that the Philippine Arena is popular because of the size of the basketball court."
}
```

An example from hCHARP:

```json
{
  "row_idx": "1293",
  "history": [
    "I love watching and playing basketball.",
    "I see. Have you ever tried to describe basketball? I would say it is a low contact sport where the game is held in a rectangular court.",
    "Yeah I never though of that, can you repeat what you told me again so I can take notes?",
    "Yes I can, basketball is a sport with limited contact. It is held on a rectangular like court.",
    "What would you describe the sport is played like?",
    "The objective for basketball is shooting the ball into the hoops. The hoops are high and placed with a backboard on each side of the court.",
    "Oh yea, thats pretty simple. Do you know any famous courts?"
  ],
  "knowledge": "Supreme Court in the USA is very famous to have well-known judges, while the Philippine Arena is popular due to the size of the basketball court.",
  "response": "Ah yeah, I heard that the Philippine Arena is popular because of the size of the basketball court."
}
```
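The two instances share `row_idx`, `knowledge`, and `response`, and differ only in the final seeker utterance ("famous basketball courts" versus "famous courts"). A minimal sketch of how that pairing property could be checked follows; `differing_turns` and `is_valid_pair` are hypothetical helper names, and the histories are shortened from the instances above purely for illustration.

```python
from typing import Any, Dict, List


def differing_turns(easy: Dict[str, Any], hard: Dict[str, Any]) -> List[int]:
    """Return the indices of history turns that differ between two paired records."""
    easy_hist, hard_hist = easy["history"], hard["history"]
    if len(easy_hist) != len(hard_hist):
        raise ValueError("paired records should have histories of equal length")
    return [i for i, (a, b) in enumerate(zip(easy_hist, hard_hist)) if a != b]


def is_valid_pair(easy: Dict[str, Any], hard: Dict[str, Any]) -> bool:
    """True when only the last seeker utterance differs and the rest matches."""
    same_target = (
        str(easy["row_idx"]) == str(hard["row_idx"])
        and easy["knowledge"] == hard["knowledge"]
        and easy["response"] == hard["response"]
    )
    last = len(easy["history"]) - 1
    return same_target and differing_turns(easy, hard) == [last]


if __name__ == "__main__":
    # Shortened copies of the instances shown above (illustration only).
    shared = {
        "row_idx": "1293",
        "knowledge": "Supreme Court in the USA is very famous to have well-known judges, "
        "while the Philippine Arena is popular due to the size of the basketball court.",
        "response": "Ah yeah, I heard that the Philippine Arena is popular because of "
        "the size of the basketball court.",
    }
    prefix = [
        "I love watching and playing basketball.",
        "The objective for basketball is shooting the ball into the hoops.",
    ]
    easy = dict(shared, history=prefix + ["Do you know any famous basketball courts?"])
    hard = dict(shared, history=prefix + ["Do you know any famous courts?"])
    print(differing_turns(easy, hard), is_valid_pair(easy, hard))
```

In the hard variant, "courts" is ambiguous against the knowledge snippet (Supreme Court versus basketball court), so the earlier turns are needed to pick the right reading.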

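Collected Summary

Dataset Introduction

Construction

CHARP is constructed by modifying the FaithDial validation set and is intended to evaluate whether information-seeking dialogue systems make effective use of the conversation history. By editing FaithDial examples so that the response depends on the conversation history, CHARP keeps maximal domain alignment with FaithDial while minimizing annotation cost. The dataset contains two subsets, eCHARP (easy) and hCHARP (hard), corresponding to scenarios that do not and do require reasoning over the conversation history.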

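Features

The distinguishing feature of CHARP is its two-tier structure, which targets dialogue generation at two difficulty levels. The eCHARP subset is designed to be self-contained and suits basic dialogue generation training, while the hCHARP subset requires the model to reason over the conversation history and the provided knowledge, making it more challenging. This design makes CHARP well suited to evaluating and improving dialogue systems.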

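Usage

CHARP applies to text generation and dialogue modeling tasks, with a particular focus on how dialogue systems handle conversation history. Users can load the eCHARP and hCHARP subsets to train and evaluate basic and advanced dialogue generation models, respectively. The dataset has a clear field structure (conversation history, knowledge source, and expected response), so it can be fed directly to a wide range of dialogue generation models.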
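As an illustration of feeding the fields to a generation model, the sketch below flattens one record into a text prompt. Everything about the template is an assumption made for this example: the function name `build_prompt`, the `Seeker`/`Bot` labels, and the layout are not specified by the dataset. The only property taken from the dataset description is that the last utterance in `history` belongs to the seeker, so speaker roles are assigned by alternating backwards from the final turn.

```python
from typing import List


def build_prompt(
    history: List[str],
    knowledge: str,
    seeker_label: str = "Seeker",  # hypothetical label
    bot_label: str = "Bot",  # hypothetical label
) -> str:
    """Flatten knowledge and dialogue history into a single prompt string."""
    lines = [f"Knowledge: {knowledge}", "Dialogue:"]
    n = len(history)
    for i, utterance in enumerate(history):
        # The final turn is the seeker's; roles alternate backwards from there.
        speaker = seeker_label if (n - 1 - i) % 2 == 0 else bot_label
        lines.append(f"{speaker}: {utterance}")
    # Leave an open bot turn for the model to complete.
    lines.append(f"{bot_label}:")
    return "\n".join(lines)


if __name__ == "__main__":
    history = [
        "I love watching and playing basketball.",
        "The objective for basketball is shooting the ball into the hoops.",
        "Oh yea, thats pretty simple. Do you know any famous courts?",
    ]
    knowledge = "The Philippine Arena is popular due to the size of the basketball court."

    # Full-history prompt.
    print(build_prompt(history, knowledge))
    print("---")
    # History-ablated prompt: only the last seeker utterance is kept.
    print(build_prompt(history[-1:], knowledge))
```

Comparing a model's outputs on the full and ablated prompts is one simple way to probe whether it relies on the conversation history, which is the behavior CHARP is meant to diagnose.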

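Background and Challenges

Background

In the field of information-seeking dialogue systems, `CHARP` serves as a diagnostic test bed for evaluating whether a system makes effective use of the conversation history. The dataset was created by researchers at Huawei Noah's Ark Lab; the main contributors include Abbas Ghaddar, David Alfonso-Hermelo, Philippe Langlais, Mehdi Rezagholizadeh, Boxing Chen, and Prasanna Parthasarathi. `CHARP` is built by modifying the [FaithDial](https://huggingface.co/datasets/McGill-NLP/FaithDial) validation set, which keeps it closely aligned with the FaithDial domain while reducing annotation cost. Its core research question concerns how much dialogue systems depend on historical information and how efficiently they use it, which bears directly on the naturalness and coherence of their responses.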

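Current Challenges

Building `CHARP` involved several challenges. First, keeping the dataset closely aligned with the FaithDial domain while also reducing annotation cost is a complex task. Second, the dataset is divided into two subsets, `eCHARP` and `hCHARP`, representing difficulty levels that do not and do require reasoning over the conversation history; this demands a high degree of accuracy and consistency during annotation and processing. Third, measuring how much a dialogue system depends on historical information requires well-designed evaluation metrics and methods so that test results are reliable and valid. Together, these challenges define the open problems `CHARP` raises for dialogue systems research.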

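Common Use Cases

Typical Use Cases

In the field of information-seeking dialogue systems, CHARP is used to evaluate whether a system makes effective use of the conversation history. The dataset modifies examples from the FaithDial validation set, which keeps it closely aligned with the FaithDial domain and reduces annotation cost. CHARP contains two subsets, eCHARP (easy) and hCHARP (hard), which test system performance when reasoning over the conversation history is not required and when it is required, respectively.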

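Research Problems Addressed

CHARP addresses a common research question in dialogue systems: whether a system correctly understands and uses the conversation history. By providing dialogue examples at different difficulty levels, CHARP helps researchers evaluate and improve how dialogue systems handle complex conversations, thereby advancing knowledge-grounded dialogue systems.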

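Derived and Related Work

Building on CHARP, researchers have developed various methods for improving dialogue systems, including models with stronger conversation-history understanding and reasoning abilities. CHARP has also inspired the creation of other related datasets, such as evaluation datasets targeting conversation history in specific domains, further advancing research on dialogue systems.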

The content above was collected and summarized by 遇见数据集.
