
cmalaviya/expertqa

Source: Hugging Face. Updated 2023-10-07. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/cmalaviya/expertqa
Resource description:
---
configs:
- config_name: main
  data_files: r2_compiled_anon_fixed.jsonl
- config_name: lfqa_random
  data_files:
  - split: train
    path: rand_lfqa_train.json
  - split: test
    path: rand_lfqa_test.json
  - split: validation
    path: rand_lfqa_val.json
- config_name: lfqa_domain
  data_files:
  - split: train
    path: domain_lfqa_train.json
  - split: test
    path: domain_lfqa_test.json
  - split: validation
    path: domain_lfqa_val.json
license: mit
task_categories:
- question-answering
language:
- en
source_datasets:
- original
pretty_name: ExpertQA
annotations_creators:
- expert-generated
size_categories:
- 1K<n<10K
---

# Dataset Card for ExpertQA

## Dataset Description

- **Repository:** https://github.com/chaitanyamalaviya/ExpertQA
- **Paper:** https://arxiv.org/pdf/2309.07852
- **Point of Contact:** chaitanyamalaviya@gmail.com

### Dataset Summary

This dataset accompanies the paper [ExpertQA: Expert-Curated Questions and Attributed Answers](https://arxiv.org/pdf/2309.07852). ExpertQA contains 2177 examples from 32 different fields.

### Supported Tasks

The `main` data contains 2177 examples that can be used to evaluate new methods for estimating factuality and attribution, while the `lfqa_domain` and `lfqa_random` data can be used to evaluate long-form question answering systems.

## Dataset Creation

### Curation Rationale

ExpertQA was created to evaluate factuality and attribution in language model responses to domain-specific questions, and to evaluate long-form question answering in domain-specific settings.

### Annotation Process

Questions in ExpertQA were formulated by experts spanning 32 fields. The answers are model-generated and then verified by experts. Each claim-evidence pair in an answer is judged by experts for properties such as the claim’s informativeness, factuality, and citeworthiness, whether the claim is supported by the evidence, and the reliability of the evidence source. Further, experts revise the original claims to ensure they are factual and supported by trustworthy sources.

## Dataset Structure

### Data Instances

We provide the main data, with judgements of factuality and attribution, under the `main` subset. The long-form QA data splits are provided at `lfqa_domain` (domain split) and `lfqa_random` (random split). Additional files are provided in our [GitHub repo](https://github.com/chaitanyamalaviya/ExpertQA).

### Data Fields

The main data file contains newline-separated JSON dictionaries with the following fields:

* `question` - Question written by an expert.
* `annotator_id` - Anonymized annotator ID of the author of the question.
* `answers` - Dict mapping model names to an Answer object. The model names can be one of `{gpt4, bing_chat, rr_sphere_gpt4, rr_gs_gpt4, post_hoc_sphere_gpt4, post_hoc_gs_gpt4}`.
* `metadata` - A dictionary with the following fields:
    * `question_type` - The question type(s) separated by "|".
    * `field` - The field to which the annotator belonged.
    * `specific_field` - More specific field name within the broader field.

Each Answer object contains the following fields:

* `answer_string`: The answer string.
* `attribution`: List of evidence for the answer (not linked to specific claims). Note that these are only URLs; the evidence passages are stored in the Claim object (see below).
* `claims`: List of Claim objects for the answer.
* `revised_answer_string`: Revised answer by annotator.
* `usefulness`: Usefulness of original answer marked by annotator.
* `annotation_time`: Time taken for annotating this answer.
* `annotator_id`: Anonymized annotator ID of the person who validated this answer.

Each Claim object contains the following fields:

* `claim_string`: Original claim string.
* `evidence`: List of evidence for the claim (URL+passage or URL).
* `support`: Attribution marked by annotator.
* `reason_missing_support`: Reason for missing support specified by annotator.
* `informativeness`: Informativeness of claim for the question, marked by annotator.
* `worthiness`: Worthiness of citing claim marked by annotator.
* `correctness`: Factual correctness of claim marked by annotator.
* `reliability`: Reliability of source evidence marked by annotator.
* `revised_claim`: Revised claim by annotator.
* `revised_evidence`: Revised evidence by annotator.

### Citation Information

```
@inproceedings{malaviya23expertqa,
  title = {ExpertQA: Expert-Curated Questions and Attributed Answers},
  author = {Chaitanya Malaviya and Subin Lee and Sihao Chen and Elizabeth Sieber and Mark Yatskar and Dan Roth},
  booktitle = {arXiv},
  month = {September},
  year = {2023},
  url = "https://arxiv.org/abs/2309.07852"
}
```
Provided by:
cmalaviya
Summary of original information

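Dataset Card for ExpertQA

Dataset Description

Dataset Summary

The ExpertQA dataset contains 2177 examples from 32 different fields and accompanies the paper ExpertQA: Expert-Curated Questions and Attributed Answers.

Supported Tasks

  • The main data contains 2177 examples for evaluating new methods for estimating factuality and attribution.
  • The lfqa_domain and lfqa_random data are for evaluating long-form question answering systems in domain-specific settings.

Dataset Creation

Curation Rationale

ExpertQA was created to evaluate factuality and attribution in language model responses to domain-specific questions, and to evaluate long-form question answering in domain-specific settings.

Annotation Process

Questions in ExpertQA were formulated by experts from 32 fields. The answers are model-generated and verified by experts. Each claim-evidence pair is judged by experts for its informativeness, factuality, and citeworthiness, whether the claim is supported by the evidence, and the reliability of the evidence source. In addition, experts revise the original claims so that they are factual and supported by trustworthy sources.

Dataset Structure

Data Instances

  • The main subset provides the main data, with judgements of factuality and attribution.
  • lfqa_domain (domain split) and lfqa_random (random split) provide the long-form QA data splits.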
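
The subsets listed above could be loaded roughly as follows. This is a minimal sketch, not the authors' own loading code: the config names and splits are copied from the YAML header of the card, the helper name `load_expertqa` is hypothetical, and it assumes the Hugging Face `datasets` library is installed and the Hub is reachable.

```python
# Config names and splits copied from the YAML header of the dataset card.
# Assumption: a config with a single data file and no explicit split
# (here `main`) is exposed by the `datasets` library as a "train" split.
CONFIG_SPLITS = {
    "main": ["train"],
    "lfqa_random": ["train", "test", "validation"],
    "lfqa_domain": ["train", "test", "validation"],
}


def load_expertqa(config_name="main"):
    """Load one ExpertQA config from the Hugging Face Hub (needs network access)."""
    if config_name not in CONFIG_SPLITS:
        raise ValueError(
            f"unknown config {config_name!r}; expected one of {sorted(CONFIG_SPLITS)}"
        )
    # Imported lazily so the config table above is usable without the library.
    from datasets import load_dataset

    return load_dataset("cmalaviya/expertqa", config_name)
```

Under these assumptions, `load_expertqa("lfqa_domain")["validation"]` would return the validation split of the domain-based long-form QA data.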

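Data Fields

The main data file contains the following fields:

  • question: question written by an expert.
  • annotator_id: anonymized annotator ID of the author of the question.
  • answers: dict mapping model names to Answer objects. The model name can be one of {gpt4, bing_chat, rr_sphere_gpt4, rr_gs_gpt4, post_hoc_sphere_gpt4, post_hoc_gs_gpt4}.
  • metadata: a dictionary with the following fields:
    • question_type: the question type(s), separated by "|".
    • field: the field to which the annotator belonged.
    • specific_field: more specific field name within the broader field.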
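
As an illustration of the top-level layout, the sketch below parses one JSON record and splits its "|"-separated question types. The sample record is hypothetical: only the field names come from the card, and every value is invented for the example. The helper names `parse_record` and `question_types` are likewise illustrative.

```python
import json

# Hypothetical record for illustration only; the values are invented, and only
# the field names come from the card.
SAMPLE_LINE = json.dumps({
    "question": "What is a placeholder question?",
    "annotator_id": "annotator_0",
    "answers": {},
    "metadata": {
        "question_type": "type_a|type_b",
        "field": "Example field",
        "specific_field": "Example subfield",
    },
})


def parse_record(line):
    """Parse one line of the main data file into a dict."""
    return json.loads(line)


def question_types(record):
    """Split the "|"-separated question_type string into a list of types."""
    raw = record["metadata"]["question_type"]
    return [part.strip() for part in raw.split("|") if part.strip()]


record = parse_record(SAMPLE_LINE)
print(record["metadata"]["field"], question_types(record))
```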

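Each Answer object contains the following fields:

  • answer_string: the answer string.
  • attribution: list of evidence for the answer (not linked to specific claims).
  • claims: list of Claim objects for the answer.
  • revised_answer_string: the answer as revised by the annotator.
  • usefulness: usefulness of the original answer, marked by the annotator.
  • annotation_time: time taken to annotate this answer.
  • annotator_id: anonymized annotator ID of the person who validated this answer.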
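
One use of the Answer fields is to find which systems' answers the annotators changed. The sketch below is an assumption-laden example: the card does not say how an unrevised answer is stored, so the code treats a missing, empty, or identical `revised_answer_string` as "not revised". The helper names and the sample values are hypothetical.

```python
def was_revised(answer):
    """Return True if the revised answer is non-empty and differs from the original.

    Assumption: an unrevised answer either repeats the original text or leaves
    revised_answer_string empty or null; the card does not specify this.
    """
    original = (answer.get("answer_string") or "").strip()
    revised = (answer.get("revised_answer_string") or "").strip()
    return bool(revised) and revised != original


def revised_models(record):
    """List the model names whose answers were revised, in sorted order."""
    return sorted(
        model for model, answer in record["answers"].items() if was_revised(answer)
    )


# Hypothetical record; the texts are invented for the example.
sample = {
    "answers": {
        "gpt4": {"answer_string": "Old text.", "revised_answer_string": "New text."},
        "bing_chat": {"answer_string": "Same.", "revised_answer_string": "Same."},
        "rr_gs_gpt4": {"answer_string": "Kept.", "revised_answer_string": None},
    }
}
print(revised_models(sample))
```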

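Each Claim object contains the following fields:

  • claim_string: the original claim string.
  • evidence: list of evidence for the claim (URL+passage or URL).
  • support: attribution marked by the annotator.
  • reason_missing_support: reason for missing support, specified by the annotator.
  • informativeness: informativeness of the claim for the question, marked by the annotator.
  • worthiness: worthiness of citing the claim, marked by the annotator.
  • correctness: factual correctness of the claim, marked by the annotator.
  • reliability: reliability of the source evidence, marked by the annotator.
  • revised_claim: the claim as revised by the annotator.
  • revised_evidence: the evidence as revised by the annotator.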
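
The nesting described above (record, then `answers` by model name, then `claims`) can be walked as in the following sketch, which tallies one claim-level label per model. The card does not list the label vocabularies, so the label values in the sample (`label_a`, `label_b`) are placeholders, and the code assumes each label is a single hashable value such as a string. The helper names are hypothetical.

```python
from collections import Counter


def iter_claims(record):
    """Yield (model_name, claim) pairs for every claim in every answer."""
    for model_name, answer in record.get("answers", {}).items():
        for claim in answer.get("claims", []):
            yield model_name, claim


def label_counts(records, label="support"):
    """Count (model_name, label value) pairs across records.

    Assumption: each label is a single hashable value; missing labels count as None.
    """
    counts = Counter()
    for record in records:
        for model_name, claim in iter_claims(record):
            counts[(model_name, claim.get(label))] += 1
    return counts


# Hypothetical records; claim texts and label values are placeholders.
sample_records = [
    {
        "answers": {
            "gpt4": {
                "claims": [
                    {"claim_string": "Claim 1.", "support": "label_a"},
                    {"claim_string": "Claim 2.", "support": "label_b"},
                ]
            },
            "bing_chat": {
                "claims": [{"claim_string": "Claim 3.", "support": "label_a"}]
            },
        }
    }
]
print(label_counts(sample_records))
```

Passing `label="correctness"` or `label="reliability"` would tally those judgements instead, under the same assumptions.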

Citation Information

@inproceedings{malaviya23expertqa,
  title = {ExpertQA: Expert-Curated Questions and Attributed Answers},
  author = {Chaitanya Malaviya and Subin Lee and Sihao Chen and Elizabeth Sieber and Mark Yatskar and Dan Roth},
  booktitle = {arXiv},
  month = {September},
  year = {2023},
  url = "https://arxiv.org/abs/2309.07852"
}
