
hover-nlp/hover

Hugging Face | Updated 2024-01-18 | Indexed 2024-06-15
Download link:
https://hf-mirror.com/datasets/hover-nlp/hover
Resource description:
---
annotations_creators:
- expert-generated
language_creators:
- expert-generated
- found
language:
- en
license:
- cc-by-sa-4.0
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- original
task_categories:
- text-retrieval
task_ids:
- fact-checking-retrieval
paperswithcode_id: hover
pretty_name: HoVer
dataset_info:
  features:
  - name: id
    dtype: int32
  - name: uid
    dtype: string
  - name: claim
    dtype: string
  - name: supporting_facts
    list:
    - name: key
      dtype: string
    - name: value
      dtype: int32
  - name: label
    dtype:
      class_label:
        names:
          '0': NOT_SUPPORTED
          '1': SUPPORTED
  - name: num_hops
    dtype: int32
  - name: hpqa_id
    dtype: string
  splits:
  - name: train
    num_bytes: 5532178
    num_examples: 18171
  - name: validation
    num_bytes: 1299252
    num_examples: 4000
  - name: test
    num_bytes: 927513
    num_examples: 4000
  download_size: 12257835
  dataset_size: 7758943
---

# Dataset Card for HoVer

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** https://hover-nlp.github.io/
- **Repository:** https://github.com/hover-nlp/hover
- **Paper:** https://arxiv.org/abs/2011.03088
- **Leaderboard:** https://hover-nlp.github.io/
- **Point of Contact:** [More Information Needed]

### Dataset Summary

[More Information Needed]

### Supported Tasks and Leaderboards

[More Information Needed]

### Languages

[More Information Needed]

## Dataset Structure

### Data Instances

A sample from the training set is provided below:

```
{'id': 14856,
 'uid': 'a0cf45ea-b5cd-4c4e-9ffa-73b39ebd78ce',
 'claim': 'The park at which Tivolis Koncertsal is located opened on 15 August 1843.',
 'supporting_facts': [{'key': 'Tivolis Koncertsal', 'value': 0},
                      {'key': 'Tivoli Gardens', 'value': 1}],
 'label': 'SUPPORTED',
 'num_hops': 2,
 'hpqa_id': '5abca1a55542993a06baf937'}
```

Please note that in the test set only `id`, `uid`, and `claim` are available. Labels are not available in the test set and are represented by -1.

### Data Fields

[More Information Needed]

### Data Splits

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

[More Information Needed]

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

[More Information Needed]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[More Information Needed]

### Citation Information

[More Information Needed]

### Contributions

Thanks to [@abhishekkrthakur](https://github.com/abhishekkrthakur) for adding this dataset.
Provided by:
hover-nlp
Summary of original information

Dataset description

  • annotations_creators:
    • expert-generated
  • language_creators:
    • expert-generated
    • found
  • language:
    • en
  • license:
    • cc-by-sa-4.0
  • multilinguality:
    • monolingual
  • size_categories:
    • 10K<n<100K
  • source_datasets:
    • original
  • task_categories:
    • text-retrieval
  • task_ids:
    • fact-checking-retrieval
  • paperswithcode_id:
    • hover
  • pretty_name:
    • HoVer

Dataset structure

Features

  • id:
    • dtype: int32
  • uid:
    • dtype: string
  • claim:
    • dtype: string
  • supporting_facts:
    • list:
      • key:
        • dtype: string
      • value:
        • dtype: int32
  • label:
    • dtype:
      • class_label:
        • names:
          • 0: NOT_SUPPORTED
          • 1: SUPPORTED
  • num_hops:
    • dtype: int32
  • hpqa_id:
    • dtype: string
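
The feature list above can be mirrored as plain Python types, which is handy for validating records before further processing. This is a minimal sketch using only the standard library; the class names `SupportingFact` and `HoverRecord` and the helper `record_from_dict` are hypothetical and not part of the dataset or any library. It accepts the label either as a class index or as one of the two class names, since the sample instance shows the name.

```python
from dataclasses import dataclass
from typing import List

# Class names in index order, as listed in the feature schema.
LABEL_NAMES = ["NOT_SUPPORTED", "SUPPORTED"]


@dataclass
class SupportingFact:
    key: str    # dtype: string
    value: int  # dtype: int32


@dataclass
class HoverRecord:
    id: int                                 # dtype: int32
    uid: str                                # dtype: string
    claim: str                              # dtype: string
    supporting_facts: List[SupportingFact]  # list of key/value pairs
    label: int                              # class_label index
    num_hops: int                           # dtype: int32
    hpqa_id: str                            # dtype: string


def record_from_dict(raw: dict) -> HoverRecord:
    """Build a HoverRecord from a plain dict (hypothetical helper)."""
    label = raw["label"]
    if isinstance(label, str):
        # Map a class name such as 'SUPPORTED' to its index.
        label = LABEL_NAMES.index(label)
    facts = [SupportingFact(f["key"], f["value"]) for f in raw["supporting_facts"]]
    return HoverRecord(
        id=raw["id"],
        uid=raw["uid"],
        claim=raw["claim"],
        supporting_facts=facts,
        label=label,
        num_hops=raw["num_hops"],
        hpqa_id=raw["hpqa_id"],
    )
```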

Data splits

  • train:
    • num_bytes: 5532178
    • num_examples: 18171
  • validation:
    • num_bytes: 1299252
    • num_examples: 4000
  • test:
    • num_bytes: 927513
    • num_examples: 4000
  • download_size:
    • 12257835
  • dataset_size:
    • 7758943
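
As a quick sanity check on the figures above, the per-split `num_bytes` values can be summed and compared with `dataset_size`, and each split's share of the examples can be computed. The sketch below only restates the numbers listed above; the variable names are illustrative.

```python
# Split statistics copied from the listing above.
SPLITS = {
    "train": {"num_bytes": 5532178, "num_examples": 18171},
    "validation": {"num_bytes": 1299252, "num_examples": 4000},
    "test": {"num_bytes": 927513, "num_examples": 4000},
}
DATASET_SIZE = 7758943

total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
total_examples = sum(split["num_examples"] for split in SPLITS.values())

# Share of all examples that falls into each split.
fractions = {
    name: split["num_examples"] / total_examples for name, split in SPLITS.items()
}

print(f"total bytes: {total_bytes} (matches dataset_size: {total_bytes == DATASET_SIZE})")
print(f"total examples: {total_examples}")
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The three `num_bytes` values add up to exactly 7758943, the listed `dataset_size`, and the splits contain 26171 examples in total.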

Data instance

{'id': 14856, 'uid': 'a0cf45ea-b5cd-4c4e-9ffa-73b39ebd78ce', 'claim': 'The park at which Tivolis Koncertsal is located opened on 15 August 1843.', 'supporting_facts': [{'key': 'Tivolis Koncertsal', 'value': 0}, {'key': 'Tivoli Gardens', 'value': 1}], 'label': 'SUPPORTED', 'num_hops': 2, 'hpqa_id': '5abca1a55542993a06baf937'}
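
To work with the `supporting_facts` list of an instance like the one above, it can help to group the `value` entries by `key`. The sketch below assumes, without confirmation from this page, that `key` is a document title and `value` is a sentence index within that document; the function name `group_supporting_facts` is hypothetical.

```python
from collections import defaultdict

# The sample instance shown above, written as a Python dict.
instance = {
    "id": 14856,
    "uid": "a0cf45ea-b5cd-4c4e-9ffa-73b39ebd78ce",
    "claim": "The park at which Tivolis Koncertsal is located opened on 15 August 1843.",
    "supporting_facts": [
        {"key": "Tivolis Koncertsal", "value": 0},
        {"key": "Tivoli Gardens", "value": 1},
    ],
    "label": "SUPPORTED",
    "num_hops": 2,
    "hpqa_id": "5abca1a55542993a06baf937",
}


def group_supporting_facts(facts):
    """Group fact values by key (assumed: title -> sentence indices)."""
    grouped = defaultdict(list)
    for fact in facts:
        grouped[fact["key"]].append(fact["value"])
    return dict(grouped)


grouped = group_supporting_facts(instance["supporting_facts"])
for title, indices in grouped.items():
    print(f"{title}: {indices}")
```

For this particular instance the number of distinct keys equals `num_hops`; whether that holds for every record is not stated here and should be checked against the data.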

Note that the test set contains only id, uid, and claim. Labels are not available in the test set and are represented by -1.
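
Because unlabeled test records carry -1, decoding a label by indexing directly into the list of class names is risky in Python: index -1 silently returns the last name, `SUPPORTED`. The following is a minimal sketch of a safer decoder; `decode_label` and `labeled_only` are hypothetical helper names, and records are assumed to be plain dicts with an integer `label` field.

```python
from typing import Iterable, List, Optional

LABEL_NAMES = ("NOT_SUPPORTED", "SUPPORTED")
UNLABELED = -1  # value used for test-set records without a label


def decode_label(label_id: int) -> Optional[str]:
    """Return the class name, or None for an unlabeled record."""
    if label_id == UNLABELED:
        return None
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"unexpected label id: {label_id}")
    return LABEL_NAMES[label_id]


def labeled_only(records: Iterable[dict]) -> List[dict]:
    """Keep only the records that carry a real label."""
    return [record for record in records if record["label"] != UNLABELED]
```

For example, `decode_label(1)` yields `'SUPPORTED'`, while `decode_label(-1)` yields `None` instead of a misleading class name.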
