
allenai/scifact_entailment

Source: Hugging Face. Updated 2023-09-27. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/allenai/scifact_entailment
Resource description:
---
annotations_creators:
- expert-generated
language:
- en
language_creators:
- found
license:
- cc-by-nc-2.0
multilinguality:
- monolingual
pretty_name: SciFact
size_categories:
- 1K<n<10K
source_datasets:
- original
task_categories:
- text-classification
task_ids:
- fact-checking
paperswithcode_id: scifact
dataset_info:
  features:
  - name: claim_id
    dtype: int32
  - name: claim
    dtype: string
  - name: abstract_id
    dtype: int32
  - name: title
    dtype: string
  - name: abstract
    sequence: string
  - name: verdict
    dtype: string
  - name: evidence
    sequence: int32
  splits:
  - name: train
    num_bytes: 1649655
    num_examples: 919
  - name: validation
    num_bytes: 605262
    num_examples: 340
  download_size: 3115079
  dataset_size: 2254917
---

# Dataset Card for "scifact_entailment"

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
- [Dataset Structure](#dataset-structure)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)

## Dataset Description

- **Homepage:** [https://scifact.apps.allenai.org/](https://scifact.apps.allenai.org/)
- **Repository:** <https://github.com/allenai/scifact>
- **Paper:** [Fact or Fiction: Verifying Scientific Claims](https://aclanthology.org/2020.emnlp-main.609/)
- **Point of Contact:** [David Wadden](mailto:davidw@allenai.org)

### Dataset Summary

SciFact is a dataset of 1.4K expert-written scientific claims paired with evidence-containing abstracts, and annotated with labels and rationales.

For more information on the dataset, see [allenai/scifact](https://huggingface.co/datasets/allenai/scifact). This dataset has the same data, but reformatted as an entailment task. A single instance includes a claim paired with a paper title and abstract, together with an entailment label and a list of evidence sentences (if any).

## Dataset Structure

### Data Fields

- `claim_id`: An `int32` claim identifier.
- `claim`: A `string`.
- `abstract_id`: An `int32` abstract identifier.
- `title`: A `string`.
- `abstract`: A list of `string`s, one for each sentence in the abstract.
- `verdict`: The fact-checking verdict, a `string`.
- `evidence`: A list of `int32` indices of the sentences in `abstract` which provide evidence for the verdict.

### Data Splits

|      |train|validation|
|------|----:|---------:|
|claims|  919|       340|
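As a minimal sketch of how the `abstract` and `evidence` fields fit together, the snippet below maps evidence indices back to abstract sentences. The record is hand-written and hypothetical, not taken from the dataset. The 0-based indexing and the `"SUPPORT"` verdict string are assumptions, and `Instance` and `evidence_sentences` are illustrative names rather than part of any library.

```python
from typing import List, TypedDict


class Instance(TypedDict):
    """One claim/abstract pair, mirroring the fields listed in the card."""
    claim_id: int
    claim: str
    abstract_id: int
    title: str
    abstract: List[str]   # one string per abstract sentence
    verdict: str
    evidence: List[int]   # assumed: 0-based indices into `abstract`


def evidence_sentences(instance: Instance) -> List[str]:
    """Return the abstract sentences referenced by the evidence indices."""
    abstract = instance["abstract"]
    return [abstract[i] for i in instance["evidence"]]


# Hypothetical record, written by hand for illustration only.
example: Instance = {
    "claim_id": 1,
    "claim": "Drug X lowers blood pressure.",
    "abstract_id": 100,
    "title": "A trial of drug X",
    "abstract": [
        "We studied drug X in 50 adults.",
        "Blood pressure fell in the treated group.",
        "No serious side effects were seen.",
    ],
    "verdict": "SUPPORT",  # placeholder label; the card does not list the label set
    "evidence": [1],
}

print(evidence_sentences(example))
```

An instance with an empty `evidence` list yields an empty result, which matches the card's note that evidence sentences are present only "if any".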
Provided by:
allenai
Summary of original information

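Dataset overview

Dataset summary

SciFact is a dataset of 1.4K expert-written scientific claims paired with evidence-containing abstracts and annotated with labels and rationales. This version is reformatted as an entailment task: each instance pairs a claim with a paper title and abstract, together with an entailment label and a list of evidence sentences (if any).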

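Dataset structure

Data fields

  • claim_id: claim identifier, type int32
  • claim: claim text, type string
  • abstract_id: abstract identifier, type int32
  • title: paper title, type string
  • abstract: list of the sentences in the abstract, each a string
  • verdict: fact-checking verdict, type string
  • evidence: the sentences from the abstract that provide evidence for the verdict, stored as a list of int32 values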
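The field list above can be turned into a lightweight type check. The following is a sketch only: `schema_problems`, `FIELD_CHECKS`, and the sample records are hypothetical names and data introduced for illustration, and the check covers types only, not the set of valid verdict strings.

```python
from typing import Any, Callable, Dict, List

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def is_int32(value: Any) -> bool:
    """True for plain ints (not bools) that fit in a signed 32-bit range."""
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and INT32_MIN <= value <= INT32_MAX
    )


def is_str(value: Any) -> bool:
    return isinstance(value, str)


def is_str_list(value: Any) -> bool:
    return isinstance(value, list) and all(isinstance(s, str) for s in value)


def is_int32_list(value: Any) -> bool:
    return isinstance(value, list) and all(is_int32(i) for i in value)


# Field name -> type check, following the field list above.
FIELD_CHECKS: Dict[str, Callable[[Any], bool]] = {
    "claim_id": is_int32,
    "claim": is_str,
    "abstract_id": is_int32,
    "title": is_str,
    "abstract": is_str_list,
    "verdict": is_str,
    "evidence": is_int32_list,
}


def schema_problems(record: Dict[str, Any]) -> List[str]:
    """Return the names of fields that are missing or have the wrong type."""
    return [
        name
        for name, check in FIELD_CHECKS.items()
        if name not in record or not check(record[name])
    ]


# Hypothetical records, written by hand for illustration only.
good = {
    "claim_id": 7,
    "claim": "Example claim.",
    "abstract_id": 42,
    "title": "Example title",
    "abstract": ["First sentence.", "Second sentence."],
    "verdict": "SUPPORT",  # placeholder label, an assumption
    "evidence": [0],
}
bad = {**good, "claim_id": "7"}  # wrong type for claim_id
del bad["evidence"]              # missing field

print(schema_problems(good))
print(schema_problems(bad))
```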

Data splits

|      |train|validation|
|------|----:|---------:|
|claims|  919|       340|
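For a quick sense of the split proportions, the arithmetic can be sketched as follows. The counts are copied from the table above, and the variable names are illustrative.

```python
# Counts copied from the split table above.
split_sizes = {"train": 919, "validation": 340}

total = sum(split_sizes.values())
shares = {name: count / total for name, count in split_sizes.items()}

for name, count in split_sizes.items():
    print(f"{name}: {count} ({shares[name]:.1%} of {total})")
```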