asdf6675/tab_fact

Name: asdf6675/tab_fact
Creator: asdf6675
Published: 2026-03-10 02:25:43
License: 暂无描述

Hugging Face2026-03-10 更新2026-03-29 收录

下载链接：

https://hf-mirror.com/datasets/asdf6675/tab_fact

下载链接

链接失效反馈

官方服务：

资源简介：

--- annotations_creators: - crowdsourced language_creators: - crowdsourced language: - en license: - cc-by-4.0 multilinguality: - monolingual size_categories: - 100K<n<1M source_datasets: - original task_categories: - text-classification task_ids: - fact-checking paperswithcode_id: tabfact pretty_name: TabFact dataset_info: - config_name: tab_fact features: - name: id dtype: int32 - name: table_id dtype: string - name: table_text dtype: string - name: table_caption dtype: string - name: statement dtype: string - name: label dtype: class_label: names: '0': refuted '1': entailed splits: - name: train num_bytes: 99852664 num_examples: 92283 - name: validation num_bytes: 13846872 num_examples: 12792 - name: test num_bytes: 13493391 num_examples: 12779 download_size: 196508436 dataset_size: 127192927 - config_name: blind_test features: - name: id dtype: int32 - name: table_id dtype: string - name: table_text dtype: string - name: table_caption dtype: string - name: statement dtype: string - name: test_id dtype: string splits: - name: test num_bytes: 10954442 num_examples: 9750 download_size: 196508436 dataset_size: 10954442 --- # Dataset Card for TabFact ## Table of Contents - [Dataset Description](#dataset-description) - [Dataset Summary](#dataset-summary) - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards) - [Languages](#languages) - [Dataset Structure](#dataset-structure) - [Data Instances](#data-instances) - [Data Fields](#data-fields) - [Data Splits](#data-splits) - [Dataset Creation](#dataset-creation) - [Curation Rationale](#curation-rationale) - [Source Data](#source-data) - [Annotations](#annotations) - [Personal and Sensitive Information](#personal-and-sensitive-information) - [Considerations for Using the Data](#considerations-for-using-the-data) - [Social Impact of Dataset](#social-impact-of-dataset) - [Discussion of Biases](#discussion-of-biases) - [Other Known Limitations](#other-known-limitations) - [Additional Information](#additional-information) - [Dataset Curators](#dataset-curators) - [Licensing Information](#licensing-information) - [Citation Information](#citation-information) - [Contributions](#contributions) ## Dataset Description - **Homepage:** [TabFact](https://tabfact.github.io/index.html) - **Repository:** [GitHub](https://github.com/wenhuchen/Table-Fact-Checking) - **Paper:** [TabFact: A Large-scale Dataset for Table-based Fact Verification](https://arxiv.org/abs/1909.02164) - **Leaderboard:** [Leaderboard](https://competitions.codalab.org/competitions/21611) - **Point of Contact:** [Wenhu Chen](wenhuchen@cs.ucsb.edu) ### Dataset Summary The problem of verifying whether a textual hypothesis holds the truth based on the given evidence, also known as fact verification, plays an important role in the study of natural language understanding and semantic representation. However, existing studies are restricted to dealing with unstructured textual evidence (e.g., sentences and passages, a pool of passages), while verification using structured forms of evidence, such as tables, graphs, and databases, remains unexplored. TABFACT is large scale dataset with 16k Wikipedia tables as evidence for 118k human annotated statements designed for fact verification with semi-structured evidence. The statements are labeled as either ENTAILED or REFUTED. TABFACT is challenging since it involves both soft linguistic reasoning and hard symbolic reasoning. ### Supported Tasks and Leaderboards [More Information Needed] ### Languages [More Information Needed] ## Dataset Structure ### Data Instances [More Information Needed] ### Data Fields [More Information Needed] ### Data Splits [More Information Needed] ## Dataset Creation ### Curation Rationale [More Information Needed] ### Source Data [More Information Needed] #### Initial Data Collection and Normalization [More Information Needed] #### Who are the source language producers? [More Information Needed] ### Annotations [More Information Needed] #### Annotation process [More Information Needed] #### Who are the annotators? [More Information Needed] ### Personal and Sensitive Information [More Information Needed] ## Considerations for Using the Data ### Social Impact of Dataset [More Information Needed] ### Discussion of Biases [More Information Needed] ### Other Known Limitations [More Information Needed] ## Additional Information ### Dataset Curators [More Information Needed] ### Licensing Information [More Information Needed] ### Citation Information ``` @inproceedings{2019TabFactA, title={TabFact : A Large-scale Dataset for Table-based Fact Verification}, author={Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou and William Yang Wang}, booktitle = {International Conference on Learning Representations (ICLR)}, address = {Addis Ababa, Ethiopia}, month = {April}, year = {2020} } ``` ### Contributions Thanks to [@patil-suraj](https://github.com/patil-suraj) for adding this dataset.

提供机构：

asdf6675

5,000+

优质数据集

54 个

任务类型

进入经典数据集