five

asdf6675/tab_fact

收藏
Hugging Face2026-03-10 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/asdf6675/tab_fact
下载链接
链接失效反馈
官方服务:
资源简介:
--- annotations_creators: - crowdsourced language_creators: - crowdsourced language: - en license: - cc-by-4.0 multilinguality: - monolingual size_categories: - 100K<n<1M source_datasets: - original task_categories: - text-classification task_ids: - fact-checking paperswithcode_id: tabfact pretty_name: TabFact dataset_info: - config_name: tab_fact features: - name: id dtype: int32 - name: table_id dtype: string - name: table_text dtype: string - name: table_caption dtype: string - name: statement dtype: string - name: label dtype: class_label: names: '0': refuted '1': entailed splits: - name: train num_bytes: 99852664 num_examples: 92283 - name: validation num_bytes: 13846872 num_examples: 12792 - name: test num_bytes: 13493391 num_examples: 12779 download_size: 196508436 dataset_size: 127192927 - config_name: blind_test features: - name: id dtype: int32 - name: table_id dtype: string - name: table_text dtype: string - name: table_caption dtype: string - name: statement dtype: string - name: test_id dtype: string splits: - name: test num_bytes: 10954442 num_examples: 9750 download_size: 196508436 dataset_size: 10954442 --- # Dataset Card for TabFact ## Table of Contents - [Dataset Description](#dataset-description) - [Dataset Summary](#dataset-summary) - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards) - [Languages](#languages) - [Dataset Structure](#dataset-structure) - [Data Instances](#data-instances) - [Data Fields](#data-fields) - [Data Splits](#data-splits) - [Dataset Creation](#dataset-creation) - [Curation Rationale](#curation-rationale) - [Source Data](#source-data) - [Annotations](#annotations) - [Personal and Sensitive Information](#personal-and-sensitive-information) - [Considerations for Using the Data](#considerations-for-using-the-data) - [Social Impact of Dataset](#social-impact-of-dataset) - [Discussion of Biases](#discussion-of-biases) - [Other Known Limitations](#other-known-limitations) - [Additional Information](#additional-information) - [Dataset Curators](#dataset-curators) - [Licensing Information](#licensing-information) - [Citation Information](#citation-information) - [Contributions](#contributions) ## Dataset Description - **Homepage:** [TabFact](https://tabfact.github.io/index.html) - **Repository:** [GitHub](https://github.com/wenhuchen/Table-Fact-Checking) - **Paper:** [TabFact: A Large-scale Dataset for Table-based Fact Verification](https://arxiv.org/abs/1909.02164) - **Leaderboard:** [Leaderboard](https://competitions.codalab.org/competitions/21611) - **Point of Contact:** [Wenhu Chen](wenhuchen@cs.ucsb.edu) ### Dataset Summary The problem of verifying whether a textual hypothesis holds the truth based on the given evidence, also known as fact verification, plays an important role in the study of natural language understanding and semantic representation. However, existing studies are restricted to dealing with unstructured textual evidence (e.g., sentences and passages, a pool of passages), while verification using structured forms of evidence, such as tables, graphs, and databases, remains unexplored. TABFACT is large scale dataset with 16k Wikipedia tables as evidence for 118k human annotated statements designed for fact verification with semi-structured evidence. The statements are labeled as either ENTAILED or REFUTED. TABFACT is challenging since it involves both soft linguistic reasoning and hard symbolic reasoning. ### Supported Tasks and Leaderboards [More Information Needed] ### Languages [More Information Needed] ## Dataset Structure ### Data Instances [More Information Needed] ### Data Fields [More Information Needed] ### Data Splits [More Information Needed] ## Dataset Creation ### Curation Rationale [More Information Needed] ### Source Data [More Information Needed] #### Initial Data Collection and Normalization [More Information Needed] #### Who are the source language producers? [More Information Needed] ### Annotations [More Information Needed] #### Annotation process [More Information Needed] #### Who are the annotators? [More Information Needed] ### Personal and Sensitive Information [More Information Needed] ## Considerations for Using the Data ### Social Impact of Dataset [More Information Needed] ### Discussion of Biases [More Information Needed] ### Other Known Limitations [More Information Needed] ## Additional Information ### Dataset Curators [More Information Needed] ### Licensing Information [More Information Needed] ### Citation Information ``` @inproceedings{2019TabFactA, title={TabFact : A Large-scale Dataset for Table-based Fact Verification}, author={Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou and William Yang Wang}, booktitle = {International Conference on Learning Representations (ICLR)}, address = {Addis Ababa, Ethiopia}, month = {April}, year = {2020} } ``` ### Contributions Thanks to [@patil-suraj](https://github.com/patil-suraj) for adding this dataset.
提供机构:
asdf6675
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作