
Serverless/dev_mode-wtq

Hugging Face · Updated 2023-02-24 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/Serverless/dev_mode-wtq
Resource description:
---
annotations_creators:
- crowdsourced
language_creators:
- found
language:
- en
license:
- cc-by-4.0
multilinguality:
- monolingual
paperswithcode_id: null
pretty_name: WikiTableQuestions-wtq
size_categories:
- 10K<n<100K
source_datasets:
- wikitablequestions
task_categories:
- question-answering
task_ids: []
tags:
- table-question-answering
---

# Dataset Card for dev_mode-wtq

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)

## Dataset Description

- **Homepage:** [WikiTableQuestions homepage](https://nlp.stanford.edu/software/sempre/wikitable)
- **Repository:** [WikiTableQuestions repository](https://github.com/ppasupat/WikiTableQuestions)
- **Paper:** [Compositional Semantic Parsing on Semi-Structured Tables](https://arxiv.org/abs/1508.00305)
- **Leaderboard:** [WikiTableQuestions leaderboard on Papers with Code](https://paperswithcode.com/dataset/wikitablequestions)
- **Point of Contact:** [Needs More Information]

### Dataset Summary

The dev_mode-wtq dataset is a small-scale dataset for the task of question answering on semi-structured tables. The data includes the `aggregation_label` and `answer_coordinates` fields, which make it easy to fine-tune any [TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas#usage-finetuning)-based model on it.

### Supported Tasks and Leaderboards

question-answering, table-question-answering

### Languages

en

## Dataset Structure

### Data Instances

#### default

- **Size of downloaded dataset files:** 27.91 MB
- **Size of the generated dataset:** 45.68 MB
- **Total amount of disk used:** 73.60 MB

An example of 'validation' looks as follows:

```
{
  "id": "nt-0",
  "question": "What is the duration for the last invocation?",
  "answers": [
    "340 ms"
  ],
  "table": {
    "header": [
      "recent", "type", "spans", "logs", "errors", "warnings", "duration", "resource"
    ],
    "rows": [
      [
        "1", "span", "1", "1", "1", "2", "340 ms", "aws-lambda-typescript-express-dev-express"
      ]
    ]
  }
}
```

### Data Fields

The data fields are the same among all splits.

#### default

- `id`: a `string` feature.
- `question`: a `string` feature.
- `answers`: a `list` of `string` features.
- `answers_coordinates`: a `list` of `int,int` tuples.
- `aggregation_label`: a `string` feature.
- `table`: a dictionary feature containing:
  - `header`: a `list` of `string` features.
  - `rows`: a `list` of `list` of `string` features.
  - `name`: a `string` feature.

### Data Splits

TBA

## Dataset Creation

### Curation Rationale

[Needs More Information]

### Source Data

#### Initial Data Collection and Normalization

[Needs More Information]

#### Who are the source language producers?

[Needs More Information]

### Annotations

#### Annotation process

[Needs More Information]

#### Who are the annotators?

[Needs More Information]

### Personal and Sensitive Information

[Needs More Information]

## Considerations for Using the Data

### Social Impact of Dataset

[Needs More Information]

### Discussion of Biases

[Needs More Information]

### Other Known Limitations

[Needs More Information]

## Additional Information

### Dataset Curators

Panupong Pasupat and Percy Liang

### Licensing Information

Creative Commons Attribution Share Alike 4.0 International

### Citation Information

```
@inproceedings{pasupat-liang-2015-compositional,
    title = "Compositional Semantic Parsing on Semi-Structured Tables",
    author = "Pasupat, Panupong and Liang, Percy",
    booktitle = "Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = jul,
    year = "2015",
    address = "Beijing, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P15-1142",
    doi = "10.3115/v1/P15-1142",
    pages = "1470--1480",
}
```

### Contributions

Thanks to [@SivilTaram](https://github.com/SivilTaram) for adding this dataset.
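The card says the coordinate field exists to support TAPAS-style fine-tuning, but the sample record does not show it. As a rough illustration only, the sketch below shows one way such coordinates could be derived from a record by exact string matching. The helper name `find_answer_coordinates` is hypothetical, and the convention assumed here (zero-based `(row, column)` pairs that count data rows only, not the header) is an assumption rather than something this card states.

```python
from typing import List, Tuple

# Sample record copied from the 'validation' example in the dataset card.
record = {
    "id": "nt-0",
    "question": "What is the duration for the last invocation?",
    "answers": ["340 ms"],
    "table": {
        "header": ["recent", "type", "spans", "logs", "errors",
                   "warnings", "duration", "resource"],
        "rows": [["1", "span", "1", "1", "1", "2", "340 ms",
                  "aws-lambda-typescript-express-dev-express"]],
    },
}


def find_answer_coordinates(rec: dict) -> List[Tuple[int, int]]:
    """Hypothetical helper: return (row, column) pairs of cells whose text
    exactly equals one of the answers. Header row is not counted (assumption)."""
    answers = set(rec["answers"])
    coords = []
    for r, row in enumerate(rec["table"]["rows"]):
        for c, cell in enumerate(row):
            if cell in answers:
                coords.append((r, c))
    return coords


coords = find_answer_coordinates(record)
print(coords)  # [(0, 6)]
```

Exact matching is the simplest possible choice; answers that are aggregates (a count or a sum) would not appear verbatim in any cell, which is presumably what `aggregation_label` is meant to cover.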
Provider:
Serverless
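Summary of original information

Dataset overview

Dataset name

  • Name: dev_mode-wtq
  • Alias: WikiTableQuestions-wtq

Basic dataset information

  • Language: English (en)
  • License: Creative Commons Attribution Share Alike 4.0 International (cc-by-4.0)
  • Multilinguality: monolingual
  • Size: 10K<n<100K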

Dataset creation

  • Annotation creators: crowdsourced
  • Language creators: found
  • Source dataset: wikitablequestions

Tasks and applications

  • Task categories: question-answering, table-question-answering

Dataset structure

  • Data instances:

    • Download size: 27.91 MB
    • Generated dataset size: 45.68 MB
    • Total disk usage: 73.60 MB
    • Example (JSON): { "id": "nt-0", "question": "What is the duration for the last invocation?", "answers": ["340 ms"], "table": { "header": ["recent", "type", "spans", "logs", "errors", "warnings", "duration", "resource"], "rows": [["1", "span", "1", "1", "1", "2", "340 ms", "aws-lambda-typescript-express-dev-express"]] } }
  • Data fields:

    • id: string
    • question: string
    • answers: list of strings
    • answers_coordinates: list of integer tuples
    • aggregation_label: string
    • table: dictionary containing the header and the row data
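The field list above can be turned into a small structural check. The following is a minimal sketch, not part of the dataset or of any library: `validate_record` is a hypothetical helper, the two TAPAS-related fields are treated as optional because the sample record omits them, and the rule that every row must be as wide as the header is an added assumption.

```python
from typing import List


def _is_str_list(value) -> bool:
    """True if value is a list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def validate_record(rec: dict) -> List[str]:
    """Hypothetical checker: return a list of problems (empty if none found)."""
    problems = []
    for key in ("id", "question"):
        if not isinstance(rec.get(key), str):
            problems.append(f"{key} should be a string")
    if not _is_str_list(rec.get("answers")):
        problems.append("answers should be a list of strings")
    # Optional in this sketch: the sample record does not include it.
    if "aggregation_label" in rec and not isinstance(rec["aggregation_label"], str):
        problems.append("aggregation_label should be a string")
    table = rec.get("table")
    if not isinstance(table, dict):
        problems.append("table should be a dictionary")
        return problems
    header = table.get("header")
    if not _is_str_list(header):
        problems.append("table.header should be a list of strings")
        return problems
    rows = table.get("rows")
    if not isinstance(rows, list):
        problems.append("table.rows should be a list of rows")
        return problems
    for i, row in enumerate(rows):
        # Width check is an assumption of this sketch, not a documented rule.
        if not _is_str_list(row) or len(row) != len(header):
            problems.append(f"table.rows[{i}] should be {len(header)} strings")
    return problems


sample = {
    "id": "nt-0",
    "question": "What is the duration for the last invocation?",
    "answers": ["340 ms"],
    "table": {
        "header": ["recent", "type", "spans", "logs", "errors",
                   "warnings", "duration", "resource"],
        "rows": [["1", "span", "1", "1", "1", "2", "340 ms",
                  "aws-lambda-typescript-express-dev-express"]],
    },
}
print(validate_record(sample))  # []
```

Returning a list of messages instead of raising on the first failure makes it easier to report every malformed field in a record at once.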

Dataset curators

  • Dataset curators: Panupong Pasupat and Percy Liang

Licensing information

  • License: Creative Commons Attribution Share Alike 4.0 International

Citation information

@inproceedings{pasupat-liang-2015-compositional,
    title = "Compositional Semantic Parsing on Semi-Structured Tables",
    author = "Pasupat, Panupong and Liang, Percy",
    booktitle = "Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = jul,
    year = "2015",
    address = "Beijing, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P15-1142",
    doi = "10.3115/v1/P15-1142",
    pages = "1470--1480",
}
