google-research-datasets/totto

Source: Hugging Face. Updated: 2024-01-18. Indexed: 2024-06-15.
Download link:
https://hf-mirror.com/datasets/google-research-datasets/totto
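Resource overview:
ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples. It proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. The dataset is monolingual, its annotations are expert-generated, and its language data comes from existing sources. It is released under the CC-BY-SA-3.0 license, so it is free and open to use. Its features include table metadata, highlighted cells, and sentence annotations, all of which the description generation task relies on. The data is divided into training, validation, and test sets, each with its own size and characteristics. The README also contains a sample data instance and a description of the data fields, which together give a clear picture of the dataset's structure and content.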

Provider:
google-research-datasets
Original information summary

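Dataset Card for ToTTo

Dataset Description

Dataset Summary

ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples. It proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.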
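To make the input side of the task concrete, here is a minimal Python sketch that pairs each highlighted cell with its column header. The field names (`table`, `highlighted_cells`, `is_header`, `value`) follow the sample instance on this page. The helper names `make_cell` and `highlighted_with_headers`, the toy table, and the assumption that row 0 is the only header row with no multi-column spans are illustrative choices, not part of the dataset or its tooling.

```python
from typing import Dict, List, Tuple

Cell = Dict[str, object]
Table = List[List[Cell]]


def make_cell(value: str, is_header: bool = False) -> Cell:
    """Build a cell dict shaped like the cells in the sample instance (hypothetical helper)."""
    return {"column_span": 1, "is_header": is_header, "row_span": 1, "value": value}


def highlighted_with_headers(
    table: Table, highlighted_cells: List[List[int]]
) -> List[Tuple[str, str]]:
    """Return (column header, cell value) pairs for each highlighted cell.

    Assumption: row 0 is the only header row and no cell spans several
    columns, so a cell's column index matches its header's index.
    """
    header_row = table[0]
    pairs = []
    for row_idx, col_idx in highlighted_cells:
        header_cell = header_row[col_idx]
        # Fall back to an empty header if row 0 is not actually a header row.
        header = str(header_cell["value"]) if header_cell["is_header"] else ""
        pairs.append((header, str(table[row_idx][col_idx]["value"])))
    return pairs


# Toy table: a three-column excerpt in the style of the sample instance.
toy_table = [
    [make_cell("#", True), make_cell("Title", True), make_cell("Chapters", True)],
    [make_cell("70"), make_cell("Duas Caras"), make_cell("210")],
    [make_cell("71"), make_cell("A Favorita"), make_cell("197")],
]

pairs = highlighted_with_headers(toy_table, [[2, 1]])
print(pairs)  # [('Title', 'A Favorita')]
```

A generation model would receive something derived from these pairs (plus any table metadata) and be asked to produce the one-sentence description. Real tables may use row or column spans, in which case the simple index lookup above would need adjusting.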

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure

Data Instances

An example from the training set is shown below:

    {
      "example_id": "1762238357686640028",
      "highlighted_cells": [[13, 2]],
      "id": 0,
      "overlap_subset": "none",
      "sentence_annotations": {
        "final_sentence": ["A Favorita is the telenovela aired in the 9 pm timeslot."],
        "original_sentence": ["It is also the first telenovela by the writer to air in the 9 pm timeslot."],
        "sentence_after_ambiguity": ["A Favorita is the telenovela aired in the 9 pm timeslot."],
        "sentence_after_deletion": ["It is the telenovela air in the 9 pm timeslot."]
      },
      "table": [
        [
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "#"},
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "Run"},
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "Title"},
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "Chapters"},
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "Author"},
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "Director"},
          {"column_span": 1, "is_header": True, "row_span": 1, "value": "Ibope Rating"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "59"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "June 5, 2000— February 2, 2001"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Laços de Família"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "209"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Manoel Carlos"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Ricardo Waddington"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "44.9"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "60"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "February 5, 2001— September 28, 2001"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Porto dos Milagres"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "203"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Aguinaldo Silva Ricardo Linhares"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Marcos Paulo Simões"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "44.6"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "61"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "October 1, 2001— June 14, 2002"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "O Clone"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "221"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Glória Perez"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Jayme Monjardim"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "47.0"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "62"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "June 17, 2002— February 14, 2003"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Esperança"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "209"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Benedito Ruy Barbosa"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Luiz Fernando"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "37.7"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "63"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "February 17, 2003— October 10, 2003"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Mulheres Apaixonadas"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "203"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Manoel Carlos"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Ricardo Waddington"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "46.6"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "64"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "October 13, 2003— June 25, 2004"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Celebridade"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "221"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Gilberto Braga"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Dennis Carvalho"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "46.0"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "65"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "June 28, 2004— March 11, 2005"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Senhora do Destino"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "221"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Aguinaldo Silva"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Wolf Maya"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "50.4"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "66"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "March 14, 2005— November 4, 2005"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "América"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "203"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Glória Perez"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Jayme Monjardim Marcos Schechtman"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "49.4"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "67"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "November 7, 2005— July 7, 2006"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Belíssima"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "209"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Sílvio de Abreu"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Denise Saraceni"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "48.5"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "68"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "July 10, 2006— March 2, 2007"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Páginas da Vida"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "203"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Manoel Carlos"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Jayme Monjardim"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "46.8"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "69"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "March 5, 2007— September 28, 2007"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Paraíso Tropical"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "179"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Gilberto Braga Ricardo Linhares"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Dennis Carvalho"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "42.8"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "70"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "October 1, 2007— May 31, 2008"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Duas Caras"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "210"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Aguinaldo Silva"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Wolf Maya"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "41.1"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "71"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "June 2, 2008— January 16, 2009"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "A Favorita"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "197"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "João Emanuel Carneiro"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Ricardo Waddington"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "39.5"}
        ],
        [
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "72"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "January 19, 2009— September 11, 2009"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "Caminho das Índias"},
          {"column_span": 1, "is_header": False, "row_span": 1, "value": "203"},
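The `sentence_annotations` field in the instance above holds several versions of the same sentence. The following Python sketch walks through them and reports which stages changed the text. The stage order (`original_sentence`, then `sentence_after_deletion`, `sentence_after_ambiguity`, `final_sentence`) is an assumption inferred from the field names, as is the reading that the lists are parallel with one entry per annotation. The function names `revision_trace` and `changed_stages` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed editing order, inferred from the field names in the sample instance.
STAGES = [
    "original_sentence",
    "sentence_after_deletion",
    "sentence_after_ambiguity",
    "final_sentence",
]

# Values copied from the sample instance shown above.
sentence_annotations: Dict[str, List[str]] = {
    "final_sentence": ["A Favorita is the telenovela aired in the 9 pm timeslot."],
    "original_sentence": [
        "It is also the first telenovela by the writer to air in the 9 pm timeslot."
    ],
    "sentence_after_ambiguity": [
        "A Favorita is the telenovela aired in the 9 pm timeslot."
    ],
    "sentence_after_deletion": ["It is the telenovela air in the 9 pm timeslot."],
}


def revision_trace(
    annotations: Dict[str, List[str]], index: int = 0
) -> List[Tuple[str, str]]:
    """Return (stage, sentence) pairs for one annotation, in the assumed order."""
    return [(stage, annotations[stage][index]) for stage in STAGES]


def changed_stages(annotations: Dict[str, List[str]], index: int = 0) -> List[str]:
    """List the stages whose sentence differs from the preceding stage."""
    trace = revision_trace(annotations, index)
    return [
        stage
        for (_, previous), (stage, current) in zip(trace, trace[1:])
        if previous != current
    ]


for stage, sentence in revision_trace(sentence_annotations):
    print(f"{stage}: {sentence}")

print(changed_stages(sentence_annotations))
# ['sentence_after_deletion', 'sentence_after_ambiguity']
```

In this instance the last two stages are identical, so `final_sentence` does not appear in the list of changed stages.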
