ToTTo | Natural Language Processing Dataset | Text Generation Dataset
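- The ToTTo dataset was first published in 2020 by the Google Research team and formally released at EMNLP, a major conference in natural language processing. It is intended to advance research on table-to-text generation and contains more than 120,000 training examples, each pairing a human-annotated table with a corresponding natural language description.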
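- In 2021, ToTTo was first applied across a range of natural language processing models, noticeably improving their performance at understanding tabular data and generating text from it. Researchers began using the dataset for both model training and evaluation, which moved the related techniques forward.
- In 2022, ToTTo became one of the standard datasets for several international competitions and challenges, attracting researchers and developers worldwide. Work published that year further supported ToTTo's potential for improving the quality and diversity of generated text.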
- 1. ToTTo: A Controlled Table-To-Text Generation Dataset. Google Research, 2020
- 2. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Google Research, 2020
- 3. Table-to-Text Generation with Effective Hierarchical Encoder-Decoder Models. University of Cambridge, 2020
- 4. A Survey on Table-to-Text Generation. University of Science and Technology of China, 2021
- 5. Improving Table-to-Text Generation with External Knowledge. Tsinghua University, 2021
Hang Seng Index
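The Hang Seng Index is the main stock market index of the Hong Kong stock market, compiled by Hang Seng Indexes Company Limited, a subsidiary of Hang Seng Bank. The index covers the 50 most representative listed companies on the Hong Kong stock market and reflects the market's overall performance.
Indexed from www.hsi.com.hk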
YOLO Drone Detection Dataset
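To support the development and evaluation of drone detection models, we introduce a new and comprehensive dataset designed specifically for training and testing drone detection algorithms. The dataset is derived from public datasets on Kaggle and contains diverse annotated images captured in a variety of environments and from different camera viewpoints. It includes drone instances as well as other common objects, to enable robust detection and classification.
Indexed from GitHub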
flames-and-smoke-datasets
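This repository summarizes several public flame and smoke datasets, including DFS, D-Fire dataset, FASDD, FLAME, BoWFire, VisiFire, fire-smoke-detect-yolov4, and Forest Fire. Each dataset comes with a detailed description covering its data source, number of images, and annotation information.
Indexed from GitHub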
DFT dataset for high entropy alloys
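Our DFT dataset covers bcc and fcc structures composed of eight elements, including all possible 2- to 7-component alloy systems. The dataset is publicly available on Zenodo and contains properties such as initial and final structures, formation energies, atomic magnetic moments, and charges.
Indexed from GitHub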
The sex [male (1) and female (2)], age (in years), weight (in lbs), number of GPS points (total after filtering), and the 100% MCP, 95% KDE, and 50% KDE home ranges (ha) for all cats sampled in the study. All cats were desexed. The personality scores (shown as percentages) were obtained from a survey, based on the “Feline Five” (Litchfield et al., 2017), that evaluated how much owners agreed or disagreed that their cats showed certain traits. Trait scores were then summed and converted into percentages. Bold cats are considered to have a low neuroticism score. Road density was estimated by summing the road lengths, measured in meters, within a fixed boundary centred on each cat’s mean latitude and longitude coordinates. The variable “major road” indicates the presence (1) or absence (0) of a major road near the cat’s home range. Roads were labeled as “major” based on Google Maps’ classification, which is related to traffic rates, and through “ground-truthing”.
Domestic cats (Felis catus) play a dual role in society as both companion animals and predators. When provided with unsupervised outdoor access, cats can negatively impact native wildlife and create public health and animal welfare challenges. The effective implementation of management strategies, such as buffer zones or curfews, requires an understanding of home range size, the factors that influence their movement, and the types of habitats they use. Here, we used a community/citizen scientist approach to collect movement and habitat use data using GPS collars on owned outdoor cats in the Kitchener-Waterloo-Cambridge-Guelph region, southwestern Ontario, Canada.
Indexed from DataCite Commons