AryanLala/autonlp-data-Scientific_Title_Generator

Name: AryanLala/autonlp-data-Scientific_Title_Generator
Creator: AryanLala
Published: 2021-11-20 18:00:56
License: No description provided

Source: Hugging Face. Updated: 2021-11-20. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/AryanLala/autonlp-data-Scientific_Title_Generator

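Overview:

This dataset was processed automatically by AutoNLP for a scientific title generation project. It contains two main fields, target and text, both of string type. In the sample records, target holds a paper title and text holds the corresponding abstract. The data is divided into a training set of 5784 samples and a validation set of 1446 samples. The language code is unk; the specific language is not stated.

Provider: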

AryanLala

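Original information summary

Dataset overview

Dataset name

Project name: Scientific_Title_Generator
Task category: conditional-text-generation

Language information

Language code: unk

Dataset structure

Data instances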

Sample example (JSON):

[
  {
    "target": "Unification of Fusion Theories, Rules, Filters, Image Fusion and Target Tracking Methods (UFT)",
    "text": "The author has pledged in various papers, conference or seminar presentations, and scientific grant applications (between 2004-2015) for the unification of fusion theories, combinations of fusion rules, image fusion procedures, filter algorithms, and target tracking methods for more accurate applications to our real world problems - since neither fusion theory nor fusion rule fully satisfy all needed applications. For each particular application, one selects the most appropriate fusion space and fusion model, then the fusion rules, and the algorithms of implementation. He has worked in the Unification of the Fusion Theories (UFT), which looks like a cooking recipe, better one could say like a logical chart for a computer programmer, but one does not see another method to comprise/unify all things. The unification scenario presented herein, which is now in an incipient form, should periodically be updated incorporating new discoveries from the fusion and engineering research."
  },
  {
    "target": "Investigation of Variances in Belief Networks",
    "text": "The belief network is a well-known graphical structure for representing independences in a joint probability distribution. The methods, which perform probabilistic inference in belief networks, often treat the conditional probabilities which are stored in the network as certain values. However, if one takes either a subjectivistic or a limiting frequency approach to probability, one can never be certain of probability values. An algorithm should not only be capable of reporting the probabilities of the alternatives of remaining nodes when other nodes are instantiated; it should also be capable of reporting the uncertainty in these probabilities relative to the uncertainty in the probabilities which are stored in the network. \nIn this paper a method for determining the variances in inferred probabilities is obtained under the assumption that a posterior distribution on the uncertainty variables can be approximated by the prior distribution. It is shown that this assumption is plausible if their is a reasonable amount of confidence in the probabilities which are stored in the network. Furthermore in this paper, a surprising upper bound for the prior variances in the probabilities of the alternatives of all nodes is obtained in the case where the probability distributions of the probabilities of the alternatives are beta distributions. It is shown that the prior variance in the probability at an alternative of a node is bounded above by the largest variance in an element of the conditional probability distribution for that node."
  }
]

Data fields

Field information (JSON):

{
  "target": "Value(dtype=string, id=None)",
  "text": "Value(dtype=string, id=None)"
}
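The two-field schema above can be checked against individual records with a short script. What follows is a minimal sketch, assuming records arrive as JSON objects shaped like the sample; the `validate_record` helper and the `EXPECTED_FIELDS` mapping are hypothetical names introduced for illustration and are not part of AutoNLP or the dataset's own tooling. The abstract in the example record is truncated for brevity.

```python
import json

# Field names taken from the field information above; both are strings.
# EXPECTED_FIELDS is an illustrative name, not part of the dataset tooling.
EXPECTED_FIELDS = {"target": str, "text": str}


def validate_record(record):
    """Return a list of problems found in one record (empty list if valid).

    Hypothetical helper for illustration only.
    """
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"field {name} is not a string")
    extra = sorted(set(record) - set(EXPECTED_FIELDS))
    if extra:
        problems.append(f"unexpected fields: {extra}")
    return problems


# Title copied from the sample; the abstract is shortened here on purpose.
raw = json.dumps([
    {
        "target": "Investigation of Variances in Belief Networks",
        "text": "The belief network is a well-known graphical structure ...",
    }
])
records = json.loads(raw)
report = [validate_record(r) for r in records]
print(report)
```

An empty list for a record means both fields are present and are strings; anything else lists what deviates from the schema.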

Data splits

Split details:

Split name	Num samples
train	5784
valid	1446
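The split sizes in the table add up to 7230 samples, and the proportions can be worked out with a few lines of arithmetic. This is a small sketch using only the counts listed above; `SPLIT_SIZES` and `split_fractions` are illustrative names, not part of any dataset API.

```python
# Split sizes as listed in the table above.
SPLIT_SIZES = {"train": 5784, "valid": 1446}


def split_fractions(sizes):
    """Return each split's share of the total number of samples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)
print(total)
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The counts correspond to 80% of the samples in train and 20% in valid.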
