AryanLala/autonlp-data-Scientific_Title_Generator
Source: Hugging Face. Updated 2021-11-20; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/AryanLala/autonlp-data-Scientific_Title_Generator
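Resource summary:
This dataset was processed automatically by AutoNLP for a scientific title generation project. It contains two main fields, target and text, both of string type. The data is divided into a training set of 5,784 samples and a validation set of 1,446 samples. The language code is unk; the actual language is not stated.
Provider: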
AryanLala
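Summary of original information
Dataset overview
Dataset name
- Project name: Scientific_Title_Generator
- Task category: conditional-text-generation
Language information
- Language code: unk
Dataset structure
Data instances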
- Sample examples (JSON):

    [
      {
        "target": "Unification of Fusion Theories, Rules, Filters, Image Fusion and Target Tracking Methods (UFT)",
        "text": "The author has pledged in various papers, conference or seminar presentations, and scientific grant applications (between 2004-2015) for the unification of fusion theories, combinations of fusion rules, image fusion procedures, filter algorithms, and target tracking methods for more accurate applications to our real world problems - since neither fusion theory nor fusion rule fully satisfy all needed applications. For each particular application, one selects the most appropriate fusion space and fusion model, then the fusion rules, and the algorithms of implementation. He has worked in the Unification of the Fusion Theories (UFT), which looks like a cooking recipe, better one could say like a logical chart for a computer programmer, but one does not see another method to comprise/unify all things. The unification scenario presented herein, which is now in an incipient form, should periodically be updated incorporating new discoveries from the fusion and engineering research."
      },
      {
        "target": "Investigation of Variances in Belief Networks",
        "text": "The belief network is a well-known graphical structure for representing independences in a joint probability distribution. The methods, which perform probabilistic inference in belief networks, often treat the conditional probabilities which are stored in the network as certain values. However, if one takes either a subjectivistic or a limiting frequency approach to probability, one can never be certain of probability values. An algorithm should not only be capable of reporting the probabilities of the alternatives of remaining nodes when other nodes are instantiated; it should also be capable of reporting the uncertainty in these probabilities relative to the uncertainty in the probabilities which are stored in the network.\nIn this paper a method for determining the variances in inferred probabilities is obtained under the assumption that a posterior distribution on the uncertainty variables can be approximated by the prior distribution. It is shown that this assumption is plausible if their is a reasonable amount of confidence in the probabilities which are stored in the network. Furthermore in this paper, a surprising upper bound for the prior variances in the probabilities of the alternatives of all nodes is obtained in the case where the probability distributions of the probabilities of the alternatives are beta distributions. It is shown that the prior variance in the probability at an alternative of a node is bounded above by the largest variance in an element of the conditional probability distribution for that node."
      }
    ]
Data fields
- Field information (JSON):

    {
      "target": "Value(dtype=string, id=None)",
      "text": "Value(dtype=string, id=None)"
    }
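
The two-field schema can be checked against a record with a few lines of Python. This is a minimal sketch, not part of the dataset's tooling: the helper name `is_valid_record` and the constant `EXPECTED_FIELDS` are hypothetical, and the sample record uses only the opening sentence of one abstract from the data instances.

```python
from typing import Any

# Field names taken from the dataset card; both are declared as strings.
EXPECTED_FIELDS = ("target", "text")


def is_valid_record(record: Any) -> bool:
    """Return True if `record` is a dict holding exactly the string
    fields `target` and `text` (hypothetical helper for illustration)."""
    if not isinstance(record, dict):
        return False
    if set(record) != set(EXPECTED_FIELDS):
        return False
    return all(isinstance(record[name], str) for name in EXPECTED_FIELDS)


# Shortened record: the title is the target, the abstract is the text.
sample = {
    "target": "Investigation of Variances in Belief Networks",
    "text": "The belief network is a well-known graphical structure "
            "for representing independences in a joint probability distribution.",
}

print(is_valid_record(sample))
```

The check is strict on purpose: a record with a missing field, an extra field, or a non-string value is rejected, which matches the card's description of exactly two string fields.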
Data splits
- Split details:

| Split name | Num samples |
|---|---|
| train | 5784 |
| valid | 1446 |
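
The split sizes imply a fixed train/validation ratio, which the short sketch below works out. The counts come from the split details; the names `SPLIT_SIZES` and `split_fractions` are assumptions introduced for illustration.

```python
# Sample counts as reported in the split details.
SPLIT_SIZES = {"train": 5784, "valid": 1446}


def split_fractions(sizes: dict[str, int]) -> dict[str, float]:
    """Return each split's share of the total number of samples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_samples = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

print(f"total samples: {total_samples}")
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

With 7,230 samples in total, the counts correspond to an 80/20 train/validation split.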



