
AntoineBlanot/snli-contrast

Source: Hugging Face. Updated 2023-11-24. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/AntoineBlanot/snli-contrast
Resource description:
---
dataset_info:
  features:
  - name: premise
    dtype: string
  - name: hypothesis
    dtype: string
  - name: instruction
    dtype: string
  - name: label_name
    dtype: string
  splits:
  - name: train
    num_bytes: 283196540
    num_examples: 1098734
  - name: test
    num_bytes: 5199496
    num_examples: 19684
  download_size: 23437414
  dataset_size: 288396036
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---

# Dataset Card for "snli-contrast"

This dataset is the [snli-3way](https://huggingface.co/datasets/AntoineBlanot/snli-3way) dataset with an additional `instruction` feature. This new feature, together with its associated `label_name`, expresses how the `premise` and `hypothesis` features are related in the original dataset. The mapping works as follows.

### If the original example was of class `entailment`

Two data points are derived from that example. One is the positive example (i.e., `label_name` == "positive"), which is assigned the following instruction: "The meaning of the hypothesis is logically inferred from the meaning of the premise." The other is the negative example (i.e., `label_name` == "negative"), which is assigned the following instruction: "The meaning of the hypothesis either contradicts the meaning of the premise, is unrelated to it, or does not provide sufficient information to infer the meaning of the premise."

### If the original example was of class `contradiction` or `neutral`

Two data points are derived from that example. One is the positive example (i.e., `label_name` == "positive"), which is assigned the following instruction: "The meaning of the hypothesis either contradicts the meaning of the premise, is unrelated to it, or does not provide sufficient information to infer the meaning of the premise." The other is the negative example (i.e., `label_name` == "negative"), which is assigned the following instruction: "The meaning of the hypothesis is logically inferred from the meaning of the premise."

This dataset is double the size of the original dataset because each original example is paired with both a positive and a negative instruction.

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
AntoineBlanot
Summary of original information

Dataset overview

Dataset information

  • Features:

    • premise: string
    • hypothesis: string
    • instruction: string
    • label_name: string
  • Splits:

    • train: 283196540 bytes, 1098734 examples
    • test: 5199496 bytes, 19684 examples
  • Size:

    • Download size: 23437414 bytes
    • Dataset size: 288396036 bytes
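The size figures above can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the numbers listed in the metadata; the helper names (`total_bytes`, `mean_bytes_per_example`) are hypothetical and introduced for illustration. It assumes, as the metadata suggests, that the dataset size is the sum of the per-split byte counts.

```python
# Figures copied from the dataset metadata above.
SPLITS = {
    "train": {"num_bytes": 283_196_540, "num_examples": 1_098_734},
    "test": {"num_bytes": 5_199_496, "num_examples": 19_684},
}
DOWNLOAD_SIZE = 23_437_414
DATASET_SIZE = 288_396_036


def total_bytes(splits):
    """Sum the byte counts of all splits."""
    return sum(info["num_bytes"] for info in splits.values())


def mean_bytes_per_example(info):
    """Average size of one example in a split, in bytes."""
    return info["num_bytes"] / info["num_examples"]


if __name__ == "__main__":
    # The per-split byte counts should add up to the reported dataset size.
    print("sum of splits == dataset size:", total_bytes(SPLITS) == DATASET_SIZE)
    for name, info in SPLITS.items():
        print(f"{name}: {mean_bytes_per_example(info):.1f} bytes per example")
    # Ratio between the reported dataset size and the download size.
    print(f"dataset/download ratio: {DATASET_SIZE / DOWNLOAD_SIZE:.1f}")
```

The large gap between the dataset size and the download size is plausibly due to compressed storage of the data files, although the source does not say so explicitly.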

Configuration

  • Default configuration:
    • train data file path: data/train-*
    • test data file path: data/test-*
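Given the default configuration above, a split can presumably be loaded with the Hugging Face `datasets` library. The sketch below assumes the dataset is still published under the repository id shown in the page title and that the `datasets` package is installed with network access available; the `load_split` wrapper and the `SPLIT_FILES` table are hypothetical helpers, not part of the dataset.

```python
REPO_ID = "AntoineBlanot/snli-contrast"

# Split names and file patterns as listed in the default configuration.
SPLIT_FILES = {"train": "data/train-*", "test": "data/test-*"}


def load_split(split):
    """Load one split of the dataset from the Hugging Face Hub."""
    if split not in SPLIT_FILES:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(SPLIT_FILES)}")
    # Imported lazily so the module can be inspected without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


if __name__ == "__main__":
    test_set = load_split("test")  # downloads data; requires network access
    print(test_set.column_names)
    print(test_set[0])
```

Validating the split name before the import keeps the failure for a mistyped split cheap and independent of the network.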

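Dataset description

This dataset extends the snli-3way dataset with a new instruction feature. This feature, together with its associated label_name, expresses how the premise and hypothesis features are related in the original dataset.

  • Mapping rules:
    • If the original example belongs to the entailment class:

      • Positive example (label_name == "positive"): the instruction is "The meaning of the hypothesis is logically inferred from the meaning of the premise."
      • Negative example (label_name == "negative"): the instruction is "The meaning of the hypothesis either contradicts the meaning of the premise, is unrelated to it, or does not provide sufficient information to infer the meaning of the premise."
    • If the original example belongs to the contradiction or neutral class:

      • Positive example (label_name == "positive"): the instruction is "The meaning of the hypothesis either contradicts the meaning of the premise, is unrelated to it, or does not provide sufficient information to infer the meaning of the premise."
      • Negative example (label_name == "negative"): the instruction is "The meaning of the hypothesis is logically inferred from the meaning of the premise."

This dataset is twice the size of the original dataset because each original example is paired with both a positive and a negative instruction.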
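The mapping rules can be sketched as a small function. This is an illustrative reconstruction from the description, not the author's actual preprocessing code: the function name `to_contrast`, the constant names, the order of the two returned rows, and the sample sentences are all assumptions. Only the two instruction strings and the `label_name` values come from the dataset card.

```python
# Instruction strings quoted verbatim from the dataset card.
ENTAILS = (
    "The meaning of the hypothesis is logically inferred from "
    "the meaning of the premise."
)
NOT_ENTAILS = (
    "The meaning of the hypothesis either contradicts the meaning of the "
    "premise, is unrelated to it, or does not provide sufficient information "
    "to infer the meaning of the premise."
)

ORIGINAL_LABELS = {"entailment", "contradiction", "neutral"}


def to_contrast(premise, hypothesis, original_label):
    """Turn one 3-way example into a positive and a negative data point."""
    if original_label not in ORIGINAL_LABELS:
        raise ValueError(f"unexpected label: {original_label!r}")
    # The instruction that matches the original label is the positive one;
    # the other instruction becomes the negative one.
    if original_label == "entailment":
        positive, negative = ENTAILS, NOT_ENTAILS
    else:
        positive, negative = NOT_ENTAILS, ENTAILS
    base = {"premise": premise, "hypothesis": hypothesis}
    return [
        {**base, "instruction": positive, "label_name": "positive"},
        {**base, "instruction": negative, "label_name": "negative"},
    ]


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the dataset.
    for row in to_contrast("A man plays a guitar.", "A man is asleep.", "contradiction"):
        print(row["label_name"], "->", row["instruction"])
```

Because every input example yields exactly two rows, applying this function to N original examples produces 2N rows, which matches the doubling described above.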
