hreensajin/autotrain-data-text_summ

Name: hreensajin/autotrain-data-text_summ
Creator: hreensajin
Published: 2022-11-23 09:18:18
License: No description available

Source: Hugging Face. Updated 2022-11-23; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/hreensajin/autotrain-data-text_summ


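Overview:

This dataset was automatically processed by AutoTrain for the project text_summ; its task category is conditional text generation. The dataset's BCP-47 language code is unk, meaning the language is unknown. It contains the fields feat_guid, feat_source, text, target, and feat_label, where feat_label is a ClassLabel with three classes (contradiction, entailment, neutral). The dataset is divided into a training split and a validation split with 22398 and 5600 samples, respectively.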

Provider:

hreensajin

Summary of original information

AutoTrain Dataset for project: text_summ

Dataset Description

Task Category: Conditional Text Generation
Language: The dataset's language is identified by the BCP-47 code unk (unknown).

Dataset Structure

Data Instances

Sample Instance:

[
  {
    "feat_guid": "klue-nli-v1_train_06870",
    "feat_source": "policy",
    "text": "\ub610\ud55c \ub3c5\ub9bd\ucd9c\ud310\ubb3c \uc81c\uc791 \uc6cc\ud06c\uc20d\uc774\ub791 \uae00\uc4f0\uae30 \uc18c\uc124 \uc4f0\uae30 \uc6cc\ud06c\uc20d\ub3c4 \uc9c4\ud589\ud558\uba70 \uc6d0\ub370\uc774 \uc218\uc5c5\uc744 \ud1b5\ud574 \ub2e4\uc591\ud55c \ud65c\ub3d9\uc744 \ud558\ub294 \uc54c\ucc2c \ubcf5\ud569\uc801\uc778 \uacf5\uac04\uc774\ub2e4.",
    "target": "\ub3c5\ub9bd\ucd9c\ud310\ubb3c \uc81c\uc791 \uc6cc\ud06c\uc20d\uc740 \uc5b8\ub860\uc0ac\uc758 \ud6c4\uc6d0\ud558\uc5d0 \uc9c4\ud589\ub41c\ub2e4.",
    "feat_label": 2
  },
  {
    "feat_guid": "klue-nli-v1_train_02196",
    "feat_source": "wikitree",
    "text": "\uacf5\uc720\ub41c \uc0ac\uc9c4\uc5d0\ub294 \uc790\uc6b1\ud558\uac8c \uc548\uac1c\uc640 \ubbf8\uc138\uba3c\uc9c0\uac00 \ub0b4\ub824\uc549\uc740 \uc0ac\uc9c4\uc774 \uacf5\uc720\ub410\ub2e4.",
    "target": "\uc790\uc6b1\ud55c \uc548\uac1c\uc640 \ubbf8\uc138\uba3c\uc9c0\ub9cc \ub0b4\ub824\uc549\uc558\ub2e4.",
    "feat_label": 2
  }
]

Dataset Fields

Fields:

{
  "feat_guid": "Value(dtype=string, id=None)",
  "feat_source": "Value(dtype=string, id=None)",
  "text": "Value(dtype=string, id=None)",
  "target": "Value(dtype=string, id=None)",
  "feat_label": "ClassLabel(num_classes=3, names=[contradiction, entailment, neutral], id=None)"
}
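
The feat_label field stores an integer, so reading a record means mapping that integer back to a class name. The sketch below is a minimal, dependency-free illustration of that mapping, assuming label ids follow the order in which the names are listed for the ClassLabel (0 = contradiction, 1 = entailment, 2 = neutral). LABEL_NAMES, int2str, and str2int are hypothetical helper names chosen for this example; they are not part of the dataset.

```python
# Class names in the order listed for the feat_label ClassLabel.
# Assumption: the integer id of a label is its position in this list.
LABEL_NAMES = ["contradiction", "entailment", "neutral"]


def int2str(label_id: int) -> str:
    """Return the class name for an integer label id."""
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"label id out of range: {label_id}")
    return LABEL_NAMES[label_id]


def str2int(name: str) -> int:
    """Return the integer label id for a class name."""
    return LABEL_NAMES.index(name)


# Metadata of the first sample instance from the dataset card.
sample = {
    "feat_guid": "klue-nli-v1_train_06870",
    "feat_source": "policy",
    "feat_label": 2,
}

print(int2str(sample["feat_label"]))  # neutral
```

Under that assumption, both sample instances shown on this page (feat_label 2) belong to the neutral class.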

Dataset Splits

Splits:

Split name	Num samples
train	22398
valid	5600
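
As a quick check on the split sizes, the following sketch computes the total sample count and each split's share of it. The counts come from the table above; SPLITS and split_shares are hypothetical names used only for this illustration.

```python
# Sample counts per split, as listed in the table above.
SPLITS = {"train": 22398, "valid": 5600}


def split_shares(splits: dict) -> dict:
    """Return each split's fraction of the total number of samples."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total = sum(SPLITS.values())
shares = split_shares(SPLITS)

print(f"total samples: {total}")
for name, share in shares.items():
    print(f"{name}: {SPLITS[name]} samples ({share:.1%})")
```

The two splits add up to 27998 samples, which works out to roughly an 80/20 division between training and validation.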
