MoritzLaurer/dataset_test_disaggregated_nli
Source: Hugging Face. Updated 2023-11-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/MoritzLaurer/dataset_test_disaggregated_nli
Description:
---
configs:
- config_name: default
  data_files:
  - split: mnli_m
    path: data/mnli_m-*
  - split: mnli_mm
    path: data/mnli_mm-*
  - split: fevernli
    path: data/fevernli-*
  - split: anli_r1
    path: data/anli_r1-*
  - split: anli_r2
    path: data/anli_r2-*
  - split: anli_r3
    path: data/anli_r3-*
  - split: wanli
    path: data/wanli-*
  - split: lingnli
    path: data/lingnli-*
  - split: wellformedquery
    path: data/wellformedquery-*
  - split: rottentomatoes
    path: data/rottentomatoes-*
  - split: amazonpolarity
    path: data/amazonpolarity-*
  - split: imdb
    path: data/imdb-*
  - split: yelpreviews
    path: data/yelpreviews-*
  - split: hatexplain
    path: data/hatexplain-*
  - split: massive
    path: data/massive-*
  - split: banking77
    path: data/banking77-*
  - split: emotiondair
    path: data/emotiondair-*
  - split: emocontext
    path: data/emocontext-*
  - split: empathetic
    path: data/empathetic-*
  - split: agnews
    path: data/agnews-*
  - split: yahootopics
    path: data/yahootopics-*
  - split: biasframes_sex
    path: data/biasframes_sex-*
  - split: biasframes_offensive
    path: data/biasframes_offensive-*
  - split: biasframes_intent
    path: data/biasframes_intent-*
  - split: financialphrasebank
    path: data/financialphrasebank-*
  - split: appreviews
    path: data/appreviews-*
  - split: hateoffensive
    path: data/hateoffensive-*
  - split: trueteacher
    path: data/trueteacher-*
  - split: spam
    path: data/spam-*
  - split: wikitoxic_toxicaggregated
    path: data/wikitoxic_toxicaggregated-*
  - split: wikitoxic_obscene
    path: data/wikitoxic_obscene-*
  - split: wikitoxic_identityhate
    path: data/wikitoxic_identityhate-*
  - split: wikitoxic_threat
    path: data/wikitoxic_threat-*
  - split: wikitoxic_insult
    path: data/wikitoxic_insult-*
  - split: manifesto
    path: data/manifesto-*
  - split: capsotu
    path: data/capsotu-*
dataset_info:
  features:
  - name: text
    dtype: string
  - name: hypothesis
    dtype: string
  - name: labels
    dtype:
      class_label:
        names:
          '0': entailment
          '1': not_entailment
  - name: task_name
    dtype: string
  - name: label_text
    dtype: string
  splits:
  - name: mnli_m
    num_bytes: 2055427
    num_examples: 9815
  - name: mnli_mm
    num_bytes: 2181179
    num_examples: 9832
  - name: fevernli
    num_bytes: 7532028
    num_examples: 19652
  - name: anli_r1
    num_bytes: 433064
    num_examples: 1000
  - name: anli_r2
    num_bytes: 432927
    num_examples: 1000
  - name: anli_r3
    num_bytes: 501290
    num_examples: 1200
  - name: wanli
    num_bytes: 940472
    num_examples: 5000
  - name: lingnli
    num_bytes: 1078241
    num_examples: 4893
  - name: wellformedquery
    num_bytes: 815799
    num_examples: 5934
  - name: rottentomatoes
    num_bytes: 493664
    num_examples: 2132
  - name: amazonpolarity
    num_bytes: 10798222
    num_examples: 20000
  - name: imdb
    num_bytes: 27862150
    num_examples: 20000
  - name: yelpreviews
    num_bytes: 15688830
    num_examples: 20000
  - name: hatexplain
    num_bytes: 710204
    num_examples: 2922
  - name: massive
    num_bytes: 23911774
    num_examples: 175466
  - name: banking77
    num_bytes: 40018400
    num_examples: 221760
  - name: emotiondair
    num_bytes: 2202560
    num_examples: 12000
  - name: emocontext
    num_bytes: 3575972
    num_examples: 22036
  - name: empathetic
    num_bytes: 52139926
    num_examples: 81344
  - name: agnews
    num_bytes: 9630696
    num_examples: 30400
  - name: yahootopics
    num_bytes: 343270530
    num_examples: 500000
  - name: biasframes_sex
    num_bytes: 1830030
    num_examples: 8808
  - name: biasframes_offensive
    num_bytes: 1785704
    num_examples: 7676
  - name: biasframes_intent
    num_bytes: 1592094
    num_examples: 7296
  - name: financialphrasebank
    num_bytes: 514854
    num_examples: 2070
  - name: appreviews
    num_bytes: 2414054
    num_examples: 8000
  - name: hateoffensive
    num_bytes: 493480
    num_examples: 2586
  - name: trueteacher
    num_bytes: 24821652
    num_examples: 17910
  - name: spam
    num_bytes: 292810
    num_examples: 2070
  - name: wikitoxic_toxicaggregated
    num_bytes: 9026954
    num_examples: 20000
  - name: wikitoxic_obscene
    num_bytes: 7951550
    num_examples: 17382
  - name: wikitoxic_identityhate
    num_bytes: 5734460
    num_examples: 11424
  - name: wikitoxic_threat
    num_bytes: 5174652
    num_examples: 10422
  - name: wikitoxic_insult
    num_bytes: 7364528
    num_examples: 16854
  - name: manifesto
    num_bytes: 417565056
    num_examples: 953008
  - name: capsotu
    num_bytes: 24646828
    num_examples: 70455
  download_size: 10536386
  dataset_size: 1057482061
---
# Dataset Card for "dataset_test_disaggregated_nli"
Dataset for testing a universal classifier. Additional information and training code are available at https://github.com/MoritzLaurer/zeroshot-classifier
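Because the splits are named after tasks rather than the usual train/test, one evaluation set is selected by its split name. A minimal sketch, assuming the Hugging Face `datasets` library is installed and the Hub is reachable; the repository id and split names come from the card's front matter, while `load_task_split` is a hypothetical helper name introduced here for illustration:

```python
REPO_ID = "MoritzLaurer/dataset_test_disaggregated_nli"


def load_task_split(split_name, repo_id=REPO_ID):
    """Load one task-specific split by name (hypothetical helper).

    `split_name` is one of the split names listed in the card,
    for example "anli_r1" or "agnews".
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split_name)
```

For example, `load_task_split("anli_r1")` should return the 1000-example `anli_r1` split with the columns `text`, `hypothesis`, `labels`, `task_name`, and `label_text`.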
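Provider:
MoritzLaurer
Summary of the original information
Dataset overview
Dataset configuration
- Default configuration: contains multiple data files, one per data split.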
Data files (split: path pattern)
- mnli_m: data/mnli_m-*
- mnli_mm: data/mnli_mm-*
- fevernli: data/fevernli-*
- anli_r1: data/anli_r1-*
- anli_r2: data/anli_r2-*
- anli_r3: data/anli_r3-*
- wanli: data/wanli-*
- lingnli: data/lingnli-*
- wellformedquery: data/wellformedquery-*
- rottentomatoes: data/rottentomatoes-*
- amazonpolarity: data/amazonpolarity-*
- imdb: data/imdb-*
- yelpreviews: data/yelpreviews-*
- hatexplain: data/hatexplain-*
- massive: data/massive-*
- banking77: data/banking77-*
- emotiondair: data/emotiondair-*
- emocontext: data/emocontext-*
- empathetic: data/empathetic-*
- agnews: data/agnews-*
- yahootopics: data/yahootopics-*
- biasframes_sex: data/biasframes_sex-*
- biasframes_offensive: data/biasframes_offensive-*
- biasframes_intent: data/biasframes_intent-*
- financialphrasebank: data/financialphrasebank-*
- appreviews: data/appreviews-*
- hateoffensive: data/hateoffensive-*
- trueteacher: data/trueteacher-*
- spam: data/spam-*
- wikitoxic_toxicaggregated: data/wikitoxic_toxicaggregated-*
- wikitoxic_obscene: data/wikitoxic_obscene-*
- wikitoxic_identityhate: data/wikitoxic_identityhate-*
- wikitoxic_threat: data/wikitoxic_threat-*
- wikitoxic_insult: data/wikitoxic_insult-*
- manifesto: data/manifesto-*
- capsotu: data/capsotu-*
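Each split is bound to its files by a glob pattern. The sketch below uses Python's `fnmatch` to illustrate how such patterns select files. The shard file names are hypothetical (the card does not list actual file names), and `fnmatch` only approximates whatever matching the Hub applies. It also shows that the trailing hyphen keeps `data/mnli_m-*` from picking up `mnli_mm` files.

```python
from fnmatch import fnmatchcase

# Patterns copied from the card's data_files section.
PATTERNS = {
    "mnli_m": "data/mnli_m-*",
    "mnli_mm": "data/mnli_mm-*",
    "wikitoxic_threat": "data/wikitoxic_threat-*",
}

# Hypothetical shard names, for illustration only.
FILES = [
    "data/mnli_m-00000-of-00001.parquet",
    "data/mnli_mm-00000-of-00001.parquet",
    "data/wikitoxic_threat-00000-of-00001.parquet",
]


def files_for_split(split, files=FILES, patterns=PATTERNS):
    """Return the files whose path matches the split's glob pattern."""
    pattern = patterns[split]
    return [f for f in files if fnmatchcase(f, pattern)]
```

With these inputs, `files_for_split("mnli_m")` selects only the `mnli_m` shard, because the character after `mnli_m` must be a hyphen.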
Dataset information
Features
- text: string
- hypothesis: string
- labels: class_label with two classes, entailment (0) and not_entailment (1)
- task_name: string
- label_text: string
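Each row pairs a text with one hypothesis and a binary label. As a sketch of how rows in this shape could be consumed, the code below decodes the integer label and picks the best class for one text from per-hypothesis entailment scores. The label ids come from the card; the example row, the hypotheses, the scores, and the helper names are all invented for illustration, and the argmax step is one common zero-shot convention, not something the card specifies.

```python
# Label ids and names as declared in the card's class_label feature.
LABEL_NAMES = {0: "entailment", 1: "not_entailment"}


def decode_label(label_id):
    """Map the integer `labels` value to its class name."""
    return LABEL_NAMES[label_id]


def pick_class(entailment_scores):
    """Return the hypothesis with the highest entailment score.

    `entailment_scores` maps hypothesis text -> score for one input text.
    """
    return max(entailment_scores, key=entailment_scores.get)


# Hypothetical row in the card's schema (all values invented).
row = {
    "text": "The battery died after two days.",
    "hypothesis": "The review is negative.",
    "labels": 0,
    "task_name": "appreviews",
    "label_text": "negative",
}

# Hypothetical model scores for the same text against two hypotheses.
scores = {"The review is negative.": 0.93, "The review is positive.": 0.04}

print(decode_label(row["labels"]))
print(pick_class(scores))
```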
Splits
- mnli_m: 2055427 bytes, 9815 examples
- mnli_mm: 2181179 bytes, 9832 examples
- fevernli: 7532028 bytes, 19652 examples
- anli_r1: 433064 bytes, 1000 examples
- anli_r2: 432927 bytes, 1000 examples
- anli_r3: 501290 bytes, 1200 examples
- wanli: 940472 bytes, 5000 examples
- lingnli: 1078241 bytes, 4893 examples
- wellformedquery: 815799 bytes, 5934 examples
- rottentomatoes: 493664 bytes, 2132 examples
- amazonpolarity: 10798222 bytes, 20000 examples
- imdb: 27862150 bytes, 20000 examples
- yelpreviews: 15688830 bytes, 20000 examples
- hatexplain: 710204 bytes, 2922 examples
- massive: 23911774 bytes, 175466 examples
- banking77: 40018400 bytes, 221760 examples
- emotiondair: 2202560 bytes, 12000 examples
- emocontext: 3575972 bytes, 22036 examples
- empathetic: 52139926 bytes, 81344 examples
- agnews: 9630696 bytes, 30400 examples
- yahootopics: 343270530 bytes, 500000 examples
- biasframes_sex: 1830030 bytes, 8808 examples
- biasframes_offensive: 1785704 bytes, 7676 examples
- biasframes_intent: 1592094 bytes, 7296 examples
- financialphrasebank: 514854 bytes, 2070 examples
- appreviews: 2414054 bytes, 8000 examples
- hateoffensive: 493480 bytes, 2586 examples
- trueteacher: 24821652 bytes, 17910 examples
- spam: 292810 bytes, 2070 examples
- wikitoxic_toxicaggregated: 9026954 bytes, 20000 examples
- wikitoxic_obscene: 7951550 bytes, 17382 examples
- wikitoxic_identityhate: 5734460 bytes, 11424 examples
- wikitoxic_threat: 5174652 bytes, 10422 examples
- wikitoxic_insult: 7364528 bytes, 16854 examples
- manifesto: 417565056 bytes, 953008 examples
- capsotu: 24646828 bytes, 70455 examples
Dataset size
- Download size: 10536386 bytes
- Dataset size: 1057482061 bytes
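As a quick consistency check on these figures, the per-split byte counts can be summed and compared with the reported dataset size. A small sketch; the numbers are copied from the split list, and `SPLITS`, `REPORTED_DATASET_SIZE`, and `summarize` are names introduced here for illustration:

```python
# (num_bytes, num_examples) per split, copied from the split list.
SPLITS = {
    "mnli_m": (2055427, 9815),
    "mnli_mm": (2181179, 9832),
    "fevernli": (7532028, 19652),
    "anli_r1": (433064, 1000),
    "anli_r2": (432927, 1000),
    "anli_r3": (501290, 1200),
    "wanli": (940472, 5000),
    "lingnli": (1078241, 4893),
    "wellformedquery": (815799, 5934),
    "rottentomatoes": (493664, 2132),
    "amazonpolarity": (10798222, 20000),
    "imdb": (27862150, 20000),
    "yelpreviews": (15688830, 20000),
    "hatexplain": (710204, 2922),
    "massive": (23911774, 175466),
    "banking77": (40018400, 221760),
    "emotiondair": (2202560, 12000),
    "emocontext": (3575972, 22036),
    "empathetic": (52139926, 81344),
    "agnews": (9630696, 30400),
    "yahootopics": (343270530, 500000),
    "biasframes_sex": (1830030, 8808),
    "biasframes_offensive": (1785704, 7676),
    "biasframes_intent": (1592094, 7296),
    "financialphrasebank": (514854, 2070),
    "appreviews": (2414054, 8000),
    "hateoffensive": (493480, 2586),
    "trueteacher": (24821652, 17910),
    "spam": (292810, 2070),
    "wikitoxic_toxicaggregated": (9026954, 20000),
    "wikitoxic_obscene": (7951550, 17382),
    "wikitoxic_identityhate": (5734460, 11424),
    "wikitoxic_threat": (5174652, 10422),
    "wikitoxic_insult": (7364528, 16854),
    "manifesto": (417565056, 953008),
    "capsotu": (24646828, 70455),
}
REPORTED_DATASET_SIZE = 1057482061  # bytes, as reported in the card


def summarize(splits=SPLITS):
    """Return (total bytes, total examples, largest split by examples)."""
    total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
    total_examples = sum(num_examples for _, num_examples in splits.values())
    largest = max(splits, key=lambda name: splits[name][1])
    return total_bytes, total_examples, largest


total_bytes, total_examples, largest = summarize()
print(total_bytes == REPORTED_DATASET_SIZE, total_examples, largest)
```

With these figures the byte total equals the reported dataset size of 1057482061, and `manifesto` is the largest split by example count.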



