chrishuber/kaggle_mnli

Name: chrishuber/kaggle_mnli
Creator: chrishuber
Published: 2022-04-23 19:19:52
License: No description available

Source: Hugging Face. Updated 2022-04-23; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/chrishuber/kaggle_mnli

Resource description:

# Dataset Card for [Kaggle MNLI]

## Table of Contents
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** https://www.kaggle.com/c/multinli-matched-open-evaluation
- **Repository:** chrishuber/roberta-retrained-mlni
- **Paper:** Inference Detection in NLP Using the MultiNLI and SNLI Datasets
- **Leaderboard:** 8
- **Point of Contact:** chrish@sfsu.edu

### Dataset Summary

[These are the datasets posted to Kaggle for an inference detection NLP competition. They were moved here for use with PyTorch.]

### Supported Tasks and Leaderboards

Provides train and validation data for sentence pairs with inference labels.

[https://www.kaggle.com/competitions/multinli-matched-open-evaluation/leaderboard]
[https://www.kaggle.com/competitions/multinli-mismatched-open-evaluation/leaderboard]

### Languages

[JSON, Python]

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

[More Information Needed]

### Data Splits

[More Information Needed]

## Dataset Creation

### Curation Rationale

[Reposted from https://www.kaggle.com/c/multinli-matched-open-evaluation and https://www.kaggle.com/c/multinli-mismatched-open-evaluation]

### Source Data

#### Initial Data Collection and Normalization

[Please see the article at https://arxiv.org/abs/1704.05426, which discusses the creation of the MNLI dataset.]

#### Who are the source language producers?

[Please see the article at https://arxiv.org/abs/1704.05426, which discusses the creation of the MNLI dataset.]

### Annotations

#### Annotation process

[Crowdsourcing using Mechanical Turk.]

#### Who are the annotators?

[Mechanical Turk users.]

### Personal and Sensitive Information

[None.]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[Kaggle]

### Licensing Information

[More Information Needed]

### Citation Information

[https://www.kaggle.com/c/multinli-matched-open-evaluation]
[https://www.kaggle.com/c/multinli-mismatched-open-evaluation]

### Contributions

Thanks to [@github-username](https://github.com/<github-username>) for adding this dataset.

Provider:

chrishuber

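Summary of the original information

Dataset Card for [Kaggle MNLI]

Dataset Description

Dataset Summary

These datasets were posted to Kaggle for an inference detection competition in natural language processing. They were moved here for use with PyTorch.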

Supported Tasks and Leaderboards

Provides train and validation data for sentence pairs with inference labels.
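
Since the card leaves the data fields undocumented, the following is only a minimal sketch of how sentence pairs with inference labels might be prepared for a PyTorch model. It assumes JSON Lines records with the keys `sentence1`, `sentence2`, and `gold_label`, borrowed from the original MultiNLI release; the three-way label set, the integer ids, and the helper name `parse_pairs` are likewise assumptions for illustration, not something the card confirms. The sample records are made up.

```python
import json

# Assumed three-way NLI label set; the id order is an arbitrary choice here.
LABEL_TO_ID = {"entailment": 0, "neutral": 1, "contradiction": 2}


def parse_pairs(lines, premise_key="sentence1", hypothesis_key="sentence2",
                label_key="gold_label"):
    """Turn JSON Lines records into (premise, hypothesis, label_id) tuples.

    The key names are assumptions taken from the original MultiNLI release.
    Records whose label is missing or not in LABEL_TO_ID are skipped.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        label = record.get(label_key)
        if label not in LABEL_TO_ID:
            continue  # e.g. "-" for pairs without a consensus label
        pairs.append((record[premise_key], record[hypothesis_key],
                      LABEL_TO_ID[label]))
    return pairs


# Hypothetical records, invented for illustration only.
sample = [
    '{"sentence1": "A man is playing a guitar.", '
    '"sentence2": "A person is making music.", "gold_label": "entailment"}',
    '{"sentence1": "A man is playing a guitar.", '
    '"sentence2": "The man is asleep.", "gold_label": "contradiction"}',
    '{"sentence1": "A man is playing a guitar.", '
    '"sentence2": "He is on stage.", "gold_label": "-"}',
]
pairs = parse_pairs(sample)
```

The resulting tuples can be handed to a tokenizer and wrapped in a dataset class of your choice. If the actual files use different keys, pass them through the keyword arguments.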

Languages

[JSON, Python]

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

[More Information Needed]

Data Splits

[More Information Needed]

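Dataset Creation

Curation Rationale

Reposted from https://www.kaggle.com/c/multinli-matched-open-evaluation and https://www.kaggle.com/c/multinli-mismatched-open-evaluation

Source Data

Initial Data Collection and Normalization

Please see the article at https://arxiv.org/abs/1704.05426, which discusses the creation of the MNLI dataset.

Who are the source language producers?

Please see the article at https://arxiv.org/abs/1704.05426, which discusses the creation of the MNLI dataset.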

Annotations

Annotation process

Crowdsourcing using Mechanical Turk.

Who are the annotators?

Mechanical Turk users.

Personal and Sensitive Information

[None]

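Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]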

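Additional Information

Dataset Curators

[Kaggle]

Licensing Information

[More Information Needed]

Citation Information

https://www.kaggle.com/c/multinli-matched-open-evaluation
https://www.kaggle.com/c/multinli-mismatched-open-evaluation

Contributions

Thanks to @github-username for adding this dataset.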
