art
收藏魔搭社区2025-08-15 更新2025-05-31 收录
下载链接:
https://modelscope.cn/datasets/allenai/art
下载链接
链接失效反馈官方服务:
资源简介:
# Dataset Card for "art"
## Table of Contents
- [Dataset Description](#dataset-description)
- [Dataset Summary](#dataset-summary)
- [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
- [Languages](#languages)
- [Dataset Structure](#dataset-structure)
- [Data Instances](#data-instances)
- [Data Fields](#data-fields)
- [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
- [Curation Rationale](#curation-rationale)
- [Source Data](#source-data)
- [Annotations](#annotations)
- [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
- [Social Impact of Dataset](#social-impact-of-dataset)
- [Discussion of Biases](#discussion-of-biases)
- [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
- [Dataset Curators](#dataset-curators)
- [Licensing Information](#licensing-information)
- [Citation Information](#citation-information)
- [Contributions](#contributions)
## Dataset Description
- **Homepage:** [https://leaderboard.allenai.org/anli/submissions/get-started](https://leaderboard.allenai.org/anli/submissions/get-started)
- **Repository:** https://github.com/allenai/abductive-commonsense-reasoning
- **Paper:** [Abductive Commonsense Reasoning](https://arxiv.org/abs/1908.05739)
- **Point of Contact:** [More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
- **Size of downloaded dataset files:** 5.12 MB
- **Size of the generated dataset:** 34.36 MB
- **Total amount of disk used:** 39.48 MB
### Dataset Summary
ART consists of over 20k commonsense narrative contexts and 200k explanations.
The Abductive Natural Language Inference Dataset from AI2.
### Supported Tasks and Leaderboards
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Languages
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
## Dataset Structure
### Data Instances
#### anli
- **Size of downloaded dataset files:** 5.12 MB
- **Size of the generated dataset:** 34.36 MB
- **Total amount of disk used:** 39.48 MB
An example of 'train' looks as follows.
```
{
"hypothesis_1": "Chad's car had all sorts of other problems besides alignment.",
"hypothesis_2": "Chad's car had all sorts of benefits other than being sexy.",
"label": 1,
"observation_1": "Chad went to get the wheel alignment measured on his car.",
"observation_2": "The mechanic provided a working alignment with new body work."
}
```
### Data Fields
The data fields are the same among all splits.
#### anli
- `observation_1`: a `string` feature.
- `observation_2`: a `string` feature.
- `hypothesis_1`: a `string` feature.
- `hypothesis_2`: a `string` feature.
- `label`: a classification label, with possible values including `0` (0), `1` (1), `2` (2).
### Data Splits
|name|train |validation|
|----|-----:|---------:|
|anli|169654| 1532|
## Dataset Creation
### Curation Rationale
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
#### Who are the source language producers?
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Annotations
#### Annotation process
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
#### Who are the annotators?
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Personal and Sensitive Information
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Discussion of Biases
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Other Known Limitations
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
## Additional Information
### Dataset Curators
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Licensing Information
[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### Citation Information
```
@inproceedings{Bhagavatula2020Abductive,
title={Abductive Commonsense Reasoning},
author={Chandra Bhagavatula and Ronan Le Bras and Chaitanya Malaviya and Keisuke Sakaguchi and Ari Holtzman and Hannah Rashkin and Doug Downey and Wen-tau Yih and Yejin Choi},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=Byg1v1HKDB}
}
```
### Contributions
Thanks to [@patrickvonplaten](https://github.com/patrickvonplaten), [@thomwolf](https://github.com/thomwolf), [@mariamabarham](https://github.com/mariamabarham), [@lewtun](https://github.com/lewtun), [@lhoestq](https://github.com/lhoestq) for adding this dataset.
# "art"数据集卡片
## 目录
- [数据集描述](#数据集描述)
- [数据集概述](#数据集概述)
- [支持任务与评测榜单](#支持任务与评测榜单)
- [语言情况](#语言情况)
- [数据集结构](#数据集结构)
- [数据实例](#数据实例)
- [数据字段](#数据字段)
- [数据划分](#数据划分)
- [数据集构建](#数据集构建)
- [构建动机](#构建动机)
- [源数据](#源数据)
- [标注流程](#标注流程)
- [个人与敏感信息](#个人与敏感信息)
- [数据集使用注意事项](#数据集使用注意事项)
- [数据集的社会影响](#数据集的社会影响)
- [偏见讨论](#偏见讨论)
- [其他已知局限性](#其他已知局限性)
- [附加信息](#附加信息)
- [数据集维护者](#数据集维护者)
- [许可信息](#许可信息)
- [引用信息](#引用信息)
- [贡献致谢](#贡献致谢)
## 数据集描述
- **主页:** [https://leaderboard.allenai.org/anli/submissions/get-started](https://leaderboard.allenai.org/anli/submissions/get-started)
- **代码仓库:** https://github.com/allenai/abductive-commonsense-reasoning
- **相关论文:** [Abductive Commonsense Reasoning](https://arxiv.org/abs/1908.05739)
- **联系方式:** [需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
- **下载数据集文件大小:** 5.12 MB
- **生成后数据集大小:** 34.36 MB
- **总磁盘占用量:** 39.48 MB
### 数据集概述
ART数据集包含超过20,000条常识叙事上下文与200,000条解释文本,是由艾伦人工智能研究所(Allen Institute for AI,AI2)推出的溯因自然语言推理(Abductive Natural Language Inference, ANLI)数据集。
### 支持任务与评测榜单
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 语言情况
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
## 数据集结构
### 数据实例
#### anli
- **下载数据集文件大小:** 5.12 MB
- **生成后数据集大小:** 34.36 MB
- **总磁盘占用量:** 39.48 MB
训练集的一个示例如下:
{
"hypothesis_1": "查德的汽车除了四轮定位之外还存在各类其他故障。",
"hypothesis_2": "查德的汽车除了外观亮眼之外还有诸多优点。",
"label": 1,
"observation_1": "查德前往修理厂检测其车辆的四轮定位参数。",
"observation_2": "维修师完成了标准的四轮定位调整,并完成了全新的车身修复作业。"
}
### 数据字段
所有数据划分下的字段均保持一致。
#### anli
- `observation_1`: 字符串类型特征
- `observation_2`: 字符串类型特征
- `hypothesis_1`: 字符串类型特征
- `hypothesis_2`: 字符串类型特征
- `label`: 分类标签,可选取值包括0、1、2
### 数据划分
| 划分名称 | 训练集样本数 | 验证集样本数 |
| ------- | ----------: | ----------: |
| anli | 169654 | 1532 |
## 数据集构建
### 构建动机
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 源数据
#### 初始数据收集与标准化
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
#### 源语言生成者是谁?
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 标注流程
#### 标注过程
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
#### 标注人员构成
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 个人与敏感信息
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
## 数据集使用注意事项
### 数据集的社会影响
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 偏见讨论
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 其他已知局限性
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
## 附加信息
### 数据集维护者
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 许可信息
[需要更多信息](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
### 引用信息
@inproceedings{Bhagavatula2020Abductive,
title={Abductive Commonsense Reasoning},
author={Chandra Bhagavatula and Ronan Le Bras and Chaitanya Malaviya and Keisuke Sakaguchi and Ari Holtzman and Hannah Rashkin and Doug Downey and Wen-tau Yih and Yejin Choi},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=Byg1v1HKDB}
}
### 贡献致谢
感谢[@patrickvonplaten](https://github.com/patrickvonplaten)、[@thomwolf](https://github.com/thomwolf)、[@mariamabarham](https://github.com/mariamabarham)、[@lewtun](https://github.com/lewtun)、[@lhoestq](https://github.com/lhoestq) 为本数据集的添加工作。
提供机构:
maas
创建时间:
2025-05-29



