genta-tech/boolq-id
Source: Hugging Face. Updated 2023-05-09; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/genta-tech/boolq-id
Resource description:
---
dataset_info:
  features:
  - name: question
    dtype: string
  - name: passage
    dtype: string
  - name: label
    dtype: int64
  splits:
  - name: train
    num_bytes: 4300375
    num_examples: 9427
  download_size: 2503993
  dataset_size: 4300375
license: cc-by-sa-4.0
task_categories:
- text-classification
- feature-extraction
language:
- id
tags:
- super_glue
- text similarity
size_categories:
- 10K<n<100K
---
# Dataset Card for "boolq-id"
This dataset is an Indonesian translation of the BoolQ dataset from the [super_glue](https://huggingface.co/datasets/super_glue) benchmark.
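The following is a minimal sketch of how the train split might be loaded and inspected with the Hugging Face `datasets` library. The column names (`question`, `passage`, `label`) come from the metadata above. The helper names (`load_boolq_id`, `label_to_answer`, `format_example`, `preview`) are hypothetical, and the mapping of `label` to yes/no is an assumption based on the original BoolQ convention, not something this card states.

```python
from typing import Any, Dict


def load_boolq_id(split: str = "train"):
    """Download one split from the Hugging Face Hub.

    Requires the `datasets` package and network access.
    """
    # Imported lazily so the pure-Python helpers below also work offline.
    from datasets import load_dataset

    return load_dataset("genta-tech/boolq-id", split=split)


def label_to_answer(label: int) -> bool:
    """Map the integer label to a boolean answer.

    Assumption: 1 means "yes" and 0 means "no", as in the original BoolQ.
    """
    if label not in (0, 1):
        raise ValueError(f"unexpected label: {label}")
    return bool(label)


def format_example(row: Dict[str, Any]) -> str:
    """Render one record using the three columns listed in the card metadata."""
    answer = "yes" if label_to_answer(row["label"]) else "no"
    return (
        f"Passage: {row['passage']}\n"
        f"Question: {row['question']}\n"
        f"Answer: {answer}"
    )


def preview(n: int = 3) -> None:
    """Print the first n training examples (downloads the data on first use)."""
    train = load_boolq_id("train")
    for i in range(min(n, train.num_rows)):
        print(format_example(train[i]))
        print("-" * 40)
```

Calling `preview()` downloads the data and prints a few formatted examples. It is worth checking the label distribution against the source BoolQ data before relying on the assumed yes/no mapping.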
# Citing & Authors
```
@inproceedings{clark2019boolq,
title={BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions},
author={Clark, Christopher and Lee, Kenton and Chang, Ming-Wei and Kwiatkowski, Tom and Collins, Michael and Toutanova, Kristina},
booktitle={NAACL},
year={2019}
}
@article{wang2019superglue,
title={SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems},
author={Wang, Alex and Pruksachatkun, Yada and Nangia, Nikita and Singh, Amanpreet and Michael, Julian and Hill, Felix and Levy, Omer and Bowman, Samuel R},
journal={arXiv preprint arXiv:1905.00537},
year={2019}
}
```
Provider:
genta-tech
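Summary of original information
Dataset overview
Dataset name
"boolq-id"
Dataset features
- question: string
- passage: string
- label: integer (int64)
Dataset splits
- train: 9,427 examples, 4,300,375 bytes in total
Dataset size
- Download size: 2,503,993 bytes
- Dataset size: 4,300,375 bytes
License
cc-by-sa-4.0
Task categories
- Text classification
- Feature extraction
Language
- Indonesian (id)
Tags
- super_glue
- text similarity
Size category
- 10K<n<100K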



