2030NLP/SpaCE2021
Source: Hugging Face. Updated 2023-04-03; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/2030NLP/SpaCE2021
Resource description:
---
language:
- zh
task_categories:
- text-classification
# - feature-extraction
task_ids:
# - d
- acceptability-classification
- natural-language-inference
license: cc-by-nc-sa-4.0
pretty_name: space21
size_categories:
- 10K<n<100K
annotations_creators:
- crowdsourced
- expert-generated
- machine-generated
source_datasets:
- ccl
dataset_info:
- config_name: task1
  features:
  - name: qID
    dtype: string
  - name: context
    dtype: string
  - name: judge1
    dtype: bool
  splits:
  - name: train
    num_bytes: 1470413
    num_examples: 4237
  - name: validation
    num_bytes: 321061
    num_examples: 806
  - name: test
    num_bytes: 263854
    num_examples: 794
  download_size: 2373041
  dataset_size: 2055328
- config_name: task2
  features:
  - name: qID
    dtype: string
  - name: context
    dtype: string
  - name: reason
    dtype: string
  - name: judge2
    dtype: bool
  splits:
  - name: train
    num_bytes: 2586476
    num_examples: 5989
  - name: validation
    num_bytes: 712348
    num_examples: 2088
  - name: test
    num_bytes: 773393
    num_examples: 1952
  download_size: 4607294
  dataset_size: 4072217
- config_name: task3
  features:
  - name: qID
    dtype: string
  - name: context
    dtype: string
  - name: reason
    dtype: string
  - name: judge1
    dtype: bool
  - name: judge2
    dtype: bool
  splits:
  - name: validation
    num_bytes: 539209
    num_examples: 1203
  - name: test
    num_bytes: 445760
    num_examples: 1167
  download_size: 1110504
  dataset_size: 984969
---
# Dataset Card for SpaCE2021
## Dataset Description
- **Homepage:** http://ccl.pku.edu.cn:8084/SpaCE2021/
- **Repository:** https://github.com/2030NLP/SpaCE2021
- **Paper:** [詹卫东、孙春晖、岳朋雪、唐乾桐、秦梓巍,2022,空间语义理解能力评测任务设计的新思路——SpaCE2021数据集的研制,《语言文字应用》2022年第2期(总第122期),pp.99-110。](https://yyyy.cbpt.cnki.net/WKC/WebPublication/paperDigest.aspx?paperID=c66cca51-7783-430e-abf1-28f6c28c49f6)
- **Leaderboard:** https://github.com/2030NLP/SpaCE2021
- **Point of Contact:** sc_eval@163.com
### Dataset Summary
SpaCE2021 is a Chinese dataset for evaluating spatial semantic understanding. It provides three configurations (`task1`, `task2`, `task3`). Every example pairs a `context` string with one or two boolean judgments (`judge1`, `judge2`); `task2` and `task3` also carry a `reason` string. The paper listed above describes the task design.
### Supported Tasks and Leaderboards
[More Information Needed]
### Languages
Chinese
## Dataset Structure
### Data Instances
[More Information Needed]
### Data Fields
[More Information Needed]
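
The card metadata declares the feature names and dtypes for each configuration. As a minimal sketch, those declarations can be turned into a schema check for individual records. The mapping of `string` to `str` and `bool` to `bool` follows the metadata; the helper name `validate_record` and the sample record are hypothetical and only serve as illustration.

```python
from typing import Any, Dict

# Feature names and types per config, copied from the card metadata.
# dtype "string" is mapped to str, dtype "bool" to bool.
FEATURES: Dict[str, Dict[str, type]] = {
    "task1": {"qID": str, "context": str, "judge1": bool},
    "task2": {"qID": str, "context": str, "reason": str, "judge2": bool},
    "task3": {
        "qID": str,
        "context": str,
        "reason": str,
        "judge1": bool,
        "judge2": bool,
    },
}


def validate_record(config: str, record: Dict[str, Any]) -> None:
    """Raise ValueError if `record` does not match the schema of `config`."""
    schema = FEATURES[config]
    missing = schema.keys() - record.keys()
    unexpected = record.keys() - schema.keys()
    if missing or unexpected:
        raise ValueError(
            f"missing={sorted(missing)}, unexpected={sorted(unexpected)}"
        )
    for name, expected in schema.items():
        if not isinstance(record[name], expected):
            raise ValueError(
                f"{name}: expected {expected.__name__}, "
                f"got {type(record[name]).__name__}"
            )


# Hypothetical record: the qID and text are placeholders, not real data.
example = {"qID": "example-0", "context": "placeholder sentence", "judge1": True}
validate_record("task1", example)  # passes silently
```

The same record fails against `task2`, because that configuration expects `reason` and `judge2` rather than `judge1`.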
### Data Splits
[More Information Needed]
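
The split statistics in the card metadata can be cross-checked with simple arithmetic. The sketch below sums examples and bytes per configuration and compares the byte totals with the declared `dataset_size` values. All numbers are copied from the metadata; the variable names are illustrative. The grand total simply adds the three configurations and does not account for examples that might be shared between them.

```python
# Split statistics copied from the card metadata: (num_examples, num_bytes).
SPLITS = {
    "task1": {
        "train": (4237, 1470413),
        "validation": (806, 321061),
        "test": (794, 263854),
    },
    "task2": {
        "train": (5989, 2586476),
        "validation": (2088, 712348),
        "test": (1952, 773393),
    },
    "task3": {
        "validation": (1203, 539209),
        "test": (1167, 445760),
    },
}

# dataset_size values declared per config in the same metadata.
DATASET_SIZE = {"task1": 2055328, "task2": 4072217, "task3": 984969}

examples_per_config = {
    config: sum(n for n, _ in splits.values())
    for config, splits in SPLITS.items()
}
bytes_per_config = {
    config: sum(b for _, b in splits.values())
    for config, splits in SPLITS.items()
}
total_examples = sum(examples_per_config.values())

for config in SPLITS:
    matches = bytes_per_config[config] == DATASET_SIZE[config]
    print(config, examples_per_config[config], bytes_per_config[config], matches)
print("total examples:", total_examples)
```

The per-split byte counts add up exactly to each declared `dataset_size`, and the combined 18236 examples fall inside the declared `10K<n<100K` size category. Note that `task3` has no training split.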
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
The card metadata declares the license as `cc-by-nc-sa-4.0`.
### Citation Information
[More Information Needed]
### Contributions
[More Information Needed]
Provider: 2030NLP
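Summary of original information
Dataset overview
Basic information
- Language: Chinese
- Task categories:
  - text-classification
- Task IDs:
  - acceptability-classification
  - natural-language-inference
- License: cc-by-nc-sa-4.0
- Dataset name: space21
- Size category: 10K<n<100K
- Annotation creators:
  - crowdsourced
  - expert-generated
  - machine-generated
- Source dataset: ccl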
Dataset configurations
- task1
  - Features:
    - qID: string
    - context: string
    - judge1: boolean
  - Splits:
    - Train: 4237 examples, 1470413 bytes
    - Validation: 806 examples, 321061 bytes
    - Test: 794 examples, 263854 bytes
  - Download size: 2373041 bytes
  - Dataset size: 2055328 bytes
- task2
  - Features:
    - qID: string
    - context: string
    - reason: string
    - judge2: boolean
  - Splits:
    - Train: 5989 examples, 2586476 bytes
    - Validation: 2088 examples, 712348 bytes
    - Test: 1952 examples, 773393 bytes
  - Download size: 4607294 bytes
  - Dataset size: 4072217 bytes
- task3
  - Features:
    - qID: string
    - context: string
    - reason: string
    - judge1: boolean
    - judge2: boolean
  - Splits:
    - Validation: 1203 examples, 539209 bytes
    - Test: 1167 examples, 445760 bytes
  - Download size: 1110504 bytes
  - Dataset size: 984969 bytes



