
2030NLP/SpaCE2021

Source: Hugging Face | Updated: 2023-04-03 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/2030NLP/SpaCE2021
Resource description:
---
language:
- zh
task_categories:
- text-classification
# - feature-extraction
task_ids:
# - d
- acceptability-classification
- natural-language-inference
license: cc-by-nc-sa-4.0
pretty_name: space21
size_categories:
- 10K<n<100K
annotations_creators:
- crowdsourced
- expert-generated
- machine-generated
source_datasets:
- ccl
dataset_info:
- config_name: task1
  features:
  - name: qID
    dtype: string
  - name: context
    dtype: string
  - name: judge1
    dtype: bool
  splits:
  - name: train
    num_bytes: 1470413
    num_examples: 4237
  - name: validation
    num_bytes: 321061
    num_examples: 806
  - name: test
    num_bytes: 263854
    num_examples: 794
  download_size: 2373041
  dataset_size: 2055328
- config_name: task2
  features:
  - name: qID
    dtype: string
  - name: context
    dtype: string
  - name: reason
    dtype: string
  - name: judge2
    dtype: bool
  splits:
  - name: train
    num_bytes: 2586476
    num_examples: 5989
  - name: validation
    num_bytes: 712348
    num_examples: 2088
  - name: test
    num_bytes: 773393
    num_examples: 1952
  download_size: 4607294
  dataset_size: 4072217
- config_name: task3
  features:
  - name: qID
    dtype: string
  - name: context
    dtype: string
  - name: reason
    dtype: string
  - name: judge1
    dtype: bool
  - name: judge2
    dtype: bool
  splits:
  - name: validation
    num_bytes: 539209
    num_examples: 1203
  - name: test
    num_bytes: 445760
    num_examples: 1167
  download_size: 1110504
  dataset_size: 984969
---

# Dataset Card for SpaCE2021

## Dataset Description

- **Homepage:** http://ccl.pku.edu.cn:8084/SpaCE2021/
- **Repository:** https://github.com/2030NLP/SpaCE2021
- **Paper:** [詹卫东、孙春晖、岳朋雪、唐乾桐、秦梓巍,2022,空间语义理解能力评测任务设计的新思路——SpaCE2021数据集的研制,《语言文字应用》2022年第2期(总第122期),pp.99-110。](https://yyyy.cbpt.cnki.net/WKC/WebPublication/paperDigest.aspx?paperID=c66cca51-7783-430e-abf1-28f6c28c49f6)
- **Leaderboard:** https://github.com/2030NLP/SpaCE2021
- **Point of Contact:** sc_eval@163.com

### Dataset Summary

This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).

### Supported Tasks and Leaderboards

[More Information Needed]

### Languages

Chinese

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

[More Information Needed]

### Data Splits

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[More Information Needed]

### Citation Information

[More Information Needed]

### Contributions

[More Information Needed]
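
The card's metadata declares three configurations (task1, task2, task3) with different fields and splits. A minimal sketch of loading one of them with the Hugging Face `datasets` library follows. The helper name `load_space_config` and the two lookup tables are hypothetical, introduced here for illustration; the field and split names are copied from the metadata, and the sketch assumes the `datasets` package is installed and the repository `2030NLP/SpaCE2021` is reachable.

```python
# Field and split names below are copied from the card's metadata.
CONFIG_FIELDS = {
    "task1": ["qID", "context", "judge1"],
    "task2": ["qID", "context", "reason", "judge2"],
    "task3": ["qID", "context", "reason", "judge1", "judge2"],
}

# task3 ships without a train split according to the metadata.
CONFIG_SPLITS = {
    "task1": ["train", "validation", "test"],
    "task2": ["train", "validation", "test"],
    "task3": ["validation", "test"],
}


def load_space_config(config_name, split):
    """Hypothetical helper: load one split of one SpaCE2021 configuration.

    Rejects config/split pairs that the metadata does not list before
    any network access is attempted.
    """
    if config_name not in CONFIG_FIELDS:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in CONFIG_SPLITS[config_name]:
        raise ValueError(f"config {config_name!r} has no split {split!r}")

    # Imported lazily so the lookup tables above stay usable without
    # the `datasets` package (an assumption of this sketch).
    from datasets import load_dataset

    return load_dataset("2030NLP/SpaCE2021", config_name, split=split)
```

Under these assumptions, `load_space_config("task1", "validation")` would return the validation split of task1, while `load_space_config("task3", "train")` raises `ValueError` because the metadata lists no train split for task3.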
Provider:
2030NLP
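Summary of original information

Dataset overview

Basic information

  • Language: Chinese
  • Task categories:
    • text-classification
  • Task IDs:
    • acceptability-classification
    • natural-language-inference
  • License: cc-by-nc-sa-4.0
  • Pretty name: space21
  • Size category: 10K<n<100K
  • Annotation creators:
    • crowdsourced
    • expert-generated
    • machine-generated
  • Source datasets: ccl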

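Dataset configurations

  1. task1

    • Features:
      • qID: string
      • context: string
      • judge1: bool
    • Splits:
      • train: 4237 examples, 1470413 bytes
      • validation: 806 examples, 321061 bytes
      • test: 794 examples, 263854 bytes
    • Download size: 2373041 bytes
    • Dataset size: 2055328 bytes
  2. task2

    • Features:
      • qID: string
      • context: string
      • reason: string
      • judge2: bool
    • Splits:
      • train: 5989 examples, 2586476 bytes
      • validation: 2088 examples, 712348 bytes
      • test: 1952 examples, 773393 bytes
    • Download size: 4607294 bytes
    • Dataset size: 4072217 bytes
  3. task3

    • Features:
      • qID: string
      • context: string
      • reason: string
      • judge1: bool
      • judge2: bool
    • Splits:
      • validation: 1203 examples, 539209 bytes
      • test: 1167 examples, 445760 bytes
    • Download size: 1110504 bytes
    • Dataset size: 984969 bytes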
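
The split figures above can be cross-checked with a few lines of arithmetic. The sketch below re-enters the numbers from the listing and confirms that, for each configuration, the per-split byte counts add up to the stated dataset size. It also totals the examples, on the assumption (not stated in the source) that the 10K<n<100K size category counts examples across all three configurations. The names `SPLITS`, `DATASET_SIZE`, and `summarize` are illustrative.

```python
# (num_examples, num_bytes) per split, copied from the listing.
SPLITS = {
    "task1": {
        "train": (4237, 1470413),
        "validation": (806, 321061),
        "test": (794, 263854),
    },
    "task2": {
        "train": (5989, 2586476),
        "validation": (2088, 712348),
        "test": (1952, 773393),
    },
    "task3": {
        "validation": (1203, 539209),
        "test": (1167, 445760),
    },
}

# Stated dataset size in bytes for each configuration.
DATASET_SIZE = {"task1": 2055328, "task2": 4072217, "task3": 984969}


def summarize(config):
    """Return (total examples, total bytes) over all splits of a config."""
    stats = SPLITS[config].values()
    return sum(n for n, _ in stats), sum(b for _, b in stats)


for config in SPLITS:
    examples, num_bytes = summarize(config)
    # Per-split bytes should add up to the stated dataset size.
    assert num_bytes == DATASET_SIZE[config], config
    print(f"{config}: {examples} examples, {num_bytes} bytes")

total_examples = sum(summarize(config)[0] for config in SPLITS)
print(f"all configs: {total_examples} examples")
```

With the figures as listed, the byte totals match for all three configurations, and the combined example count is 18236, which falls inside the declared size category.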