
emozilla/qasper-pruned-llama-gptneox-8k

Source: Hugging Face. Updated 2023-04-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/emozilla/qasper-pruned-llama-gptneox-8k
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: title
    dtype: string
  - name: abstract
    dtype: string
  - name: full_text
    sequence:
    - name: section_name
      dtype: string
    - name: paragraphs
      list: string
  - name: qas
    sequence:
    - name: question
      dtype: string
    - name: question_id
      dtype: string
    - name: nlp_background
      dtype: string
    - name: topic_background
      dtype: string
    - name: paper_read
      dtype: string
    - name: search_query
      dtype: string
    - name: question_writer
      dtype: string
    - name: answers
      sequence:
      - name: answer
        struct:
        - name: unanswerable
          dtype: bool
        - name: extractive_spans
          sequence: string
        - name: yes_no
          dtype: bool
        - name: free_form_answer
          dtype: string
        - name: evidence
          sequence: string
        - name: highlighted_evidence
          sequence: string
      - name: annotation_id
        dtype: string
      - name: worker_id
        dtype: string
  - name: figures_and_tables
    sequence:
    - name: caption
      dtype: string
    - name: file
      dtype: string
  splits:
  - name: train
    num_bytes: 24427288.12162162
    num_examples: 762
  - name: validation
    num_bytes: 9089856.918149466
    num_examples: 258
  - name: test
    num_bytes: 13925108.735576924
    num_examples: 374
  download_size: 20505240
  dataset_size: 47442253.77534801
---
# Dataset Card for "emozillaqasper-pruned-llama-gptneox-8k"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
emozilla
Summary of original information

Dataset overview

Dataset features

  • id: string
  • title: string
  • abstract: string
  • full_text: sequence containing:
    • section_name: string
    • paragraphs: list of strings
  • qas: sequence containing:
    • question: string
    • question_id: string
    • nlp_background: string
    • topic_background: string
    • paper_read: string
    • search_query: string
    • question_writer: string
    • answers: sequence containing:
      • answer: struct containing:
        • unanswerable: bool
        • extractive_spans: sequence of strings
        • yes_no: bool
        • free_form_answer: string
        • evidence: sequence of strings
        • highlighted_evidence: sequence of strings
      • annotation_id: string
      • worker_id: string
  • figures_and_tables: sequence containing:
    • caption: string
    • file: string
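
The nested `qas` → `answers` → `answer` layout can be awkward to traverse, so here is a minimal sketch of walking one record. It assumes the layout the Hugging Face `datasets` library typically produces for a sequence of named sub-features, namely a dictionary of parallel lists; that assumption has not been checked against this particular dataset. The helper names `summarize_answer` and `iter_qa_pairs` are hypothetical, and `example_record` is made up for illustration and omits several `qas` fields for brevity.

```python
from typing import Any, Dict, Iterator, Tuple


def summarize_answer(answer: Dict[str, Any]) -> str:
    """Reduce one `answer` struct to a single display string."""
    if answer["unanswerable"]:
        return "<unanswerable>"
    if answer["extractive_spans"]:
        return " | ".join(answer["extractive_spans"])
    if answer["free_form_answer"]:
        return answer["free_form_answer"]
    # Checked last: an unset yes_no may be stored as False or None (assumption).
    return "yes" if answer["yes_no"] else "no"


def iter_qa_pairs(record: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (question, answer summary) pairs from one record."""
    qas = record["qas"]
    # Assumed dict-of-lists layout: entries are aligned by question index.
    for question, answers in zip(qas["question"], qas["answers"]):
        for answer in answers["answer"]:
            yield question, summarize_answer(answer)


def make_answer(**overrides: Any) -> Dict[str, Any]:
    """Build a placeholder `answer` struct with every field present."""
    base = {
        "unanswerable": False,
        "extractive_spans": [],
        "yes_no": None,
        "free_form_answer": "",
        "evidence": [],
        "highlighted_evidence": [],
    }
    base.update(overrides)
    return base


# Made-up record for illustration only; not taken from the dataset.
example_record = {
    "id": "example-0",
    "title": "A made-up paper",
    "abstract": "Placeholder abstract.",
    "full_text": {
        "section_name": ["Introduction"],
        "paragraphs": [["Placeholder paragraph."]],
    },
    "qas": {
        "question": ["Which corpus is used?", "Is the code released?"],
        "question_id": ["q0", "q1"],
        "answers": [
            {
                "answer": [make_answer(extractive_spans=["a placeholder corpus"])],
                "annotation_id": ["a0"],
                "worker_id": ["w0"],
            },
            {
                "answer": [make_answer(yes_no=True)],
                "annotation_id": ["a1"],
                "worker_id": ["w1"],
            },
        ],
    },
    "figures_and_tables": {"caption": [], "file": []},
}

pairs = list(iter_qa_pairs(example_record))
for question, summary in pairs:
    print(f"{question} -> {summary}")
```

With the `datasets` library installed, real records could be obtained with `load_dataset("emozilla/qasper-pruned-llama-gptneox-8k", split="train")` and passed to `iter_qa_pairs` in the same way, provided the assumed layout holds.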

Dataset splits

  • train: 762 examples, 24427288.12162162 bytes
  • validation: 258 examples, 9089856.918149466 bytes
  • test: 374 examples, 13925108.735576924 bytes
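
To put the split counts in proportion, the short sketch below computes each split's share of the total. The variable names are illustrative; the counts are the ones listed above.

```python
# Example counts as listed in the split metadata above.
split_examples = {"train": 762, "validation": 258, "test": 374}

total = sum(split_examples.values())
fractions = {name: count / total for name, count in split_examples.items()}

for name, frac in fractions.items():
    print(f"{name}: {split_examples[name]} examples ({frac:.1%})")
print(f"total: {total} examples")
```

The three splits add up to 1394 examples, with train holding a little over half of them.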

Dataset size

  • Download size: 20505240 bytes
  • Dataset size: 47442253.77534801 bytes
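
As a quick sanity check on the figures above, the sketch below verifies that the per-split byte counts sum to the reported dataset size and converts both sizes to MiB. The variable names are illustrative, and the comparison uses a tolerance because the reported values are floating-point numbers.

```python
import math

# Byte counts as reported in the metadata above.
split_bytes = {
    "train": 24427288.12162162,
    "validation": 9089856.918149466,
    "test": 13925108.735576924,
}
dataset_size = 47442253.77534801
download_size = 20505240

# The per-split sizes should add up to the reported dataset size.
sizes_consistent = math.isclose(
    sum(split_bytes.values()), dataset_size, rel_tol=1e-9
)

MIB = 1024 ** 2
dataset_mib = dataset_size / MIB
download_mib = download_size / MIB
download_ratio = download_size / dataset_size

print(f"split sizes sum to dataset size: {sizes_consistent}")
print(f"dataset size:  {dataset_mib:.1f} MiB")
print(f"download size: {download_mib:.1f} MiB")
print(f"download / dataset ratio: {download_ratio:.2f}")
```

The download is a bit under half the size of the prepared dataset.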