emozilla/qasper-pruned-llama-gptneox-8k
Source: Hugging Face. Updated: 2023-04-29. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/emozilla/qasper-pruned-llama-gptneox-8k
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: title
    dtype: string
  - name: abstract
    dtype: string
  - name: full_text
    sequence:
    - name: section_name
      dtype: string
    - name: paragraphs
      list: string
  - name: qas
    sequence:
    - name: question
      dtype: string
    - name: question_id
      dtype: string
    - name: nlp_background
      dtype: string
    - name: topic_background
      dtype: string
    - name: paper_read
      dtype: string
    - name: search_query
      dtype: string
    - name: question_writer
      dtype: string
    - name: answers
      sequence:
      - name: answer
        struct:
        - name: unanswerable
          dtype: bool
        - name: extractive_spans
          sequence: string
        - name: yes_no
          dtype: bool
        - name: free_form_answer
          dtype: string
        - name: evidence
          sequence: string
        - name: highlighted_evidence
          sequence: string
      - name: annotation_id
        dtype: string
      - name: worker_id
        dtype: string
  - name: figures_and_tables
    sequence:
    - name: caption
      dtype: string
    - name: file
      dtype: string
  splits:
  - name: train
    num_bytes: 24427288.12162162
    num_examples: 762
  - name: validation
    num_bytes: 9089856.918149466
    num_examples: 258
  - name: test
    num_bytes: 13925108.735576924
    num_examples: 374
  download_size: 20505240
  dataset_size: 47442253.77534801
---
# Dataset Card for "emozilla/qasper-pruned-llama-gptneox-8k"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
emozilla
Summary of original information
Dataset overview
Dataset features
- id: string
- title: string
- abstract: string
- full_text: sequence containing:
  - section_name: string
  - paragraphs: list of strings
- qas: sequence containing:
  - question: string
  - question_id: string
  - nlp_background: string
  - topic_background: string
  - paper_read: string
  - search_query: string
  - question_writer: string
  - answers: sequence containing:
    - answer: struct containing:
      - unanswerable: boolean
      - extractive_spans: sequence of strings
      - yes_no: boolean
      - free_form_answer: string
      - evidence: sequence of strings
      - highlighted_evidence: sequence of strings
    - annotation_id: string
    - worker_id: string
- figures_and_tables: sequence containing:
  - caption: string
  - file: string
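
To illustrate how the nested `qas` and `answers` features fit together, here is a minimal sketch that flattens one record into (question, answer) pairs. It assumes the columnar layout that the `datasets` library uses for sequence-of-dict features (a dict of lists rather than a list of dicts). The helper names `answer_text` and `iter_qa_pairs`, the fallback order between answer fields, and the sample record are all hypothetical and are not taken from the dataset card.

```python
from typing import Any, Dict, Iterator, Tuple


def answer_text(answer: Dict[str, Any]) -> str:
    """Collapse one `answer` struct into a single display string."""
    if answer["unanswerable"]:
        return "<unanswerable>"
    if answer["extractive_spans"]:
        return "; ".join(answer["extractive_spans"])
    if answer["free_form_answer"]:
        return answer["free_form_answer"]
    # Assumption: yes_no only matters when the other fields are empty.
    if answer["yes_no"] is not None:
        return "yes" if answer["yes_no"] else "no"
    return ""


def iter_qa_pairs(record: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (question, answer text) for every annotation in one record."""
    qas = record["qas"]
    # `qas` is a dict of parallel lists; `answers` holds one entry per question.
    for question, answers in zip(qas["question"], qas["answers"]):
        for answer in answers["answer"]:
            yield question, answer_text(answer)


def make_answer(**overrides: Any) -> Dict[str, Any]:
    """Build a hypothetical `answer` struct with empty defaults."""
    base = {
        "unanswerable": False,
        "extractive_spans": [],
        "yes_no": None,
        "free_form_answer": "",
        "evidence": [],
        "highlighted_evidence": [],
    }
    base.update(overrides)
    return base


# Hypothetical record with only the fields this sketch touches.
sample = {
    "id": "hypothetical-0001",
    "qas": {
        "question": ["Which corpus is used?", "Is the code released?"],
        "answers": [
            {
                "answer": [make_answer(extractive_spans=["corpus A"])],
                "annotation_id": ["a1"],
                "worker_id": ["w1"],
            },
            {
                "answer": [make_answer(yes_no=True)],
                "annotation_id": ["a2"],
                "worker_id": ["w2"],
            },
        ],
    },
}

pairs = list(iter_qa_pairs(sample))
for question, text in pairs:
    print(f"{question} -> {text}")
```

With the `datasets` package installed, a real record could presumably be obtained via `load_dataset("emozilla/qasper-pruned-llama-gptneox-8k", split="validation")[0]` and passed to `iter_qa_pairs` in place of `sample`.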
Dataset splits
- train: 762 examples, 24427288.12162162 bytes
- validation: 258 examples, 9089856.918149466 bytes
- test: 374 examples, 13925108.735576924 bytes
Dataset size
- Download size: 20505240 bytes
- Dataset size: 47442253.77534801 bytes
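
As a quick sanity check on the figures above, the following sketch totals the splits, computes each split's share of the examples, and compares the download size with the reported dataset size. The numbers are copied from the card; the variable names are illustrative only.

```python
import math

# Split statistics as listed on the dataset card.
splits = {
    "train": {"num_examples": 762, "num_bytes": 24427288.12162162},
    "validation": {"num_examples": 258, "num_bytes": 9089856.918149466},
    "test": {"num_examples": 374, "num_bytes": 13925108.735576924},
}
download_size = 20505240
dataset_size = 47442253.77534801

total_examples = sum(s["num_examples"] for s in splits.values())
total_bytes = sum(s["num_bytes"] for s in splits.values())

# Fraction of all examples that falls into each split.
shares = {name: s["num_examples"] / total_examples for name, s in splits.items()}

# Ratio of the downloaded files to the reported in-memory dataset size.
compression_ratio = download_size / dataset_size

print(f"total examples: {total_examples}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"split bytes match dataset_size: {math.isclose(total_bytes, dataset_size)}")
print(f"download / dataset size: {compression_ratio:.3f}")
```

The three splits add up to 1394 examples, and their byte counts sum to the reported dataset size.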



