FINNUMBER/BQA_ORIGINAL
Source: Hugging Face. Updated 2023-12-25. Indexed 2024-06-22.
Download link:
https://hf-mirror.com/datasets/FINNUMBER/BQA_ORIGINAL
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: doc_id
    dtype: string
  - name: doc_title
    dtype: string
  - name: doc_source
    dtype: string
  - name: doc_published
    dtype: int64
  - name: created
    dtype: string
  - name: doc_class
    struct:
    - name: class
      dtype: string
    - name: code
      dtype: string
  - name: paragraphs
    list:
    - name: context
      dtype: string
    - name: context_id
      dtype: string
    - name: qas
      list:
      - name: answer
        struct:
        - name: answer_end
          dtype: 'null'
        - name: answer_start
          dtype: 'null'
        - name: cell_coordinates
          dtype: 'null'
        - name: cell_text
          dtype: 'null'
        - name: clue_start
          dtype: int64
        - name: clue_text
          dtype: string
        - name: options
          dtype: 'null'
        - name: source
          dtype: string
        - name: text
          dtype: string
      - name: qa_type
        dtype: int64
      - name: question
        dtype: string
      - name: question_id
        dtype: string
    - name: tbs
      dtype: 'null'
  splits:
  - name: train
    num_bytes: 19830557
    num_examples: 6838
  - name: test
    num_bytes: 399014
    num_examples: 118
  download_size: 8040601
  dataset_size: 20229571
---
# Dataset Card for "BQA_ORIGINAL"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The dataset named BQA_ORIGINAL includes training and test sets. Each document has multiple features such as document ID, title, source, publication time, creation time, document class (including class name and code), and a list of paragraphs. The paragraph list contains context, context ID, questions, and answer sets, which detail multiple attributes of the answers. The dataset provides the size and number of examples for both the training and test sets.
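As a minimal sketch of how the two splits might be loaded, the snippet below wraps `datasets.load_dataset` from the Hugging Face `datasets` library. The helper name `load_bqa_split` and the constants are illustrative assumptions, not part of the dataset card, and the call needs network access plus the `datasets` package.

```python
REPO_ID = "FINNUMBER/BQA_ORIGINAL"  # repository name taken from the page title
SPLITS = ("train", "test")          # the two splits declared in the card metadata


def load_bqa_split(split: str = "train"):
    """Load one split from the Hugging Face Hub (hypothetical helper)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_bqa_split("test")` should return a `Dataset` object; according to the card metadata the test split holds 118 examples and the train split 6838.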
Provider:
FINNUMBER
Summary of original information
Dataset overview
Configuration
- Default configuration:
  - Training data: path is data/train-*
  - Test data: path is data/test-*
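Data features
- Document ID (doc_id): string
- Document title (doc_title): string
- Document source (doc_source): string
- Document publication time (doc_published): 64-bit integer
- Creation time (created): string
- Document class (doc_class):
  - Class (class): string
  - Code (code): string
- Paragraphs (paragraphs):
  - Context (context): string
  - Context ID (context_id): string
  - Questions and answers (qas):
    - Answer (answer):
      - Answer end position (answer_end): null
      - Answer start position (answer_start): null
      - Cell coordinates (cell_coordinates): null
      - Cell text (cell_text): null
      - Clue start position (clue_start): 64-bit integer
      - Clue text (clue_text): string
      - Options (options): null
      - Source (source): string
      - Text (text): string
    - Question type (qa_type): 64-bit integer
    - Question (question): string
    - Question ID (question_id): string
  - Tables (tbs): null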
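The nested layout above (document, then paragraphs, then question-answer pairs) can be flattened into one row per question. The following is a sketch under the assumption that `qas` sits inside each paragraph and that `text` and `clue_text` sit inside `answer`; the function name `iter_qa_rows` and the sample record are invented for illustration and do not come from the dataset.

```python
from typing import Any, Dict, Iterator


def iter_qa_rows(doc: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per question in a nested document record."""
    for para in doc.get("paragraphs") or []:
        for qa in para.get("qas") or []:
            answer = qa.get("answer") or {}
            yield {
                "doc_id": doc.get("doc_id"),
                "context_id": para.get("context_id"),
                "question_id": qa.get("question_id"),
                "qa_type": qa.get("qa_type"),
                "question": qa.get("question"),
                "answer_text": answer.get("text"),
                "clue_text": answer.get("clue_text"),
            }


# Invented placeholder record; field values are NOT real dataset content.
sample_doc = {
    "doc_id": "doc-0",
    "paragraphs": [
        {
            "context": "placeholder context",
            "context_id": "ctx-0",
            "qas": [
                {
                    "question_id": "q-0",
                    "qa_type": 0,
                    "question": "placeholder question?",
                    "answer": {"text": "yes", "clue_text": "placeholder clue"},
                },
                {
                    "question_id": "q-1",
                    "qa_type": 0,
                    "question": "another placeholder?",
                    "answer": None,  # a missing answer is tolerated
                },
            ],
            "tbs": None,
        }
    ],
}

rows = list(iter_qa_rows(sample_doc))
```

Flattening this way keeps the `doc_id` and `context_id` keys on every row, so a question can still be joined back to its paragraph context when needed.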
Data splits
- Training set:
  - Bytes: 19830557
  - Examples: 6838
- Test set:
  - Bytes: 399014
  - Examples: 118
Dataset size
- Download size: 8040601 bytes
- Dataset size: 20229571 bytes
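The size figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers listed above; the variable names are illustrative. It shows that the two split sizes sum to the reported dataset size, that the test split is roughly 1.7% of all examples, and that the download is roughly 40% of the dataset size.

```python
# Figures copied from the split and size metadata above.
train_bytes, test_bytes = 19_830_557, 399_014
train_examples, test_examples = 6_838, 118
download_size, dataset_size = 8_040_601, 20_229_571

# The per-split byte counts should add up to the reported dataset size.
total_bytes = train_bytes + test_bytes

# Share of examples held out for testing.
test_fraction = test_examples / (train_examples + test_examples)

# Ratio of the download size to the reported dataset size.
download_ratio = download_size / dataset_size

print(total_bytes == dataset_size)
print(f"test fraction: {test_fraction:.4f}")
print(f"download ratio: {download_ratio:.4f}")
```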



