nasa-impact/nasa-smd-qa-benchmark
Source: Hugging Face. Updated: 2024-10-11. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/nasa-impact/nasa-smd-qa-benchmark
Resource description:
---
license: cc
task_categories:
- question-answering
language:
- en
tags:
- climate
- chemistry
- biology
- earth science
pretty_name: NASA-QA
---
# NASA-QA Benchmark
NASA SMD and IBM Research developed the **NASA-QA** benchmark, an extractive question-answering task focused on the Earth science domain. First, 39 paragraphs were sourced from Earth science papers published in AGU and AMS journals. Subject matter experts from NASA then formulated questions and marked the corresponding answers in these paragraphs, yielding 117 question-answer pairs. The dataset is split into a training set of 90 pairs and a validation set of 27 pairs. Questions average 11 words in length and paragraphs average 150 words. The evaluation metric for this task is the F1 score, which measures the overlap between predicted and ground-truth answers.
**Evaluation Metrics**
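The card does not spell out the exact scoring script, so the following is only a minimal sketch of a token-level F1 for extractive answers. It assumes SQuAD-style normalization (lowercasing, stripping punctuation and English articles); the function names `normalize` and `token_f1` are illustrative and not part of the dataset or the Indus codebase.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; the benchmark's own script may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, ground_truth: str) -> float:
    """Harmonic mean of token precision and recall between two answer strings."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(ground_truth)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction that recovers two of three gold tokens with no extra tokens has precision 1 and recall 2/3, giving an F1 of 0.8. A dataset-level score would typically be the mean of `token_f1` over all question-answer pairs.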

**Note**
This dataset is released in support of the training and evaluation of the encoder language model ["Indus"](https://huggingface.co/nasa-impact/nasa-smd-ibm-v0.1).
The accompanying paper can be found here: https://arxiv.org/abs/2405.10725
Provider:
nasa-impact
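Original information summary
Dataset overview
Basic information
- License: cc
- Task category: question answering
- Language: English
- Tags: climate, chemistry, biology, earth science
- Name: NASA-QA
Development background
NASA-QA is an extractive question-answering task focused on the Earth science domain, developed by NASA SMD and IBM Research.
Data sources and composition
- Data source: 39 paragraphs selected from Earth science papers published in AGU and AMS journals.
- Questions and answers: NASA experts formulated the questions and marked the answers, yielding 117 question-answer pairs.
- Dataset split: 90 pairs in the training set and 27 pairs in the validation set.
- Question and paragraph length: questions average 11 words; paragraphs average 150 words.
Evaluation metrics
- Evaluation method: F1 score, measuring the overlap between predicted and ground-truth answers.
Dataset uses
This dataset supports the training and evaluation of the encoder language model "Indus".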



