nattiey1/diverse-unit-QA
Source: Hugging Face. Updated 2023-05-31; indexed 2024-03-04.
Download link: https://hf-mirror.com/datasets/nattiey1/diverse-unit-QA
Resource description:
---
task_categories:
- question-answering
size_categories:
- 100K<n<1M
---
# Dataset Card for DUQA
## Table of Contents
- [Dataset Description](#dataset-description)
  * [Abstract](#abstract)
  * [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  * [Data Instances](#data-instances)
  * [Data Fields](#data-fields)
- [Data Statistics](#data-statistics)
- [Dataset Creation](#dataset-creation)
  * [Curation Rationale](#curation-rationale)
  * [Source Data](#source-data)
  * [Annotations](#annotations)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  * [Discussion of Social Impact and Biases](#discussion-of-social-impact-and-biases)
  * [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  * [Dataset Curators](#dataset-curators)
  * [Licensing Information](#licensing-information)
  * [Citation Information](#citation-information)
## Dataset Description
### Abstract
DUQA is a dataset of single-step unit conversion questions. It comes in three sizes, "DUQA10k", "DUQA100k" and "DUQA1M", with 10,000, 100,000 and 1,000,000 entries respectively. Each size mixes basic and complex conversion questions: simple conversion, multiple answer, max/min, argmax/argmin, and noisy/q-noisy questions. Complexity varies with the amount of information in the sentence and the number of reasoning steps needed to compute the correct answer.
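To make two of the question types concrete, the following is a minimal sketch of how a simple conversion question and an argmax question over lengths could be answered. The unit table, the helper names `convert` and `argmax_quantity`, and the question phrasing in the comments are illustrative assumptions; none of them are taken from the DUQA library.

```python
# Illustrative only: the unit table and helper names are assumptions,
# not part of the DUQA library.
METERS_PER_UNIT = {"m": 1.0, "km": 1000.0, "cm": 0.01, "mi": 1609.344, "ft": 0.3048}


def convert(value, src, dst):
    """Convert a length from unit `src` to unit `dst` via meters."""
    return value * METERS_PER_UNIT[src] / METERS_PER_UNIT[dst]


def argmax_quantity(quantities):
    """Return the index of the largest (value, unit) pair after normalizing to meters."""
    in_meters = [convert(v, u, "m") for v, u in quantities]
    return max(range(len(in_meters)), key=in_meters.__getitem__)


# Simple conversion, e.g. "How many meters are in 2.5 km?"
print(convert(2.5, "km", "m"))

# Argmax, e.g. "Which is longest: 1 mi, 1500 m, or 5000 ft?"
options = [(1, "mi"), (1500, "m"), (5000, "ft")]
print(argmax_quantity(options))
```

A max/min question would return the normalized value itself rather than its index; the comparison step is the same.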
### Languages
The text in the dataset is in English.
## Dataset Structure
### Data Instances
A single instance in the dataset consists of a question related to a single-step unit conversion problem, along with its corresponding correct answer.
### Data Fields
The dataset contains fields for the question, the answer, additional context about the question, and multiple-choice answer options.
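As a rough sketch of how such a record might be represented and scored, the snippet below defines a small record type and a multiple-choice accuracy helper. The class name `DUQARecord`, the field names, and the two sample records are hypothetical; the card does not list the actual column names, so check them against the dataset before relying on this layout.

```python
from dataclasses import dataclass, field


@dataclass
class DUQARecord:
    # Field names are hypothetical; the real dataset columns may differ.
    question: str
    answer: str
    context: str = ""
    choices: list[str] = field(default_factory=list)


def multiple_choice_accuracy(records, predictions):
    """Fraction of records whose predicted choice equals the gold answer."""
    if not records:
        return 0.0
    correct = sum(1 for r, p in zip(records, predictions) if p == r.answer)
    return correct / len(records)


# Made-up sample records, for illustration only.
records = [
    DUQARecord("How many centimeters are in 3 m?", "300 cm",
               choices=["3 cm", "30 cm", "300 cm", "3000 cm"]),
    DUQARecord("How many grams are in 2 kg?", "2000 g",
               choices=["20 g", "200 g", "2000 g", "20000 g"]),
]
print(multiple_choice_accuracy(records, ["300 cm", "200 g"]))  # 0.5
```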
## Data Statistics
The dataset comes in three sizes, with 10,000, 100,000 and 1,000,000 entries respectively.
## Dataset Creation
### Curation Rationale
The dataset is curated to help machine learning models understand and perform single-step unit conversions. This ability is essential for many real-world applications, including but not limited to physical sciences, engineering, and data analysis tasks.
### Source Data
The source data for the dataset is generated using a Python library provided with the dataset, which can create new datasets from a list of templates.
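The general idea of template-based generation can be sketched as follows. This is not the library shipped with the dataset: the templates, the unit table, the `generate` function, and the output keys are all assumptions chosen to show how filling a template with a random value and unit pair yields a question together with its computed answer.

```python
import random

# Hypothetical templates and unit table; the real DUQA templates may differ.
TEMPLATES = [
    "How many {dst} are in {value} {src}?",
    "Convert {value} {src} to {dst}.",
]
GRAMS_PER_UNIT = {"g": 1.0, "kg": 1000.0, "mg": 0.001}


def generate(n, seed=0):
    """Generate n question/answer rows by filling templates with random values."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows = []
    for _ in range(n):
        # Pick two distinct units so every question is a real conversion.
        src, dst = rng.sample(sorted(GRAMS_PER_UNIT), 2)
        value = rng.randint(1, 100)
        question = rng.choice(TEMPLATES).format(value=value, src=src, dst=dst)
        answer = value * GRAMS_PER_UNIT[src] / GRAMS_PER_UNIT[dst]
        rows.append({"question": question, "answer": answer})
    return rows


for row in generate(3):
    print(row)
```

Seeding the generator keeps a generated split reproducible, which matters when the same templates are used to produce datasets of different sizes.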
### Annotations
The dataset does not contain any annotations.
## Considerations for Using the Data
### Discussion of Social Impact and Biases
The dataset is neutral and does not contain any explicit biases or social implications as it deals primarily with mathematical conversion problems.
### Other Known Limitations
The complexity of the questions is limited to single-step unit conversions. It does not cover multi-step or more complex unit conversion problems.
## Additional Information
### Dataset Curators
The dataset was created by a team of researchers. The card does not name specific individuals or organizations.
### Licensing Information
The licensing information for this dataset is not provided. Please consult the dataset provider for more details.
### Citation Information
The citation information for this dataset is not provided. Please consult the dataset provider for more details.
Provider:
nattiey1