Step-Instruction-Gx
Source: ModelScope community. Updated 2025-11-27; indexed 2025-02-01.
Download link:
https://modelscope.cn/datasets/prithivMLmods/Step-Instruction-Gx
Description:
# Step-Instruction-GX Dataset
## Overview
The **Step-Instruction-GX** dataset is a collection of instructional and educational content designed to assist in various learning and decision-making tasks. It includes a wide range of questions and corresponding answers, covering topics from health tips to scientific concepts.
## Dataset Details
### Modalities
- **Text**: The dataset primarily contains text data in various formats.
### Formats
- **CSV**: The dataset is available in CSV format with 10K-100K entries.
- **Libraries**: Compatible with popular data processing libraries like `pandas`.
- **Croissant**: Croissant metadata is also supported. The dataset is released under the Apache 2.0 license.
### Dataset Statistics
- **Rows**: 52,002
- **Size**: 38.5 MB (original), 23 MB (auto-converted Parquet files)
## Dataset Structure
The dataset consists of a single training split with 52,002 rows, matching the row count reported under Dataset Statistics. Each row contains a `question string` and an `output string` with the corresponding answer.
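The row layout described above can be sketched with Python's standard `csv` module. This is a minimal illustration, not the dataset's official loader: the column names `question` and `output` are assumptions inferred from the field descriptions, and the in-memory sample stands in for the real file, with text shortened from the example entries.

```python
import csv
import io

# Assumed column names; the card calls the fields "question string" and
# "output string", so the real CSV header may differ.
QUESTION_COL = "question"
OUTPUT_COL = "output"

# Tiny in-memory stand-in for the real CSV file (illustrative rows only).
SAMPLE_CSV = '''question,output
What are the three primary colors?,"The three primary colors are red, blue, and yellow."
Describe the structure of an atom.,"An atom is the basic building block of all matter."
'''


def read_pairs(handle):
    """Yield (question, output) tuples, skipping rows with an empty field."""
    reader = csv.DictReader(handle)
    for row in reader:
        question = (row.get(QUESTION_COL) or "").strip()
        output = (row.get(OUTPUT_COL) or "").strip()
        if question and output:
            yield question, output


pairs = list(read_pairs(io.StringIO(SAMPLE_CSV)))
print(len(pairs))
```

Passing an open file handle instead of the `io.StringIO` object would apply the same check to a downloaded copy of the CSV.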
### Example Entries
| Question String | Output String |
|-----------------|---------------|
| Give three tips for staying healthy. | 1. Eat a balanced and nutritious diet; Make sure your meals are inclusive of a variety of fruits and vegetables, lean protein... |
| What are the three primary colors? | The three primary colors are red, blue, and yellow. These colors are called primary because they cannot be created by mixing other colors. |
| Describe the structure of an atom. | An atom is the basic building block of all matter and is made up of three types of particles: protons, neutrons, and electrons. |
## Usage
To use this dataset, you can download it directly from the Hugging Face platform. The dataset is compatible with various data processing tools and libraries, making it easy to integrate into your projects.
### Download Instructions
1. Visit the dataset page on Hugging Face.
2. Click on the "Use this dataset" button to download the files.
3. Load the dataset using your preferred tool (e.g., `pandas` for Python).
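Step 3 might look like the following with `pandas`. This is a hedged sketch: the column names `question` and `output`, the helper `load_pairs`, and the temporary `sample.csv` file are all assumptions made for illustration. Replace the sample path with the location of the file you actually downloaded.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_pairs(csv_path):
    """Read the CSV into a DataFrame and drop rows missing either field.

    The column names "question" and "output" are assumed, not confirmed.
    """
    df = pd.read_csv(csv_path)
    return df.dropna(subset=["question", "output"]).reset_index(drop=True)


# Stand-in file so the sketch runs without the real download.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.csv"
    sample_path.write_text(
        "question,output\n"
        'Give three tips for staying healthy.,"1. Eat a balanced and nutritious diet"\n'
        "What are the three primary colors?,\n",  # empty answer, gets dropped
        encoding="utf-8",
    )
    df = load_pairs(sample_path)

print(df.shape)
```

Dropping incomplete rows up front is one possible design choice; keeping them and filtering later works equally well if you want to inspect what is missing.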
Provider:
maas
Created:
2025-01-28



