KaleidoSG/Helix
Source: Hugging Face. Updated 2023-09-23; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/KaleidoSG/Helix
Resource description:
---
license: cc-by-4.0
task_categories:
- question-answering
- translation
- summarization
- text-generation
- conversational
language:
- en
tags:
- code
- airoboros
- language
- merge
- gpt
pretty_name: helix
size_categories:
- 100K<n<1M
---
# Helix Dataset for Questioning and Instructing (QI)
## Description
The Helix dataset is a collection of data tailored for Questioning and Instructing (QI) tasks. It was created by merging all the Airoboros datasets with one RosettaCode dataset, primarily to support QI research and applications.
## Dataset Details
- **Source Datasets**: Airoboros datasets (various sources), RosettaCode dataset
- **Merging Script**: The datasets were merged with the `bowie.py` script, which is included in this repository. The script formats and integrates the source datasets into the Helix dataset for QI tasks.
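
The merging step described above might look roughly like the following. This is a minimal sketch and not the actual `bowie.py`: the field names (`instruction`, `response`, `task`, `code`), the `source` column, and the deduplication rule are all assumptions made for illustration.

```python
import json
from typing import Dict, Iterable, List


def normalize(record: dict, source: str) -> dict:
    """Map one source record onto a shared instruction/response schema.

    The field names checked here are hypothetical, not taken from Helix.
    """
    instruction = record.get("instruction") or record.get("task") or ""
    response = record.get("response") or record.get("code") or ""
    return {
        "instruction": instruction.strip(),
        "response": response.strip(),
        "source": source,
    }


def merge(sources: Dict[str, Iterable[dict]]) -> List[dict]:
    """Normalize every record, then drop empty rows and exact duplicates."""
    seen = set()
    merged = []
    for name, records in sources.items():
        for record in records:
            row = normalize(record, name)
            key = (row["instruction"], row["response"])
            if not row["instruction"] or not row["response"] or key in seen:
                continue
            seen.add(key)
            merged.append(row)
    return merged


if __name__ == "__main__":
    # Toy records standing in for the real source datasets.
    airoboros = [{"instruction": "Add 2 and 3.", "response": "5"}]
    rosetta = [{"task": "Print Hello World in Python.", "code": "print('Hello World')"}]
    for row in merge({"airoboros": airoboros, "rosettacode": rosetta}):
        print(json.dumps(row))
```

Keeping a `source` column in the merged rows makes it easier to trace each record back to the license of its originating dataset.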
## Usage
The Helix dataset is particularly suited for researchers and developers working on QI tasks, including:
- Developing QI systems that can understand and respond to natural language queries and instructions.
- Training and evaluating machine learning models for QI applications.
- Benchmarking QI algorithms and techniques.
- Investigating the intersection of natural language understanding and instructional responses.
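
For the training and evaluation use case, a starting point could look like the sketch below. It assumes the dataset can be loaded through the Hugging Face `datasets` library under the repository ID `KaleidoSG/Helix` and that a `train` split exists; neither is confirmed by this card. The `load_helix` and `train_eval_split` helpers are hypothetical names introduced here.

```python
import random
from typing import List, Sequence, Tuple


def load_helix(split: str = "train"):
    """Download Helix from the Hugging Face Hub.

    Requires `pip install datasets` and network access. The split name
    is an assumption; check the repository for the splits it provides.
    """
    from datasets import load_dataset  # imported lazily so the helper below works offline

    return load_dataset("KaleidoSG/Helix", split=split)


def train_eval_split(
    rows: Sequence[dict], eval_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[dict], List[dict]]:
    """Shuffle deterministically and hold out a fraction of rows for evaluation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_eval = int(len(rows) * eval_fraction)
    eval_rows = [rows[i] for i in indices[:n_eval]]
    train_rows = [rows[i] for i in indices[n_eval:]]
    return train_rows, eval_rows


if __name__ == "__main__":
    # Toy rows; replace with list(load_helix()) once the dataset is downloaded.
    toy = [{"id": i} for i in range(10)]
    train, held_out = train_eval_split(toy, eval_fraction=0.2, seed=42)
    print(len(train), len(held_out))
```

Fixing the seed keeps the held-out set stable across runs, which matters when benchmark numbers are compared between models.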
## License
The dataset card metadata lists `cc-by-4.0`. The source datasets carry their own licenses; refer to them for specific licensing information and comply with them when using the Helix dataset.
## Citation
If you use the Helix dataset for QI research or projects, please consider citing it, along with each of the source datasets and the `bowie.py` script, using the appropriate citation format.
```
Marcus. 2023. Helix Dataset for Questioning and Instructing (QI). Helix. Self-published. https://huggingface.co/datasets/KaleidoSG/Helix
```
## Acknowledgments
We express our gratitude to the creators and maintainers of the Airoboros datasets and the RosettaCode dataset for their valuable contributions to this specialized dataset for Questioning and Instructing (QI) tasks.
Provider:
KaleidoSG



