
KaleidoSG/Helix

Source: Hugging Face. Updated 2023-09-23; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/KaleidoSG/Helix
Resource overview:
---
license: cc-by-4.0
task_categories:
- question-answering
- translation
- summarization
- text-generation
- conversational
language:
- en
tags:
- code
- airoboros
- language
- merge
- gpt
pretty_name: helix
size_categories:
- 100K<n<1M
---

# Helix Dataset for Questioning and Instructing (QI)

## Description

The Helix dataset is a specialized collection of data tailored for Questioning and Instructing (QI) tasks. It is created by merging all the Airoboros datasets and incorporating one RosettaCode dataset, with a primary focus on supporting QI research and applications.

## Dataset Details

- **Source Datasets**: Airoboros datasets (various sources), RosettaCode dataset
- **Merging Script**: The merging of these datasets was performed using the `bowie.py` script, which is included in this repository. The script facilitates the formatting and integration of the datasets to create the Helix dataset optimized for QI tasks.

## Usage

The Helix dataset is particularly suited for researchers and developers working on QI tasks, including:

- Developing QI systems that can understand and respond to natural language queries and instructions.
- Training and evaluating machine learning models for QI applications.
- Benchmarking QI algorithms and techniques.
- Investigating the intersection of natural language understanding and instructional responses.

## License

Please refer to the individual licenses of the source datasets for specific licensing information. Ensure compliance with the respective licenses when using the Helix dataset.

## Citation

If you use the Helix dataset for QI research or projects, please consider citing it using the appropriate citation format for each of the source datasets and the `bowie.py` script.

```
Marcus. 2023. Helix Dataset for Questioning and Instructing (QI). Helix. Self-published.
https://huggingface.co/datasets/KaleidoSG/Helix
```

## Acknowledgments

We express our gratitude to the creators and maintainers of the Airoboros datasets and the RosettaCode dataset for their valuable contributions to this specialized dataset for Questioning and Instructing (QI) tasks.
Provider:
KaleidoSG
Summary of Original Information

Helix Dataset for Questioning and Instructing (QI)

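Description

The Helix dataset is a specialized collection of data tailored for Questioning and Instructing (QI) tasks. It is created by merging all the Airoboros datasets and adding one RosettaCode dataset, primarily to support QI research and applications.

Dataset Details

  • Source datasets: Airoboros datasets (various sources), RosettaCode dataset
  • Merging script: The datasets were merged with the bowie.py script, which is included in the repository. The script handles formatting and integration of the datasets to produce the Helix dataset optimized for QI tasks.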
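The merging step described above can be illustrated with a minimal sketch. This is not the actual `bowie.py` logic, which is not shown on this page. The field names (`instruction`, `response`, `question`, `answer`), the `source` tag, and the deduplication rule are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional


def normalize(record: dict, source: str) -> Optional[dict]:
    """Map a raw record onto one common shape (hypothetical schema)."""
    # Assumed field names; the real source datasets may use different keys.
    instruction = (record.get("instruction") or record.get("question") or "").strip()
    response = (record.get("response") or record.get("answer") or "").strip()
    if not instruction or not response:
        return None  # drop incomplete pairs
    return {"instruction": instruction, "response": response, "source": source}


def merge_datasets(datasets: Dict[str, Iterable[dict]]) -> List[dict]:
    """Concatenate several datasets, skipping incomplete and duplicate pairs."""
    merged: List[dict] = []
    seen = set()
    for source, records in datasets.items():
        for record in records:
            row = normalize(record, source)
            if row is None:
                continue
            key = (row["instruction"], row["response"])
            if key in seen:
                continue  # keep only the first occurrence of a pair
            seen.add(key)
            merged.append(row)
    return merged


# Toy stand-ins for the source datasets (made-up records, not real data).
airoboros_like = [{"instruction": "Name a prime number.", "response": "7"}]
rosetta_like = [
    {"question": "Print hello in Python.", "answer": "print('hello')"},
    {"question": "Name a prime number.", "answer": "7"},  # duplicate pair
    {"question": "", "answer": "orphan answer"},  # incomplete pair
]
merged = merge_datasets({"airoboros": airoboros_like, "rosettacode": rosetta_like})
```

Keeping a `source` field on every row is one way to make the per-source licensing and citation requirements traceable after the merge.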

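Usage

The Helix dataset is particularly suited to researchers and developers working on QI tasks, including:

  • Developing QI systems that can understand and respond to natural language queries and instructions.
  • Training and evaluating machine learning models for QI applications.
  • Benchmarking QI algorithms and techniques.
  • Investigating the intersection of natural language understanding and instructional responses.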

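License

Refer to the individual licenses of the source datasets for specific licensing information, and comply with each of them when using the Helix dataset.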

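Citation

If you use the Helix dataset for QI research or projects, please consider citing each of the source datasets and the bowie.py script in the appropriate citation format.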

Marcus. 2023. Helix Dataset for Questioning and Instructing (QI). Helix. Self-published. https://huggingface.co/datasets/KaleidoSG/Helix

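Acknowledgments

We thank the creators and maintainers of the Airoboros datasets and the RosettaCode dataset for their valuable contributions to this specialized dataset for Questioning and Instructing (QI) tasks.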
