five

byroneverson/shell-cmd-instruct

收藏
Hugging Face2024-01-11 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/byroneverson/shell-cmd-instruct
下载链接
链接失效反馈
官方服务:
资源简介:
--- license: apache-2.0 task_categories: - text-generation language: - en tags: - instruction-finetuning pretty_name: Shell Command Instruct --- # **Used to train models that interact directly with shells** Follow-up details of my process - MacOS terminal commands for now. This dataset is still in alpha stages and will be modified. - Contains 500 somewhat unique training examples so far. - GPT4 seems like a good candidate for generating more data, licensing would need to be addressed. - I fine-tuned Solar-10.7B-Instruct-v1.0 with this dataset using a slightly modified version of axolotl. Just a few epochs was enough to get it to output correctly. - I use oobabooga/text-generation-webui with a custom chat extension for inference. No sandbox is used, it is piped directly into MacOS bash because I'm reckless. C: - Currently working towards training an MoE (2x7B), multi-modal model (image/text) with this dataset. (BakLLaVA-1-7B + LLaVA-v1.5-7B) - Inference stages: 1. Send the instruction to the model, expect command. 2. Detect shell command and send to sand-boxed shell. 4. Shell respose should be sent as additional input to model. 5. The final model response should be sent to user from assistant. TODO: - Possible "os" column to specify which system the command should be used with, maybe separate datasets for each system type. ## **Sample prompt: (in series, depends on your specific model prompt)** ``` ### User: List files in 'Downloads' ### Command: ls ~/Downloads ``` ``` ### Shell: file1.pdf file2.txt file3.zip ### Assistant: Listing files in 'Downloads': file1.pdf file2.txt file3.zip ```
提供机构:
byroneverson
原始信息汇总

数据集概述

基本信息

  • 许可证: Apache-2.0
  • 任务类别: 文本生成
  • 语言: 英语
  • 标签: 指令微调
  • 易读名称: Shell Command Instruct

数据集描述

  • 用于训练直接与Shell交互的模型。
  • 目前包含500个独特的训练样本。
  • 适用于MacOS终端命令。
  • 数据集处于Alpha阶段,将持续更新和修改。

使用案例

  • 使用此数据集微调了Solar-10.7B-Instruct-v1.0模型。
  • 使用oobabooga/text-generation-webui进行推理,直接在MacOS bash中运行。
  • 正在开发一个多模态模型(图像/文本),使用此数据集进行训练。

推理流程

  1. 向模型发送指令,期望返回命令。
  2. 检测Shell命令并发送至沙盒Shell。
  3. Shell响应应作为额外输入发送给模型。
  4. 最终模型响应应由助手发送给用户。

示例

用户输入

List files in Downloads

命令输出

ls ~/Downloads

Shell输出

file1.pdf file2.txt file3.zip

助手输出

Listing files in Downloads: file1.pdf file2.txt file3.zip

5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作