
devinxx/nl2bash-combined

Source: Hugging Face. Updated 2026-04-05; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/devinxx/nl2bash-combined
Resource description:
# nl2bash-combined

A merged and reformatted dataset for natural language to bash command generation, prepared for fine-tuning instruction-following language models.

## Sources

| Dataset | Split | Rows |
|---|---|---|
| [jiacheng-ye/nl2bash](https://huggingface.co/datasets/jiacheng-ye/nl2bash) | train | 8,090 |
| [jiacheng-ye/nl2bash](https://huggingface.co/datasets/jiacheng-ye/nl2bash) | validation | 609 |
| [jiacheng-ye/nl2bash](https://huggingface.co/datasets/jiacheng-ye/nl2bash) | test | 606 |
| [AnishJoshi/nl2bash-custom](https://huggingface.co/datasets/AnishJoshi/nl2bash-custom) | train | 19,658 |
| [AnishJoshi/nl2bash-custom](https://huggingface.co/datasets/AnishJoshi/nl2bash-custom) | validation | 2,457 |
| [AnishJoshi/nl2bash-custom](https://huggingface.co/datasets/AnishJoshi/nl2bash-custom) | test | 2,458 |

**Total: 33,878 examples** (train: 27,748 / valid: 3,066 / test: 3,064)

## Format

Each example is a chat messages list compatible with `mlx-lm` and the Llama/ChatML chat template:

```json
{
  "messages": [
    {"role": "system", "content": "You are a bash expert. Convert the user's natural language description into a single bash command. Output only the bash command, no explanation."},
    {"role": "user", "content": "Find all python files modified in the last 7 days"},
    {"role": "assistant", "content": "find . -name '*.py' -mtime -7"}
  ]
}
```

## Usage

```python
from datasets import load_dataset

ds = load_dataset("devinxx/nl2bash-combined")
print(ds["train"][0])
```

### With mlx-lm fine-tuning

Point your `lora_config.yaml` at this dataset:

```yaml
model: mlx-community/Llama-3.2-1B-Instruct-4bit
data: devinxx/nl2bash-combined
```

## Fine-tuned Model

This dataset was used to train [devinxx/Llama-3.2-1B-nl2bash](https://huggingface.co/devinxx/Llama-3.2-1B-nl2bash), a 1B parameter model fine-tuned for natural language to bash command generation using LoRA via `mlx-lm`.

## License

Derived from the original nl2bash corpus by Lin et al. (LREC 2018).
See original repositories for license details.
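
For readers who want to add their own pairs in the same layout, the record structure shown under Format can be reproduced with a small helper. This is only a sketch, not the author's preprocessing script: the function name `to_chat_example` and the whitespace stripping are assumptions made for illustration, while the system prompt is copied from the example record above.

```python
import json

# System prompt copied verbatim from the example record in the Format section.
SYSTEM_PROMPT = (
    "You are a bash expert. Convert the user's natural language description "
    "into a single bash command. Output only the bash command, no explanation."
)


def to_chat_example(description: str, command: str) -> dict:
    """Wrap one (description, command) pair in the chat-messages layout.

    Hypothetical helper: stripping surrounding whitespace is an assumption,
    not something the dataset card documents.
    """
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": description.strip()},
            {"role": "assistant", "content": command.strip()},
        ]
    }


example = to_chat_example(
    "Find all python files modified in the last 7 days",
    "find . -name '*.py' -mtime -7",
)
# Print the record as formatted JSON for inspection.
print(json.dumps(example, indent=2))
```

Writing one such record per line with `json.dumps(example)` would give a JSONL file in the same shape, which may be convenient for local experiments.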
Provider:
devinxx