NOKHAB-Lab/LLM_4_HW-SW_Interface

Source: Hugging Face · Updated 2026-03-19 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/NOKHAB-Lab/LLM_4_HW-SW_Interface
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
tags:
- code
- embedded-systems
- raspberry-pi
- hardware-software-interface
- c-programming
- fine-tuning
- llm
pretty_name: HW/SW Interface LLM Dataset
size_categories:
- 10K<n<100K
---

# HW/SW Interface LLM Dataset

## Overview

A dataset of **~17,000 compiler-validated C programs** for ARM-based embedded platforms (Raspberry Pi 3B+, Zero W, 4B, 5), designed for fine-tuning LLMs on hardware-software interface generation tasks.

Fine-tuned open-source models (6–7B parameters) trained on this dataset achieve **81–90% accuracy**, matching or exceeding GPT-4o (89.3%) and Gemini Flash 2.0 (87.2%).

## Dataset

`data/training_set.jsonl`: **~17,000 instruction-response pairs**, fine-tuning ready.

Combines two sources:

- **~16,000** compiler-validated synthetic programs (Gemini Flash 2.0 + GCC-in-the-loop)
- **~1,000** real-world programs collected from GitHub and educational resources

Each entry:

```json
{"prompt": "Write a C program for Raspberry Pi 4B to read a DHT22 sensor...", "completion": "#include <wiringPi.h>\n..."}
```

| Category | Count | % |
|---|---|---|
| Sensor integration | 8,559 | 50.5% |
| Combined sensor-actuator | 5,582 | 33.0% |
| Actuator control | 1,009 | 6.0% |
| Real-world | 1,053 | 6.2% |
| Other | 736 | 4.3% |

Covers **400+ distinct sensor types** across diverse communication protocols (SPI, I²C, UART, 1-Wire) and Raspberry Pi hardware models.

## Replication Package

Full replication package (pipelines, fine-tuning notebooks, evaluation scripts, results):

👉 [github.com/NOKHAB-Lab/LLM_4_HW-SW_Interface](https://github.com/NOKHAB-Lab/LLM_4_HW-SW_Interface)

## License

[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
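As an illustration of the entry format described in the Dataset section, here is a minimal Python sketch of how the instruction-response pairs might be read and checked. It assumes only what the card states: one JSON object per line, with `prompt` and `completion` keys. The helper name `iter_pairs` and the sample record are hypothetical and are not part of the dataset or its replication package.

```python
import json
from typing import Iterable, Iterator


def iter_pairs(lines: Iterable[str]) -> Iterator[dict]:
    """Yield prompt/completion records from JSONL lines, skipping blank lines."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # The card documents exactly these two keys per entry.
        missing = {"prompt", "completion"} - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        yield record


# Hypothetical sample line, shaped like the entry shown in the card.
sample = [
    '{"prompt": "Write a C program for Raspberry Pi 4B to read a DHT22 sensor.", '
    '"completion": "#include <wiringPi.h>\\nint main(void) { return 0; }"}',
    "",
]
pairs = list(iter_pairs(sample))

# Against the real file (assuming it was downloaded to the path named in the card):
# with open("data/training_set.jsonl", encoding="utf-8") as f:
#     pairs = list(iter_pairs(f))
```

Iterating over lines keeps memory use flat when the records are streamed rather than collected into a list, and the key check surfaces malformed entries before they reach a fine-tuning pipeline.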
Provider:
NOKHAB-Lab