
henry1477/pcbslm-static-v2-unsloth-vlm

Source: Hugging Face. Updated: 2026-04-15. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/henry1477/pcbslm-static-v2-unsloth-vlm
Resource description:
---
license: other
task_categories:
- image-text-to-text
- visual-question-answering
language:
- en
pretty_name: PCBSLM static-v2 Unsloth VLM
tags:
- unsloth
- gemma-4
- multimodal
- pcb
- electronics
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: train
    path: data/vlm_train.jsonl
  - split: validation
    path: data/vlm_val.jsonl
  - split: test
    path: data/vlm_test.jsonl
---

# PCBSLM static-v2 Unsloth VLM

Portable multimodal Unsloth dataset for PCB layout/document-grounded training.

The JSONL splits use Unsloth/Gemma-style chat messages:

```json
{
  "messages": [
    {"role": "user", "content": [
      {"type": "image", "image": "assets/raw_docs/.../images/page.png"},
      {"type": "text", "text": "instruction..."}
    ]},
    {"role": "assistant", "content": [
      {"type": "text", "text": "{...json answer...}"}
    ]}
  ]
}
```

## Files

- `data/vlm_train.jsonl`: 4307 multimodal examples
- `data/vlm_val.jsonl`: 289 multimodal examples
- `data/vlm_test.jsonl`: 494 multimodal examples
- `assets/raw_docs/`: source documents, rendered pages, and figure crops referenced by examples/metadata
- `assets/board_images/`: board render images referenced by examples
- `metadata_bundle.tar.gz`: document/evidence/rule metadata with repo-relative asset paths

## Unsloth Smoke Test

This was verified locally with `unsloth/gemma-4-E2B-it` using:

```bash
python scripts/smoke_train_unsloth_vlm.py \
  data/vlm_train.jsonl \
  --model-name unsloth/gemma-4-E2B-it \
  --limit 4 \
  --max-steps 2 \
  --max-images 1 \
  --max-seq-length 512 \
  --resize 256
```

If training outside a Hugging Face snapshot checkout, resolve image paths relative to the dataset root.

HF repo: `henry1477/pcbslm-static-v2-unsloth-vlm`
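The note about resolving image paths relative to the dataset root can be sketched as follows. This is a minimal illustration, not the dataset author's loader: it assumes each JSONL line follows the chat-message schema shown in the card, and the names `iter_examples`, `resolve_image_paths`, and `dataset_root` are hypothetical helpers introduced here.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_examples(jsonl_path: Path) -> Iterator[dict[str, Any]]:
    """Yield one chat example per non-empty JSONL line."""
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def resolve_image_paths(example: dict[str, Any], dataset_root: Path) -> dict[str, Any]:
    """Return a copy of the example with relative image paths joined to dataset_root.

    Assumes message["content"] is a list of typed parts, as in the card's
    schema. The input example is not modified.
    """
    messages = []
    for message in example["messages"]:
        parts = []
        for part in message["content"]:
            if part.get("type") == "image" and not Path(part["image"]).is_absolute():
                # Repo-relative path such as "assets/board_images/..." becomes
                # a path under the local dataset root.
                part = {**part, "image": str(dataset_root / part["image"])}
            parts.append(part)
        messages.append({**message, "content": parts})
    return {**example, "messages": messages}
```

Here `dataset_root` would be the local directory that contains `data/` and `assets/`, for example the path returned by `huggingface_hub.snapshot_download` with `repo_type="dataset"`. Returning a copy rather than editing in place keeps the loaded records reusable with a different root.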
Provider:
henry1477