
y0sif/Arcwright-Leptos

Hugging Face · Updated 2026-04-09 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/y0sif/Arcwright-Leptos
Description:
---
license: apache-2.0
language:
- en
tags:
- rust
- code
- instruction-tuning
- leptos
- chatml
size_categories:
- 1K<n<10K
task_categories:
- text-generation
---

# Arcwright-Leptos

An instruction-tuning dataset for the **[Leptos](https://github.com/leptos-rs/leptos)** Rust crate, built for the [Arcwright](https://huggingface.co/y0sif/arcwright-E4B-v1) fine-tuned model.

Leptos is a reactive web UI framework for Rust with fine-grained reactivity, server-side rendering, and a component model inspired by modern frontend frameworks.

## Dataset Summary

- **2,034 instruction-response pairs** covering reactive signals, components, the `view!` macro, server functions, SSR, hydration, routing, and resource management
- Generated from real source code using the **OSS-Instruct** methodology via Claude Code sub-agents
- Validated for structural correctness and deduplicated using MinHash (Jaccard threshold 0.7)
- Format: **ChatML** (messages array with system/user/assistant roles)

## Category Distribution

| Category | Count | % |
|----------|-------|---|
| Code Generation | 711 | 34% |
| Code Explanation | 364 | 17% |
| API Usage | 320 | 15% |
| Bug Detection | 242 | 11% |
| Refactoring | 211 | 10% |
| Test Generation | 198 | 9% |

## Format

Each example is a JSON object with a `messages` array:

```json
{
  "messages": [
    {"role": "system", "content": "You are an expert Rust programmer specializing in the leptos crate and modern Rust development patterns."},
    {"role": "user", "content": "Write a Leptos component that displays a counter with increment and decrement buttons using reactive signals."},
    {"role": "assistant", "content": "..."}
  ],
  "category": "code_generation",
  "crate": "leptos"
}
```

## Usage

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub and print the first conversation.
dataset = load_dataset("y0sif/Arcwright-Leptos")
print(dataset["train"][0]["messages"])
```

## Part of Arcwright

This dataset is one of three crate-specific datasets used to train [Arcwright-E4B-v1](https://huggingface.co/y0sif/arcwright-E4B-v1):

| Dataset | Crate | Pairs |
|---------|-------|-------|
| **[Arcwright-Leptos](https://huggingface.co/datasets/y0sif/Arcwright-Leptos)** | Leptos | 2,046 |
| **[Arcwright-Axum](https://huggingface.co/datasets/y0sif/Arcwright-Axum)** | Axum | 741 |
| **[Arcwright-Rig](https://huggingface.co/datasets/y0sif/Arcwright-Rig)** | Rig | 697 |

## Source

All instruction pairs were generated from source code in the [Leptos repository](https://github.com/leptos-rs/leptos). Code was chunked using tree-sitter into meaningful units (functions, impl blocks, modules), then used as seed material for instruction generation.

## License

Apache 2.0
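
## Deduplication Sketch

The Dataset Summary mentions MinHash deduplication with a Jaccard threshold of 0.7 but does not describe the pipeline itself. The following is a minimal, standard-library-only sketch of how such a step could work, not the authors' actual implementation. The shingle size, the number of hash functions, the greedy keep-first strategy, and every function name here are assumptions made for illustration; only the 0.7 threshold comes from the card.

```python
import hashlib
import re

NUM_PERM = 128      # signature length (assumed; not stated in the card)
SHINGLE_SIZE = 3    # words per shingle (assumed; not stated in the card)
THRESHOLD = 0.7     # Jaccard threshold quoted in the Dataset Summary


def shingles(text: str, k: int = SHINGLE_SIZE) -> set[str]:
    """Return the set of overlapping k-word shingles of a lowercased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash(text: str, num_perm: int = NUM_PERM) -> list[int]:
    """MinHash signature: the minimum hash of the shingle set under each seed."""
    shingle_set = shingles(text)
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_perm)]


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def deduplicate(texts: list[str], threshold: float = THRESHOLD) -> list[int]:
    """Greedy pass: keep a text only if it is below the threshold against all kept texts."""
    kept: list[tuple[int, list[int]]] = []
    for index, text in enumerate(texts):
        signature = minhash(text)
        if all(estimated_jaccard(signature, other) < threshold for _, other in kept):
            kept.append((index, signature))
    return [index for index, _ in kept]


if __name__ == "__main__":
    prompts = [
        "Write a Leptos component that displays a counter with increment and decrement buttons.",
        "write a leptos component that displays a counter, with increment and decrement buttons!",
        "Explain how a server function is declared and called from a component.",
    ]
    # The second prompt differs only in case and punctuation, so it is dropped.
    print(deduplicate(prompts))
```

This pass compares every candidate against every kept item, which is quadratic in the number of examples. That is acceptable for a few thousand pairs, while larger corpora usually add locality-sensitive hashing (for instance through a library such as `datasketch`) to avoid the all-pairs comparison.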

Provider:
y0sif