glyphsoftware/opus-4.6-frontend-development

Name: glyphsoftware/opus-4.6-frontend-development
Creator: glyphsoftware
Published: 2026-03-24 04:49:45
License: no description provided

Source: Hugging Face. Updated 2026-03-24. Indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/glyphsoftware/opus-4.6-frontend-development


Resource description:

---
license: apache-2.0
language:
- en
tags:
- code
- debugging
- chain-of-thought
- synthetic
- ui
- frontend
- react
- css
pretty_name: CoT Code Debugging (Self-Instruct / Evolve-Instruct)
size_categories:
- n<1K
---

# CoT Code Debugging Dataset

Synthetic **code debugging** examples with **chain-of-thought (CoT)** reasoning and solutions, built with a three-stage pipeline: seed problem → evolved problem → detailed solve. Topics emphasize **frontend / UI engineering** (CSS, React, accessibility, layout, design systems, SSR/hydration, and related product UI issues).

Each line in `dataset.jsonl` is one JSON object (JSONL format).

## Data fields

| Field | Description |
|--------|-------------|
| `id` | 16-character hex id: SHA-256 of `evolved_problem`, truncated |
| `topic` | Seed topic drawn from a fixed topic list (see pipeline) |
| `seed_problem` | Initial debugging problem (short broken snippet + expected vs observed) |
| `evolved_problem` | Rewritten/evolved problem (harder or more complex per strategy) |
| `evolve_strategy` | Strategy applied during evolution (e.g. subtler bug, edge cases, concurrency) |
| `cot_response` | Raw model output (includes `<reasoning>` / `<solution>` when formatted) |
| `reasoning` | Parsed step-by-step analysis (from `<reasoning>` block, or full response if unparsed) |
| `solution` | Parsed fix and explanation (from `<solution>` block) |
| `model_seed` | Model id used for seed + evolve steps |
| `model_cot` | Model id used for the CoT solution |
| `timestamp` | ISO 8601 UTC time when the row was written |

## Generation pipeline

1. **Seed:** Sample a topic; generate a concise realistic debugging problem (broken snippet, expected vs observed, no solution).
2. **Evolve:** Rewrite the problem using a randomly chosen evolution strategy (harder / more subtle / combined bugs / production-style, etc.).
3. **CoT solve:** Model produces analysis and fix with tags `<reasoning>` … `</reasoning>` and `<solution>` … `</solution>`.
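
The tag format from the CoT solve step, together with the fallback noted in the field table (the full response becomes `reasoning` when no block is found), could be parsed roughly as follows. This is a minimal sketch, not the generator's actual code: the function name `split_cot_response` is hypothetical, and returning an empty `solution` when the `<solution>` block is missing is an assumption.

```python
import re

# Non-greedy matches across newlines for each tagged block.
_REASONING_RE = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)
_SOLUTION_RE = re.compile(r"<solution>(.*?)</solution>", re.DOTALL)


def split_cot_response(raw: str) -> tuple[str, str]:
    """Return (reasoning, solution) parsed from a raw model response."""
    reasoning_match = _REASONING_RE.search(raw)
    solution_match = _SOLUTION_RE.search(raw)
    # Fallback from the field table: use the full response when unparsed.
    reasoning = reasoning_match.group(1).strip() if reasoning_match else raw.strip()
    # Assumption: an absent <solution> block yields an empty string.
    solution = solution_match.group(1).strip() if solution_match else ""
    return reasoning, solution
```

Applied to a row's `cot_response`, this should produce values comparable to the stored `reasoning` and `solution` fields, which makes it usable as a spot check on an export.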
Rows are skipped if quality checks fail (e.g. reasoning or evolved problem too short).

## Intended use

- Supervised fine-tuning or distillation for **debugging**, **code reasoning**, or **CoT**-style assistants.
- Research on synthetic data pipelines (self-instruct / evolve-instruct).

## Limitations

- **Synthetic:** Content is LLM-generated; it may contain mistakes, unrealistic code, or inconsistent fixes. **Human review** is recommended before high-stakes use.
- **Licensing:** Confirm compatibility with your use case and with the **underlying model** terms for the models listed in your export.
- **Snapshot size:** The number of examples in a given `dataset.jsonl` depends on how long the generator was run (the reference pipeline targets a larger row count; your file may be a partial export).

## Loading (Python)

```python
import json

rows = []
with open("dataset.jsonl", encoding="utf-8") as f:
    for line in f:
        rows.append(json.loads(line))
```

## Citation

If you use this dataset, cite the dataset repository and, where appropriate, the models named in each row’s `model_seed` and `model_cot` fields.
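
For the fine-tuning use described above, a slightly more defensive loader may be useful. The sketch below skips blank lines and incomplete rows, recomputes the `id`, and builds prompt/completion pairs. All three function names are hypothetical. The id recipe (UTF-8 bytes, first 16 hex characters of the SHA-256 digest) is an assumption based on the field table, and the prompt/completion layout is one possible choice, not a format the dataset prescribes.

```python
import hashlib
import json

# Fields a row needs before it is usable as a training example.
REQUIRED_FIELDS = ("id", "evolved_problem", "reasoning", "solution")


def expected_id(evolved_problem: str) -> str:
    # Assumption: UTF-8 bytes, first 16 hex characters of the SHA-256 digest.
    return hashlib.sha256(evolved_problem.encode("utf-8")).hexdigest()[:16]


def load_rows(path: str) -> list[dict]:
    """Read a JSONL file, skipping blank lines and rows with empty required fields."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            row = json.loads(line)
            if all(row.get(key) for key in REQUIRED_FIELDS):
                rows.append(row)
    return rows


def to_sft_pair(row: dict) -> dict:
    """Build one prompt/completion pair (hypothetical layout)."""
    return {
        "prompt": row["evolved_problem"],
        "completion": f"{row['reasoning']}\n\n{row['solution']}",
    }
```

Comparing `row["id"]` with `expected_id(row["evolved_problem"])` is a cheap way to test the id assumption against a real export before relying on it, for example for deduplication.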

Provider:

glyphsoftware
