
JaveyZou/KingDesign

Source: Hugging Face. Updated 2026-04-17. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/JaveyZou/KingDesign
Description:
---
size_categories: 1K<n<10K
tags:
- synthetic
- datadesigner
- king-design
- code
configs:
- config_name: data
  data_files: data/*.parquet
  default: true
license: mit
task_categories:
- question-answering
language:
- zh
---

<div style="display: flex; justify-content: space-between; align-items: flex-end; width: 100%; margin-bottom: 1rem;">
<h1 style="flex: 1; margin: 0;">Kingdesign</h1>
<sub style="white-space: nowrap;">Made with ❤️ using 🦥 Unsloth Studio</sub>
</div>

---

This dataset was generated with Unsloth Recipe Studio. It contains 5,351 generated records.

---

## 🚀 Quick Start

```python
from datasets import load_dataset

# Load the main dataset ("data" config, "train" split)
dataset = load_dataset("JaveyZou/KingDesign", "data", split="train")

# Convert to a pandas DataFrame for inspection
df = dataset.to_pandas()
```

---

## 📊 Dataset Summary

- **📈 Records**: 5,351
- **📋 Columns**: 3
- **✅ Completion**: 89.2% (6,000 requested)

---

## 📋 Schema & Statistics

| Column | Type | Column Type | Unique (%) | Null (%) | Details |
|--------|------|-------------|------------|----------|---------|
| `llm_structured_1` | `dict` | llm-structured | 5329 (99.6%) | 0 (0.0%) | Tokens: 402 out / 828 in |

---

## ⚙️ Generation Details

Generated with 3 column configuration(s):

- **llm-structured**: 1 column(s)
- **seed-dataset**: 2 column(s)

📄 Full configuration available in [`builder_config.json`](builder_config.json) and detailed metadata in [`metadata.json`](metadata.json).

---

## 📚 Citation

If you use Data Designer in your work, please cite the project as follows:

```bibtex
@misc{nemo-data-designer,
  author = {The NeMo Data Designer Team, NVIDIA},
  title = {NeMo Data Designer: A framework for generating synthetic data from scratch or based on your own seed data},
  howpublished = {\url{https://github.com/NVIDIA-NeMo/DataDesigner}},
  year = 2026,
  note = {GitHub Repository},
}
```

---

## 💡 About NeMo Data Designer

NeMo Data Designer is a general framework for generating high-quality synthetic data that goes beyond simple LLM prompting. It provides:

- **Diverse data generation** using statistical samplers, LLMs, or existing seed datasets
- **Relationship control** between fields with dependency-aware generation
- **Quality validation** with built-in Python, SQL, and custom local and remote validators
- **LLM-as-a-judge** scoring for quality assessment
- **Fast iteration** with preview mode before full-scale generation

For more information, visit: [https://github.com/NVIDIA-NeMo/DataDesigner](https://github.com/NVIDIA-NeMo/DataDesigner) (`pip install data-designer`)
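
The percentages in Dataset Summary and Schema & Statistics can be re-derived from the raw counts. The sketch below does that with the standard library only. The helper names `pct` and `count_unique` are illustrative, the sample rows use made-up keys (the fields inside `llm_structured_1` are not documented in this card), and comparing records by canonical JSON is an assumption about how uniqueness might be counted for a `dict` column, not a description of how the card's figure was produced.

```python
import json


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def count_unique(records: list[dict]) -> int:
    """Count distinct dicts by comparing their canonical (key-sorted) JSON form."""
    return len({json.dumps(r, sort_keys=True, ensure_ascii=False) for r in records})


# Counts reported in the dataset card.
records, requested, unique = 5351, 6000, 5329

completion = pct(records, requested)  # share of requested records that were generated
unique_share = pct(unique, records)   # share of records with a distinct structured value

# Hypothetical rows: the first two differ only in key order, so they count once.
sample = [{"q": "a", "a": "b"}, {"a": "b", "q": "a"}, {"q": "c", "a": "d"}]
sample_unique = count_unique(sample)

print(completion, unique_share, sample_unique)
```

After loading the dataset as in Quick Start, the same `count_unique` helper could be applied to `dataset["llm_structured_1"]`, provided the values are JSON-serializable.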
Provider:
JaveyZou