
Hayme/extrated-try

Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Hayme/extrated-try
Resource overview:
---
size_categories: n<1K
tags:
- synthetic
- datadesigner
configs:
- config_name: data
  data_files: data/*.parquet
  default: true
---

<div style="display: flex; justify-content: space-between; align-items: flex-end; width: 100%; margin-bottom: 1rem;">
<h1 style="flex: 1; margin: 0;">Extrated-Try</h1>
<sub style="white-space: nowrap;">Made with ❤️ using 🦥 Unsloth Studio</sub>
</div>

---

extracted-dataset was generated with Unsloth Recipe Studio. It contains 2 generated records.

---

## 🚀 Quick Start

```python
from datasets import load_dataset

# Load the main dataset
dataset = load_dataset("Hayme/extrated-try", "data", split="train")
df = dataset.to_pandas()
```

---

## 📊 Dataset Summary

- **📈 Records**: 2
- **📋 Columns**: 3

---

## 📋 Schema & Statistics

| Column | Type | Column Type | Unique (%) | Null (%) | Details |
|--------|------|-------------|------------|----------|---------|
| `llm_text_1` | `string` | llm-text | 2 (100.0%) | 0 (0.0%) | Tokens: 2337 out / 1446 in |

---

## ⚙️ Generation Details

Generated with 2 column configuration(s):

- **llm-text**: 1 column(s)
- **seed-dataset**: 1 column(s)

📄 Full configuration available in [`builder_config.json`](builder_config.json) and detailed metadata in [`metadata.json`](metadata.json).

---

## 📚 Citation

If you use Data Designer in your work, please cite the project as follows:

```bibtex
@misc{nemo-data-designer,
  author = {The NeMo Data Designer Team, NVIDIA},
  title = {NeMo Data Designer: A framework for generating synthetic data from scratch or based on your own seed data},
  howpublished = {\url{https://github.com/NVIDIA-NeMo/DataDesigner}},
  year = 2026,
  note = {GitHub Repository},
}
```

---

## 💡 About NeMo Data Designer

NeMo Data Designer is a general framework for generating high-quality synthetic data that goes beyond simple LLM prompting. It provides:

- **Diverse data generation** using statistical samplers, LLMs, or existing seed datasets
- **Relationship control** between fields with dependency-aware generation
- **Quality validation** with built-in Python, SQL, and custom local and remote validators
- **LLM-as-a-judge** scoring for quality assessment
- **Fast iteration** with preview mode before full-scale generation

For more information, visit: [https://github.com/NVIDIA-NeMo/DataDesigner](https://github.com/NVIDIA-NeMo/DataDesigner) (`pip install data-designer`)
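
The Unique (%) and Null (%) figures in the Schema & Statistics table can be sanity-checked with a few lines of plain Python. The sketch below is illustrative only: `column_stats` and `sample_rows` are hypothetical names, the sample rows are stand-ins rather than the real records, and the percentage definitions (count divided by total records) are an assumption about how the card computes them.

```python
from typing import Any, Iterable, Mapping


def column_stats(rows: Iterable[Mapping[str, Any]], column: str) -> dict:
    """Count unique and null values for one column of dict-like rows."""
    values = [row.get(column) for row in rows]
    total = len(values)
    non_null = [v for v in values if v is not None]
    unique = len(set(non_null))
    nulls = total - len(non_null)

    def pct(n: int) -> float:
        # Assumed definition: share of all records, rounded to one decimal.
        return round(100.0 * n / total, 1) if total else 0.0

    return {
        "records": total,
        "unique": unique,
        "unique_pct": pct(unique),
        "null": nulls,
        "null_pct": pct(nulls),
    }


# Hypothetical stand-in rows; the real records would come from
# load_dataset("Hayme/extrated-try", "data", split="train").
sample_rows = [
    {"llm_text_1": "first generated text"},
    {"llm_text_1": "second generated text"},
]
stats = column_stats(sample_rows, "llm_text_1")
print(stats)
```

To run the same check on the real data, pass `list(dataset)` from the Quick Start snippet instead of `sample_rows`, since iterating a loaded dataset yields one dictionary per record.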
Provider:
Hayme