
internlm/EndoCoT-Data

Source: Hugging Face · Updated 2026-03-18 · Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/internlm/EndoCoT-Data
Resource description:
---
language:
- en
license: cc-by-nc-4.0
task_categories:
- image-to-image
datasets:
- internlm/EndoCoT-Data
base_model:
- Qwen/Qwen-Image-Edit-2511
---

<p align="center">
  <img src="fig/banner.svg" alt="EndoCoT" width="900"/>
</p>

<p align="center">
  <a href="https://github.com/InternLM/EndoCoT"><img src="https://img.shields.io/github/stars/InternLM/EndoCoT?style=flat-square&logo=github&label=Stars&color=FFB300"></a>
  <a href="https://github.com/InternLM/EndoCoT/forks"><img src="https://img.shields.io/github/forks/InternLM/EndoCoT?style=flat-square&logo=github&label=Forks&color=2196F3"></a>
  <a href="https://github.com/InternLM/EndoCoT/issues"><img src="https://img.shields.io/github/issues/InternLM/EndoCoT?style=flat-square&logo=github&label=Issues&color=4CAF50"></a>
  <a href="https://github.com/InternLM/EndoCoT/blob/main/LICENSE"><img src="https://img.shields.io/github/license/InternLM/EndoCoT?style=flat-square&label=License&color=9C27B0"></a>
  <br>
  <a href="https://arxiv.org/abs/2603.12252"><img src="https://img.shields.io/badge/Paper-arXiv-B31B1B?style=flat-square"></a>
  <a href="https://internlm.github.io/EndoCoT/"><img src="https://img.shields.io/badge/Homepage-Project-blue?style=flat-square"></a>
  <a href="https://huggingface.co/internlm/EndoCoT"><img src="https://img.shields.io/badge/Model-HuggingFace-yellow?style=flat-square"></a>
  <a href="https://huggingface.co/datasets/internlm/EndoCoT-Data"><img src="https://img.shields.io/badge/Dataset-HuggingFace-orange?style=flat-square"></a>
  <br>
  <br>
  <img src="fig/teaser.jpg" alt="Teaser" width="100%" style="border-radius: 10px; box-shadow: 0 6px 20px rgba(0,0,0,0.2);">
</p>

# EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models

This repository contains the training data for **EndoCoT**, a novel framework that activates the reasoning potential of Multimodal Large Language Models (MLLMs) within diffusion frameworks through an iterative thought guidance module.

- **Paper:** [EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models](https://arxiv.org/abs/2603.12252)
- **Project Page:** [https://internlm.github.io/EndoCoT/](https://internlm.github.io/EndoCoT/)
- **Repository:** [https://github.com/InternLM/EndoCoT](https://github.com/InternLM/EndoCoT)

## 🌟 Highlights

- **EndoCoT** is a reasoning paradigm for diffusion models that enables step-by-step inference.
- It outperforms conventional training methods on complex tasks such as Maze, TSP, VSP, and Sudoku.
- It provides transparent, intermediate reasoning trajectories.

## ⚡ Quick Start

### Setup environment

```bash
git clone https://github.com/InternLM/EndoCoT
cd EndoCoT
conda create -n EndoCoT python=3.10
conda activate EndoCoT
pip install -r requirements.txt
```

### Sample Usage (Inference)

To test a single case using the codebase:

```bash
cd test
python test.py \
  --task Maze \
  --model_root /path/to/merged_ckpts \
  --lora_path /path/to/your_lora_weight.safetensors \
  --input_image ./data/sudoku_sample.png \
  --output_dir ./outputs/sudoku_results
```

### Training

1. Download the datasets and `metadata.csv`, and place them in the same directory.
2. Run the training scripts:

```bash
cd DiffSynth-Studio
bash add/Maze/stage1.sh
python change_ckpt_prefix.py --src /path/to/the/Maze/save/dir/Maze_stage1
bash add/Maze/stage2.sh
python change_ckpt_prefix.py --src /path/to/the/Maze/save/dir/Maze_stage2
```

## 📰 News

- 🚀 [2026/3/12] We have released the EndoCoT [repository](https://github.com/InternLM/EndoCoT) and [ckpts](https://huggingface.co/internlm/EndoCoT).

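As a companion to the Training steps in Quick Start, the following is a minimal Python sketch of the two-stage command sequence and of the `metadata.csv` placement check. The helper names `two_stage_commands` and `has_metadata` are hypothetical and not part of the EndoCoT codebase. The sketch also assumes that tasks other than Maze follow the same `add/<task>/stageN.sh` layout and `<task>_stageN` save-directory naming, which the README only shows for Maze.

```python
from pathlib import Path


def two_stage_commands(task: str, save_dir: str) -> list[list[str]]:
    """Build the stage1/stage2 command sequence from the Training section.

    Assumption: every task uses the add/<task>/stageN.sh layout shown for Maze.
    """
    commands = []
    for stage in ("stage1", "stage2"):
        # Each stage is followed by a checkpoint-prefix conversion on its save dir.
        ckpt_dir = Path(save_dir) / f"{task}_{stage}"
        commands.append(["bash", f"add/{task}/{stage}.sh"])
        commands.append(
            ["python", "change_ckpt_prefix.py", "--src", ckpt_dir.as_posix()]
        )
    return commands


def has_metadata(dataset_dir: str) -> bool:
    """Return True if metadata.csv sits directly inside the dataset directory."""
    return (Path(dataset_dir) / "metadata.csv").is_file()


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in two_stage_commands("Maze", "/path/to/the/Maze/save/dir"):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to confirm the paths before running them from inside `DiffSynth-Studio`. Each list can then be passed to `subprocess.run(cmd, check=True)` if automated execution is wanted.
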
## 📖 Citation

```
@article{dai2026endocot,
  title={EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models},
  author={Dai, Xuanlang and Zhou, Yujie and Xing, Long and Bu, Jiazi and Wei, Xilin and Liu, Yuhong and Zhang, Beichen and Chen, Kai and Zang, Yuhang},
  journal={arXiv preprint arXiv:2603.12252},
  year={2026}
}
```

## ⚖️ License

The code in the associated repository is licensed under the **MIT License**. The dataset is licensed under the **CC BY-NC 4.0 License**.

Provided by:
internlm