genfusion

Name: genfusion
Creator: maas
Published: 2025-11-18 21:41:35
License: No description available

ModelScope Community. Updated: 2025-11-18. Indexed: 2025-11-15.

Download link:

https://modelscope.cn/datasets/xycxiyi/genfusion

Description:

# GenFusion: Closing the Loop between Reconstruction and Generation via Videos

[Project page](https://genfusion.sibowu.com) | [Paper](https://arxiv.org/abs/2503.21219) | [Data](https://huggingface.co/datasets/Inception3D/GenFusion_Training_Data) <br>

![Teaser image](assets/pipeline.png)

This repo contains the official implementation for the paper "**GenFusion: Closing the Loop between Reconstruction and Generation via Videos**".

## Installation

```bash
conda env create --file environment.yml
conda activate genfusion

cd Reconstruction
CC=gcc-9 CXX=g++-9 pip install submodules/simple-knn
CC=gcc-9 CXX=g++-9 pip install submodules/diff-surfel-rasterization
```

## Generation Model Training

The generation model is fine-tuned from [DynamiCrafter](https://github.com/Doubiiu/DynamiCrafter).

Step 1. Download the DL3DV Renderings dataset.

Step 2. Download the pretrained models via Hugging Face, and put the `model.ckpt` for the required resolution at `checkpoints/dynamicrafter_512_v1/model.ckpt`.

Step 3. Run the following command to start training at a resolution of 512x320.

```bash
cd ./GenerationModel
sh configs/training_video_v1.0/run_interp.sh
```

Step 4. Fine-tune the model at the higher resolution of 960x512.

```bash
sh configs/training_960_v1.0/run_interp.sh
```

If you only want to use the model for generation inference, you can download our pre-trained model from [here](https://huggingface.co/Sibo2rr/GenFusion-GenerationModel).

## Reconstruction

If you skip the generation model training, you can download our pre-trained model from [here](https://huggingface.co/Sibo2rr/GenFusion-GenerationModel) and put it in the `./diffusion_ckpt` folder.

### Masked 3D Reconstruction

Step 1. The test scenes in our paper are selected from the DL3DV Benchmark and the DL3DV dataset. Download the scenes from [Huggingface](https://huggingface.co/datasets/Inception3D/GenFusion_DL3DV_24Benchmark) and put them in the `./data` folder.

Step 2. To reproduce the quantitative results in our paper, run the following command.

```bash
cd ./Reconstruction
python genfusion_scripts/batch_ours.py [gpu_ids]
```

To run a masked 3D reconstruction on your own scene, use the following command.

```bash
cd ./Reconstruction
# For --patch_size, we recommend half of the image resolution.
python train.py --data_dir [data_dir] \
    -m [output_dir] \
    --iterations 7_000 \
    --test_iterations 7_000 \
    --diffusion_ckpt ./diffusion_ckpt/epoch=59-step=34000.ckpt \
    --diffusion_config ./generation_infer.yaml \
    --num_frames 16 \
    --outpaint_type crop \
    --start_diffusion_iter 3000 \
    --sparse_view 0 \
    --downsample_factor 2 \
    --diffusion_resize_width 960 \
    --diffusion_resize_height 512 \
    --diffusion_crop_width 960 \
    --diffusion_crop_height 512 \
    --patch_size [crop_width] [crop_height] \
    --opacity_reset_interval 9000 \
    --lambda_dist 0.0 \
    --lambda_reg 0.5 \
    --lambda_dssim 0.8 \
    --densify_from_iter 1000 \
    --unconditional_guidance_scale 3.2 \
    --repair
```

### Sparse View Reconstruction

Step 1. Download the Mip-NeRF data and the train/test split from [ReconFusion](https://drive.google.com/drive/folders/10oT2_OQ9Sjh5wlfJQoGx2y7ZKYwpgNg5).

Step 2. Run the following command to start training.

```bash
cd ./Reconstruction
python genfusion_scripts/batch_sparse.py [gpu_ids]
```

To run a sparse view reconstruction on your own scene, a `train_test_split_{sparse_view}.json` file is required in the following format:

```json
{
    "test_ids": [
        0, 8, 12, ...
    ],
    "train_ids": [
        1, 2, 3, ...
    ]
}
```

Then run the following command:

```bash
python train.py \
    --data_dir [data_dir] -m [output_dir] \
    --iterations 7000 \
    --test_iterations 7_000 \
    --diffusion_ckpt ./diffusion_ckpt/epoch=59-step=34000.ckpt \
    --diffusion_config ./generation_infer.yaml \
    --num_frames 16 \
    --outpaint_type sparse \
    --start_diffusion_iter 1000 \
    --sparse_view [num_of_sparse_views] \
    --diffusion_resize_width 960 \
    --diffusion_resize_height 512 \
    --diffusion_crop_width 960 \
    --diffusion_crop_height 512 \
    --mono_depth \
    --repair \
    --densify_from_iter 100 \
    --diffusion_until 7000 \
    --diffusion_every 1000 \
    --densify_until_iter 5000 \
    --densification_interval 500 \
    --opacity_reset_interval 3100 \
    --lambda_dist 10.0 \
    --lambda_dssim 0.5 \
    --lambda_reg 1.0 \
    --unconditional_guidance_scale 3.2
```

### Scene Completion

We provide two options for reconstructing the unseen areas of a scene.

- Option 1: Define a camera path in advance with [RemoteViewer](https://github.com/hwanhuh/2D-GS-Viser-Viewer) and save the trajectory as a JSON file. Using the counter scene from Mip-NeRF360 as an example:

```bash
python train.py \
    --data_dir [data_dir] \
    -m output_ours/counter_completion \
    --test_iterations 7_000 \
    --diffusion_ckpt ./diffusion_ckpt/epoch=59-step=34000.ckpt \
    --diffusion_config ./generation_infer.yaml \
    --num_frames 16 \
    --outpaint_type rotation \
    --add_indices 7 15 \
    --depth_loss \
    --iterations 26000 \
    --diffusion_resize_width 960 \
    --diffusion_resize_height 512 \
    --diffusion_crop_width 960 \
    --diffusion_crop_height 512 \
    --repair \
    --port 6678 \
    --densify_from_iter 500 \
    --densify_until_iter 12000 \
    --diffusion_until 30000 \
    --start_diffusion_iter 5000 \
    --diffusion_every 4000 \
    --opacity_reset_interval 15000 \
    --unconditional_guidance_scale 2.2 \
    --start_dist_iter 3000 \
    --camera_path_file [trajectory file]
```

- Option 2: Use the path parameters in the script to determine the shape of the sampling path.

```bash
# Set --path_scale, --rotation_angle, --position_z_offset, and --distance for your own scene.
python train.py \
    --data_dir [data_dir] \
    -m [output_dir] \
    --test_iterations 7_000 \
    --diffusion_ckpt ./diffusion_ckpt/epoch=59-step=34000.ckpt \
    --diffusion_config ./generation_infer.yaml \
    --num_frames 16 \
    --outpaint_type rotation \
    --add_indices 7 15 \
    --depth_loss \
    --iterations 26000 \
    --diffusion_resize_width 960 \
    --diffusion_resize_height 512 \
    --diffusion_crop_width 960 \
    --diffusion_crop_height 512 \
    --port 6691 \
    --densify_from_iter 500 \
    --densify_until_iter 12000 \
    --diffusion_until 30000 \
    --start_diffusion_iter 5000 \
    --diffusion_every 4000 \
    --opacity_reset_interval 15000 \
    --repair \
    --path_scale 1.0 \
    --rotation_angle 90 \
    --position_z_offset 0.5 \
    --distance -0.2 \
    --unconditional_guidance_scale 2.2 \
    --start_dist_iter 3000
```

## Citation

If you find this work useful in your research, please consider citing:

```bibtex
@inproceedings{Wu2025GenFusion,
    author = {Sibo Wu and Congrong Xu and Binbin Huang and Andreas Geiger and Anpei Chen},
    title = {GenFusion: Closing the Loop between Reconstruction and Generation via Videos},
    booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2025}
}
```

## Acknowledgements

This project is built upon [2DGS](https://github.com/hbb1/2d-gaussian-splatting) and [DynamiCrafter](https://github.com/Doubiiu/DynamiCrafter). The dataset processing is based on [gsplat](https://github.com/nerfstudio-project/gsplat/tree/main/gsplat). We thank all the authors for their great repos. Special thanks to [LiangrunDa](https://github.com/LiangrunDa) for his help with the project website and part of the code.
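
For the sparse view workflow on a custom scene, the `train_test_split_{sparse_view}.json` file described above can be produced with a short script. The following is a minimal sketch and not part of the repository: the `write_split` helper, the every-8th-image test holdout, the even spacing of the training views, and the idea that the ids index the scene's images in sorted order are all assumptions made for illustration. Where `train.py` expects the file to live is also not stated here, so the output directory is left to the caller.

```python
import json
from pathlib import Path


def write_split(num_images, sparse_view, out_dir, test_every=8):
    """Write train_test_split_{sparse_view}.json into an existing directory.

    Hypothetical helper: the holdout rule and the view spacing are
    illustrative assumptions, not the selection used in the paper.
    """
    all_ids = list(range(num_images))
    # Assumption: hold out every `test_every`-th image for testing.
    test_ids = all_ids[::test_every]
    held_out = set(test_ids)
    candidates = [i for i in all_ids if i not in held_out]
    if not 0 < sparse_view <= len(candidates):
        raise ValueError("sparse_view must be between 1 and the number of non-test images")
    # Spread the requested number of training views evenly over the rest.
    step = len(candidates) / sparse_view
    train_ids = [candidates[int(k * step)] for k in range(sparse_view)]
    path = Path(out_dir) / f"train_test_split_{sparse_view}.json"
    path.write_text(json.dumps({"test_ids": test_ids, "train_ids": train_ids}, indent=4))
    return path
```

For a 24-image scene, `write_split(24, 3, "data/my_scene")` writes `data/my_scene/train_test_split_3.json` with `test_ids` of `[0, 8, 16]` and `train_ids` of `[1, 9, 17]`, provided the directory already exists. The same count would then be passed to `--sparse_view`.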


Provider: maas

Created: 2025-11-08
