PreciseCam Dataset
PreciseCam Dataset Overview
Basic Information
- Dataset name: PreciseCam
- Research areas: computer vision, text-to-image generation
- Venue: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025
- Paper: https://arxiv.org/abs/2501.12910
- Project page: https://graphics.unizar.es/projects/PreciseCam2024/
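Dataset Contents
- Size: more than 57,000 images
- Data types: images paired with their text prompts and ground-truth camera parameters
- Intended use: precise control of camera parameters in text-to-image generation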
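Model Access
- Model name: edurnebb/PreciseCam
- Platform: Hugging Face
- Model link: https://huggingface.co/edurnebb/PreciseCam
- Note: the public model differs from the one used in the paper, so results may vary, but the overall behavior is consistent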
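The model files listed above can be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names `model_url` and `download_weights` are hypothetical and not part of the PreciseCam code. It only downloads the repository snapshot. Loading the weights is expected to go through the project's customized Diffusers library, which is not shown here.

```python
REPO_ID = "edurnebb/PreciseCam"  # model name as listed in this document


def model_url(repo_id: str = REPO_ID) -> str:
    """Build the Hugging Face model page URL for a repository id."""
    return f"https://huggingface.co/{repo_id}"


def download_weights(repo_id: str = REPO_ID) -> str:
    """Download the repository snapshot and return its local cache path.

    Hypothetical helper: requires network access and `huggingface_hub`.
    """
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling `download_weights()` returns the directory in the local Hugging Face cache that holds the downloaded files.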
Installation and Usage
- Environment setup (bash):
    conda create -n precisecam --yes
    conda activate precisecam
    bash environment_setup.sh
- Dependency: a customized Diffusers library
- Library link: https://github.com/edurnebernal/diffusers-adapted
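Demo
- Demo tool: Gradio
- Run command (bash):
    python demo.py
- Features:
  - Set the camera parameters (Roll, Pitch, Vertical FOV, ξ)
  - Generate the perspective field (PF-US)
  - Enter a text prompt to generate the final image
- Tested on: NVIDIA GeForce RTX 4070 Ti SUPER (16 GB)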
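To illustrate how the four camera controls could map to a perspective field, here is a rough sketch of the per-pixel latitude, one quantity a perspective field typically stores. It is not the project's implementation. It assumes the standard unified spherical camera model for ξ (with 0 ≤ ξ ≤ 1), a focal length chosen so that the top image border lies at half the vertical FOV off-axis, and particular sign conventions for roll and pitch. `CameraParams`, `pixel_ray`, and `latitude_deg` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraParams:
    """Hypothetical container for the four demo controls (angles in degrees)."""
    roll_deg: float
    pitch_deg: float
    vfov_deg: float
    xi: float = 0.0  # distortion parameter of the unified spherical model


def pixel_ray(u, v, width, height, cam):
    """Unit viewing ray (x right, y up, z forward) for pixel (u, v)."""
    half = math.radians(cam.vfov_deg) / 2.0
    # Assumed focal length (pixels): the top image border lies vfov/2 off-axis.
    f = (height / 2.0) * (math.cos(half) + cam.xi) / math.sin(half)
    mx = (u - (width - 1) / 2.0) / f
    my = -(v - (height - 1) / 2.0) / f  # image rows grow downward
    r2 = mx * mx + my * my
    # Unified spherical unprojection; reduces to a pinhole camera when xi == 0.
    k = (cam.xi + math.sqrt(1.0 + (1.0 - cam.xi ** 2) * r2)) / (1.0 + r2)
    return (k * mx, k * my, k - cam.xi)


def latitude_deg(u, v, width, height, cam):
    """Angle between the pixel's ray and the horizontal plane, in degrees."""
    roll = math.radians(cam.roll_deg)
    pitch = math.radians(cam.pitch_deg)
    # World "up" in camera coordinates (sign conventions are an assumption).
    up = (-math.sin(roll) * math.cos(pitch),
          math.cos(roll) * math.cos(pitch),
          math.sin(pitch))
    x, y, z = pixel_ray(u, v, width, height, cam)
    dot = x * up[0] + y * up[1] + z * up[2]
    # Clamp against rounding error before taking the arcsine.
    return math.degrees(math.asin(max(-1.0, min(1.0, dot))))
```

Under these assumptions the center pixel's latitude equals the pitch regardless of roll or ξ, which gives a quick sanity check: `latitude_deg(32, 32, 65, 65, CameraParams(10, 20, 60, 0.5))` evaluates to the pitch of 20 degrees, up to floating-point rounding.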
Citation
- PreciseCam (bibtex):
    @article{bernal2025precisecam,
      title={PreciseCam: Precise Camera Control for Text-to-Image Generation},
      author={Bernal-Berdun, Edurne and Serrano, Ana and Masia, Belen and Gadelha, Matheus and Hold-Geoffroy, Yannick and Sun, Xin and Gutierrez, Diego},
      journal={arXiv preprint arXiv:2501.12910},
      year={2025}
    }
- Diffusers library (bibtex):
    @misc{von-platen-etal-2022-diffusers,
      author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},
      title = {Diffusers: State-of-the-art diffusion models},
      year = {2022},
      publisher = {GitHub},
      journal = {GitHub repository},
      howpublished = {\url{https://github.com/huggingface/diffusers}}
    }




