Eyeline-Labs/Vista4D-Eval-Data

Name: Eyeline-Labs/Vista4D-Eval-Data
Creator: Eyeline-Labs
Published: 2026-04-24 01:56:45
License: apache-2.0

Source: Hugging Face. Updated 2026-04-24; indexed 2026-05-10.

Download link:

https://hf-mirror.com/datasets/Eyeline-Labs/Vista4D-Eval-Data

Description:

---
license: apache-2.0
language:
- en
---

# Vista4D: Video Reshooting with 4D Point Clouds (CVPR 2026 Highlight) – Evaluation Dataset

[![Project Page](https://img.shields.io/badge/Project-Page-yellow?logo=data:image/svg%2Bxml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJ5ZWxsb3ciIHN0cm9rZS13aWR0aD0iMiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIj48Y2lyY2xlIGN4PSIxMiIgY3k9IjEyIiByPSIxMCIvPjxsaW5lIHgxPSIyIiB5MT0iMTIiIHgyPSIyMiIgeTI9IjEyIi8+PHBhdGggZD0iTTEyIDJhMTUuMyAxNS4zIDAgMCAxIDQgMTAgMTUuMyAxNS4zIDAgMCAxLTQgMTAgMTUuMyAxNS4zIDAgMCAxLTQtMTAgMTUuMyAxNS4zIDAgMCAxIDQtMTB6Ii8+PC9zdmc+)](https://eyeline-labs.github.io/Vista4D)
[![Paper](https://img.shields.io/badge/Paper-arXiv-b31b1b?logo=arxiv&logoColor=red)](https://arxiv.org/abs/2604.21915)
[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Vista4D-blue)](https://huggingface.co/Eyeline-Labs/Vista4D)
[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Eval%20Data-blue)](https://huggingface.co/datasets/Eyeline-Labs/Vista4D-Eval-Data)

[Kuan Heng Lin](https://kuanhenglin.github.io)<sup>1,3&lowast;</sup>, [Zhizheng Liu](https://bosmallear.github.io)<sup>1,4&lowast;</sup>, [Pablo Salamanca](https://pablosalaman.ca)<sup>1,2</sup>, [Yash Kant](https://yashkant.github.io)<sup>1,2</sup>, [Ryan Burgert](https://ryanndagreat.github.io)<sup>1,2,5&lowast;</sup>, [Yuancheng Xu](https://yuancheng-xu.github.io)<sup>1,2</sup>, [Koichi Namekata](https://kmcode1.github.io)<sup>1,2,6&lowast;</sup>, [Yiwei Zhao](https://zhaoyw007.github.io)<sup>2</sup>, [Bolei Zhou](https://boleizhou.github.io)<sup>4</sup>, [Micah Goldblum](https://goldblum.github.io)<sup>3</sup>, [Paul Debevec](https://www.pauldebevec.com)<sup>1,2</sup>, [Ning Yu](https://ningyu1991.github.io)<sup>1,2</sup>

<sup>1</sup>Eyeline Labs, <sup>2</sup>Netflix, <sup>3</sup>Columbia University, <sup>4</sup>UCLA, <sup>5</sup>Stony Brook University, <sup>6</sup>University of Oxford

<sup>&lowast;</sup>*Work done during an internship at Eyeline Labs*

<div align="center">
<video controls autoplay muted style="width: 100%;" src="https://media.githubusercontent.com/media/Eyeline-Labs/Vista4D/website/media/vista4d.mp4"></video>
</div>

**Vista4D** is a *video reshooting* framework that synthesizes the dynamic scene represented by an input source video from novel camera trajectories and viewpoints. We bridge the distribution shift between training and inference for point-cloud-grounded video reshooting: by training on noisy, reconstructed multiview videos, Vista4D becomes robust to point cloud artifacts from imprecise 4D reconstruction of real-world videos. Our 4D point cloud with temporally-persistent static points also explicitly preserves scene content and improves camera control. Vista4D generalizes to real-world applications such as dynamic scene expansion (casual video capture of a scene as background reference), 4D scene recomposition (point cloud editing), and long video inference with memory.

This is the Hugging Face repository containing our evaluation dataset. We provide 110 video-camera pairs to evaluate Vista4D. We select 13 videos from [DAVIS](https://davischallenge.org/) and 38 videos from [Pexels](https://www.pexels.com/), for 51 source videos in total. We use [Pi3](https://yyfz.github.io/pi3/) for 4D reconstruction and [Grounded SAM 2](https://github.com/IDEA-Research/Grounded-SAM-2) for dynamic pixel segmentation. Then, for each video, we hand-design two to three target cameras using our camera UI.

To download the Vista4D evaluation dataset into `./eval_data/`, run the following from the root directory of the project:

```bash
huggingface-cli download Eyeline-Labs/Vista4D-Eval-Data --repo-type dataset --local-dir eval_data
```

Then extract the contents:

```bash
tar -xvf eval_data/eval_data.tar -C eval_data/
```

The extracted data should have the following structure:

```
eval_data/
    metadata.csv
    recon_and_seg/  # 4D reconstruction and dynamic mask segmentation
        avocado-slice/  # There should be 51 total videos
            cameras.npz  # Source intrinsics and extrinsics
            video.mp4
            depths/
                00000.exr
                ...
            dynamic_mask/
                00000.png
                ...
            sky_mask/  # Sky segmentation (to set them to a large depth)
                00000.png
                ...
        [video_name]/
            ...
        ...
    cameras/
        avocado-slice/  # Two to three target cameras per video
            close-crane-above.npz
            left-front-zoom.npz
        [video_name]/
            [camera_name].npz
            ...
        ...
```

`metadata.csv` contains the following columns:

- `name`: Name of the video-camera pair, in the format `[video]_[camera]`
- `video`: Name of the source video; its 4D reconstruction and segmentation can be found in `eval_data/recon_and_seg/[video]/`
- `camera`: Name of the target camera for that `video`; the camera file can be found at `eval_data/cameras/[video]/[camera].npz`
- `seed`: Randomly generated, fixed seed for evaluation
- `prompt`: Prompt for the video-camera pair, usually just the prompt of the source video
- `dynamic`: Dynamic keywords used to obtain the segmentation map
- `do_sky_seg`: Whether the video contains sky (and thus needs the sky segmented separately)
- `source`: Source of the video, `davis` or `pexels`
- `video_id`: For videos from `pexels` only, the original ID of the video on Pexels; the full link is `https://www.pexels.com/video/[video_id]`

Instructions for using this dataset, along with model weights, more results, and the paper, can be found on our [project page](https://eyeline-labs.github.io/Vista4D/) and [GitHub repository](https://github.com/Eyeline-Labs/Vista4D/tree/main).
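As a sanity check after extraction, the layout and the `metadata.csv` columns described above can be tied together with a few lines of Python. This is a minimal sketch, not part of the Vista4D codebase: the helper names `load_pairs`, `npz_array_names`, and `missing_files` are hypothetical, and the column names are assumed to match the list above. The array names stored inside the `.npz` files are not documented here, so the sketch lists them rather than assuming any; it relies only on the fact that a `.npz` file is a zip archive of `.npy` entries.

```python
import csv
import zipfile
from pathlib import Path


def load_pairs(root):
    """Read metadata.csv and attach the on-disk paths each row points to."""
    root = Path(root)
    pairs = []
    with open(root / "metadata.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            video, camera = row["video"], row["camera"]
            recon_dir = root / "recon_and_seg" / video
            pairs.append({
                **row,  # keep every CSV column as a string, unparsed
                "recon_dir": recon_dir,
                "source_cameras": recon_dir / "cameras.npz",
                "source_video": recon_dir / "video.mp4",
                "target_camera": root / "cameras" / video / f"{camera}.npz",
            })
    return pairs


def npz_array_names(path):
    """List the array names in a .npz file without loading any array data."""
    with zipfile.ZipFile(path) as zf:
        return [n[:-4] for n in zf.namelist() if n.endswith(".npy")]


def missing_files(pairs):
    """Return the expected paths that do not exist on disk."""
    keys = ("source_cameras", "source_video", "target_camera")
    return [p[k] for p in pairs for k in keys if not p[k].exists()]
```

Calling `load_pairs("eval_data")` on a complete download should yield 110 entries covering 51 distinct `video` values, and `missing_files` should return an empty list. Once `npz_array_names` has shown which arrays a camera file holds, `numpy.load` can read them.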
