X-Dance

Name: X-Dance
Creator: maas
Published: 2026-04-29 19:29:16
License: No description available

ModelScope community · Updated 2026-04-29 · Indexed 2025-12-06

Download link:

https://modelscope.cn/datasets/MCG-NJU/X-Dance


Description:

<h2 align="center">SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation</h2>

<a href="https://scholar.google.com/citations?hl=en&user=0lLB3fsAAAAJ">Jiaming Zhang</a> · <a href="https://dblp.org/pid/316/8117.html">Shengming Cao</a> · <a href="https://qianduoduolr.github.io/">Rui Li</a> · <a href="https://openreview.net/profile?id=~Xiaotong_Zhao1">Xiaotong Zhao</a> · <a href="https://scholar.google.com/citations?user=TSMchWcAAAAJ&hl=en&oi=ao">Yutao Cui</a> · <a href="">Xinglin Hou</a> · <a href="https://mcg.nju.edu.cn/member/gswu/en/index.html">Gangshan Wu</a> · <a href="https://openreview.net/profile?id=~Haolan_Chen1">Haolan Chen</a> · <a href="https://scholar.google.com/citations?user=FHvejDIAAAAJ">Yu Xu</a> · <a href="https://scholar.google.com/citations?user=TSMchWcAAAAJ&hl=en&oi=ao">Limin Wang</a> · <a href="https://openreview.net/profile?id=~Kai_Ma4">Kai Ma</a>

<a href="https://arxiv.org/abs/2511.19320"><img src='https://img.shields.io/badge/arXiv-2511.19320-red' alt='Paper PDF'></a> <a href='https://mcg-nju.github.io/steadydancer-web'><img src='https://img.shields.io/badge/Project-Page-blue' alt='Project Page'></a> <a href='https://huggingface.co/MCG-NJU/SteadyDancer-14B'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow'></a> <a href='https://huggingface.co/datasets/MCG-NJU/X-Dance'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-X--Dance-green'></a>

Multimedia Computing Group, Nanjing University | Platform and Content Group (PCG), Tencent

This repository hosts **X-Dance**, the test dataset of the paper "SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation".

SteadyDancer is an animation framework based on the **Image-to-Video paradigm** that ensures **robust first-frame preservation**. Prior *Reference-to-Video* approaches often suffer from identity drift caused by the **spatio-temporal misalignments** common in real-world applications. In contrast, SteadyDancer generates **high-fidelity and temporally coherent** human animations, outperforming existing methods in visual quality and control while **requiring significantly fewer training resources**.

Standard benchmarks, such as TikTok and RealisDance, source both the reference image and the pose sequence from the **same video**. This idealized setup fails to reflect the spatio-temporal misalignment challenges prevalent in real-world applications. To evaluate a model's generalization in such scenarios more robustly, we curated a new **different-source** evaluation dataset, **X-Dance**.

We first collected 12 distinct driving videos: 8 sequences of intricate, highly dynamic dance movements and 4 sequences of low-amplitude daily activities. These sequences are replete with non-ideal real-world factors, such as motion blur, severe occlusion, and drastic pose changes. Tailored to these motions, **we specifically curated a diverse set of reference images to simulate real-world misalignments**. This collection contains:

1. anime characters, to introduce stylistic domain gaps;
2. half-body shots, to represent compositional inconsistencies;
3. cross-gender or anime characters, to simulate significant skeletal structural discrepancies;
4. subjects in distinct postures, to maximize the initial action gap.

By systematically pairing these reference images with the 12 driving videos, we simulate two critical real-world challenges:

1. Spatial pose-structure inconsistency (e.g., an anime character driven by a real-world pose).
2. Temporal discontinuity, specifically the significant gap between the reference pose and the initial driving pose.
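The different-source pairing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the manifest format, the file paths, and the names `EvalPair` and `build_pairs` are hypothetical and do not reflect the dataset's actual layout or the authors' tooling.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EvalPair:
    """One evaluation case: a reference image animated by an unrelated video."""
    reference_image: str  # image whose appearance should be preserved
    driving_video: str    # video that supplies the pose sequence
    motion_type: str      # "dance" or "daily" (hypothetical labels)


def build_pairs(manifest: Dict[str, dict]) -> List[EvalPair]:
    """Expand a per-video manifest into a flat, deterministically ordered list."""
    pairs = []
    for video, entry in sorted(manifest.items()):
        for ref in entry["references"]:
            pairs.append(EvalPair(ref, video, entry["motion_type"]))
    return pairs


# Hypothetical manifest: each driving video lists the reference images
# curated for it. Paths are placeholders, not real files in X-Dance.
manifest = {
    "videos/dance_01.mp4": {
        "motion_type": "dance",
        "references": ["refs/anime_01.png", "refs/half_body_01.png"],
    },
    "videos/daily_01.mp4": {
        "motion_type": "daily",
        "references": ["refs/cross_gender_01.png"],
    },
}

pairs = build_pairs(manifest)
counts = Counter(p.motion_type for p in pairs)
```

Because every reference image comes from a different source than its driving video, each `EvalPair` exercises both the spatial and the temporal misalignment cases, unlike same-video benchmarks where the first frame already matches the initial pose.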
![X-Dance](assets/X-Dance.png)

## 📚 Citation

If you find our paper or this codebase useful for your research, please cite us.

```BibTeX
@misc{zhang2025steadydancer,
  title={SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation},
  author={Jiaming Zhang and Shengming Cao and Rui Li and Xiaotong Zhao and Yutao Cui and Xinglin Hou and Gangshan Wu and Haolan Chen and Yu Xu and Limin Wang and Kai Ma},
  year={2025},
  eprint={2511.19320},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2511.19320},
}
```


Provider:

maas

Created:

2025-12-04
