One-to-All-sub
Source: ModelScope community (updated 2026-01-07, indexed 2026-01-10)
Download link: https://modelscope.cn/datasets/MochunniaN1/One-to-All-sub
# One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
This repository contains the sample training data and benchmarks associated with the paper [One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer](https://huggingface.co/papers/2511.22940).
The paper presents a unified framework for high-fidelity character animation and image pose transfer for references with arbitrary layouts, addressing spatial misalignment and partially visible references through innovative techniques.
- 🌐 [Project Page](https://ssj9596.github.io/one-to-all-animation-project/)
- 💻 [GitHub Repository](https://github.com/ssj9596/One-to-All-Animation)
## 🌟 Highlights
We provide a **complete and reproducible** training and evaluation pipeline:
- ✅ **Full Training Code**: Three-stage progressive training from scratch
- ✅ **Complete Benchmarks**: Reproduction code and pre-trained checkpoints
- ✅ **Flexible Training Codebase**: Multi-resolution, multi-aspect-ratio, and multi-frame training
- ✅ **Datasets**: Pre-processed open-source datasets + self-collected cartoon data
## ☕️ Quick Inference (Sample Usage)
To perform quick inference with the models, follow these steps from the [GitHub repository](https://github.com/ssj9596/One-to-All-Animation):
### 🔧 Dependencies and Installation
1. Clone Repo
```bash
git clone https://github.com/ssj9596/One-to-All-Animation.git
cd One-to-All-Animation
```
2. Create Conda Environment and Install Dependencies
```bash
# Create a new conda environment
conda create -n one-to-all python=3.12
conda activate one-to-all
# Install PyTorch from the official CUDA 12.4 wheel index
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
# Or install PyTorch via the Aliyun PyPI mirror
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 -i https://mirrors.aliyun.com/pypi/simple/
# Install the Python dependencies
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
# (Recommended) Install FlashAttention-3 (or FlashAttention-2) from source:
# https://github.com/Dao-AILab/flash-attention
```
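After installation, a quick check can confirm that the pinned versions were actually resolved. The following is a minimal sketch and not part of the repository: the helper name `check_versions` is hypothetical, and it only reads installed package metadata, so it does not verify that CUDA or Flash Attention work at runtime.

```python
from importlib import metadata

# Versions pinned in the install commands above.
PINNED = {"torch": "2.5.1", "torchvision": "0.20.1", "torchaudio": "2.5.1"}


def check_versions(pinned):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Wheels from the PyTorch index carry a local suffix such as "+cu124",
        # so compare only the public part of the version string.
        public = installed.split("+")[0] if installed else None
        report[name] = (installed, public == wanted)
    return report


for package, (installed, ok) in check_versions(PINNED).items():
    status = "ok" if ok else "MISMATCH or missing"
    print(f"{package}: installed={installed} pinned={PINNED[package]} -> {status}")
```

Run it inside the activated `one-to-all` environment; any line reporting a mismatch suggests that the install step picked up a different build than the one pinned above.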
## 🎬 Training from scratch
> 💡 **Data Collection Required**: We find that current open-source datasets are insufficient for training from scratch. We strongly recommend collecting *at least 3,000 additional high-quality video samples* for better results.
We divide the training process into several steps to help you reproduce our results from scratch (using 1.3B as an example).
1. Download Pretrained Models
Download the base model from HuggingFace: [Wan-AI/Wan2.1-T2V-1.3B-Diffusers](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B-Diffusers)
2. Download Training Datasets and Pose Pool
```bash
cd datasets
bash setup_datasets.sh
```
This will download and prepare:
- Training datasets (open-source + cartoon): `datasets/opensource_dataset/`
- Pose pool for face enhancement: `datasets/opensource_pose_pool/`
<details>
<summary>Manual Download Links</summary>
- [opensource_dataset](https://huggingface.co/datasets/MochunniaN1/One-to-All-sub/tree/main/opensource_dataset)
- [opensource_pose_pool](https://huggingface.co/datasets/MochunniaN1/One-to-All-sub/tree/main/opensource_pose_pool)
</details>
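If the setup script cannot be used, the two downloads above can also be fetched with the `huggingface_hub` client. This is a sketch under stated assumptions: `download_all` and `dataset_patterns` are hypothetical helpers, the target directories are illustrative, and it only fetches the raw files, so any extra preparation that `setup_datasets.sh` performs is not reproduced here.

```python
BASE_MODEL_REPO = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
DATASET_REPO = "MochunniaN1/One-to-All-sub"
DATASET_SUBFOLDERS = ("opensource_dataset", "opensource_pose_pool")


def dataset_patterns(subfolders=DATASET_SUBFOLDERS):
    """Glob patterns that restrict the download to the listed subfolders."""
    return [f"{name}/*" for name in subfolders]


def download_all(model_dir, datasets_dir):
    """Fetch the base model and the two dataset folders (paths are assumptions)."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=BASE_MODEL_REPO, local_dir=str(model_dir))
    snapshot_download(
        repo_id=DATASET_REPO,
        repo_type="dataset",
        allow_patterns=dataset_patterns(),
        local_dir=str(datasets_dir),
    )
```

A call such as `download_all("pretrained_models/Wan2.1-T2V-1.3B-Diffusers", "datasets")` would place the dataset folders under `datasets/`, matching the paths listed above. The model directory name is an assumption; check the training scripts for the path they expect.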
3. Training
We provide three-stage training scripts:
* Stage 1: Reference Extractor
```bash
cd video-generation
bash training_scripts/train1.3b_only_refextractor_2d.sh
# Convert checkpoint to FP32
cd outputs_wanx1.3b/train1.3b_only_refextractor_2d/checkpoint-xxx
mkdir fp32_model_xxx
python zero_to_fp32.py . fp32_model_xxx --safe_serialization
# Run inference (update model path in inference_refextractor.py first)
cd ../../../
# Edit inference_refextractor.py and change ckpt_path to:
# ./outputs_wanx1.3b/train1.3b_only_refextractor_2d/checkpoint-xxx/fp32_model_xxx
python inference_refextractor.py
```
* Stage 2: Pose Control
```bash
bash training_scripts/train1.3b_posecontrol_prefix_2d.sh
```
* Stage 3: Token Replace for long-video generation
```bash
bash training_scripts/train1.3b_posecontrol_prefix_2d_tokenreplace.sh
```
> 💡 **Training Notes**:
> - **Each stage uses different training resolutions** - check the scripts for specific resolution settings
> - **Fine-tuning from our checkpoints**: If you want to continue training from our pre-trained models, directly use the *Stage 3 script* and modify the checkpoint path
<br>
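The Stage 1 commands use `checkpoint-xxx` as a placeholder. The sketch below locates the most recent checkpoint directory and derives the matching FP32 folder name. It assumes, without confirmation from the source, that `xxx` is a numeric training step; `latest_checkpoint` and `fp32_dir_for` are hypothetical helpers, not part of the repository.

```python
import re
from pathlib import Path

# Matches directory names such as "checkpoint-1000" (numeric step is an assumption).
_CKPT = re.compile(r"checkpoint-(\d+)$")


def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> directory with the highest step, or None."""
    candidates = []
    for path in Path(output_dir).iterdir():
        match = _CKPT.match(path.name)
        if path.is_dir() and match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def fp32_dir_for(checkpoint_dir):
    """Mirror the naming used above: checkpoint-<step>/fp32_model_<step>."""
    step = _CKPT.match(Path(checkpoint_dir).name).group(1)
    return Path(checkpoint_dir) / f"fp32_model_{step}"
```

For example, `fp32_dir_for(latest_checkpoint("outputs_wanx1.3b/train1.3b_only_refextractor_2d"))` would give the path to paste into `ckpt_path` after running `zero_to_fp32.py`.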
## 📊 Reproduce Paper Results
We provide scripts to reproduce the quantitative results reported in our paper.
1. Download Benchmark
```bash
cd benchmark
bash setup_datasets.sh
```
2. Prepare Model Input
```bash
cd ../video-generation
python reproduce/infer_preprocess.py
```
3. Run Inference
We provide inference scripts for different model sizes and datasets:
```bash
# TikTok dataset
python reproduce/inference_tiktok1.3b.py # 1.3B model
python reproduce/inference_tiktok14b.py # 14B model
# Cartoon dataset
python reproduce/inference_cartoon1.3b.py # 1.3B model
python reproduce/inference_cartoon14b.py # 14B model
```
4. Prepare gt/pred pairs for Judge
```bash
cd ../benchmark
# TikTok dataset
python prepare_eval_frames_tiktok.py
# Cartoon dataset
python prepare_eval_frames_cartoon.py
```
5. Run judge
```bash
# First set up the DisCo environment and download the LPIPS and FVD checkpoints used by the judge
cd DisCo
# TikTok dataset
bash eval_tiktok.sh
python summary.py
```
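The five steps above can be chained in a small driver script. The sketch below is illustrative only: `STEPS` and `run_pipeline` are hypothetical names, the commands and working directories are copied from the steps above, and it covers only the 1.3B model on the TikTok benchmark. It assumes the script is given the repository root and that the DisCo environment and checkpoints are already prepared.

```python
import subprocess
from pathlib import Path

# (working directory relative to the repo root, command) for each step above.
STEPS = [
    ("benchmark", ["bash", "setup_datasets.sh"]),
    ("video-generation", ["python", "reproduce/infer_preprocess.py"]),
    ("video-generation", ["python", "reproduce/inference_tiktok1.3b.py"]),
    ("benchmark", ["python", "prepare_eval_frames_tiktok.py"]),
    ("benchmark/DisCo", ["bash", "eval_tiktok.sh"]),
    ("benchmark/DisCo", ["python", "summary.py"]),
]


def run_pipeline(repo_root, steps=STEPS, dry_run=True):
    """Run each step in order and stop at the first failure.

    With dry_run=True nothing is executed; the planned (cwd, command)
    pairs are returned so they can be inspected first.
    """
    planned = []
    for workdir, command in steps:
        cwd = Path(repo_root) / workdir
        planned.append((cwd, command))
        if not dry_run:
            # check=True raises CalledProcessError if a step exits non-zero.
            subprocess.run(command, cwd=cwd, check=True)
    return planned
```

Calling `run_pipeline(".")` returns the plan without running anything; pass `dry_run=False` to execute it. To evaluate the 14B model or the Cartoon benchmark, swap in the corresponding script names from the steps above.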
Provider: maas
Created: 2025-12-12



