nvidia/Kimodo-Motion-Gen-Benchmark
Source: Hugging Face | Updated 2026-04-10 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/nvidia/Kimodo-Motion-Gen-Benchmark
Resource overview:
---
license: cc-by-4.0
language:
- en
tags:
- Kimodo
- nvidia
- human motion generation
pretty_name: Kimodo Human Motion Generation Benchmark
viewer: false
---
# Kimodo Human Motion Generation Benchmark
[Kimodo Codebase](https://github.com/nv-tlabs/kimodo), [Benchmark Documentation](https://research.nvidia.com/labs/sil/projects/kimodo/docs/benchmark/introduction.html)
## Dataset Description:
This dataset provides the necessary metadata to construct the suite of test cases that make up the [Kimodo](https://github.com/nv-tlabs/kimodo) human motion generation benchmark. This includes test cases that evaluate text-following for the text-to-motion task, along with constraint-following for constraint-conditioned motion generation.
The benchmark is constructed from the SOMA uniform version of the [BONES-SEED dataset](https://huggingface.co/datasets/bones-studio/seed) using metadata included in this repo, which includes text prompts derived from our [SEED timeline annotations](https://huggingface.co/datasets/nvidia/SEED-Timeline-Annotations), start/end frames for test case motions, and pose constraint configurations.
To get started with the benchmark, see the [Kimodo documentation](https://research.nvidia.com/labs/sil/projects/kimodo/docs/benchmark/introduction.html).
This repo also includes the train and test splits for the BONES-SEED data that should be used to train models that will be evaluated with the Kimodo benchmark.
This dataset is ready for commercial use.
## Dataset Owner:
NVIDIA Corporation
## Dataset Creation Date:
April 2026
## License/Terms of Use:
This dataset is governed by the [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0/) (CC BY 4.0).
## Intended Usage:
This dataset is intended for researchers and developers training motion generation models on BONES-SEED to evaluate their models in a comprehensive and standardized way.
## Download Instructions:
The easiest way to download the dataset is using Git:
```
git clone git@hf.co:datasets/nvidia/Kimodo-Motion-Gen-Benchmark
```
## Dataset Structure:
For full details of the benchmark, please see the [Kimodo documentation](https://research.nvidia.com/labs/sil/projects/kimodo/docs/benchmark/introduction.html).
Train and test splits are defined in `splits`:
- `train_split_paths.txt` - filenames of training data
- `test_content_split_paths.txt` - filenames for test split containing new semantic "content". This split contains motions with `content_name` (from the BONES-SEED metadata) that are not seen in the training split. This tests model generalization to new semantic motion types, e.g., for text-to-motion generalization.
- `test_repetition_split_paths.txt` - filenames for test split containing new motions from content that was seen in training. This split contains motions where the `content_name` is contained in the training split, but the exact motion itself was not seen. This tests a model's ability to generalize to novel performances of a familiar motion type, e.g., for constraint-following generalization.
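As a quick sanity check on the three split files listed above, the following is a minimal sketch that loads each file and reports any filenames shared between the training split and a test split. It assumes each file holds one filename per line, which the card does not state explicitly; `load_split` and `find_split_overlaps` are hypothetical helper names, not part of the Kimodo codebase.

```python
from pathlib import Path


def load_split(path):
    """Read one split file. Assumption: one motion filename per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def find_split_overlaps(splits_dir):
    """Return, per test split file, the filenames it shares with the training split."""
    splits_dir = Path(splits_dir)
    train = load_split(splits_dir / "train_split_paths.txt")
    overlaps = {}
    for name in ("test_content_split_paths.txt", "test_repetition_split_paths.txt"):
        overlaps[name] = train & load_split(splits_dir / name)
    return overlaps
```

Calling `find_split_overlaps("splits")` from the cloned repository root should return empty sets for both test splits if the files follow the assumed format, since neither test split is meant to contain training motions.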
The benchmark test cases are partitioned into the `content` and `repetition` test splits. Note that the test cases in the benchmark cover a diverse subset of these splits. Within each of the split directories, the structure categorizes test cases into tasks ranging from pure text-to-motion to constraint-conditioned generation with a variety of different constraint types and scenarios:
```text
testsuite
├── content
│   ├── constraints_notext
│   │   ├── end-effectors
│   │   ├── fullbody
│   │   ├── mixture
│   │   └── root
│   ├── constraints_withtext
│   │   ├── end-effectors
│   │   ├── fullbody
│   │   ├── mixture
│   │   └── root
│   └── text2motion
│       ├── overview
│       ├── timeline_multi
│       └── timeline_single
└── repetition
    ├── constraints_notext
    │   ├── end-effectors
    │   ├── fullbody
    │   ├── mixture
    │   └── root
    ├── constraints_withtext
    │   ├── end-effectors
    │   ├── fullbody
    │   ├── mixture
    │   └── root
    └── text2motion
        ├── overview
        ├── timeline_multi
        └── timeline_single
```
At the lowest level of this structure, each leaf folder contains indexed test cases (`0000`, `0001`, `0002`, ...).
For example:
```text
end-effectors/feet_posrot/
├── 0000/
├── 0001/
├── 0002/
...
└── 0255/
```
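To get an overview of how many test cases each leaf folder holds, a directory walk along the following lines may help. This is only a sketch: it treats any directory whose name consists solely of digits (such as `0000`) as a test case, and `index_test_cases` is a hypothetical helper, not a function from the Kimodo codebase.

```python
from collections import defaultdict
from pathlib import Path


def index_test_cases(testsuite_dir):
    """Map each leaf folder (relative to testsuite_dir) to its sorted case indices.

    Assumption: a test case is any directory with an all-digit name, e.g. "0000".
    """
    root = Path(testsuite_dir)
    cases = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_dir() and path.name.isdigit():
            leaf = path.parent.relative_to(root).as_posix()
            cases[leaf].append(path.name)
    return {leaf: sorted(names) for leaf, names in sorted(cases.items())}
```

For example, `sum(len(v) for v in index_test_cases("testsuite").values())` gives the total number of indexed folders found, which can be compared against the test case count reported in this card.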
Each index folder is one standalone test case with its own `meta.json` containing the text prompt and duration, `seed_motion.json` with metadata relating the test case to BONES-SEED, and optionally `seed_constraints.json` defining constraints for the test case. These files are used to process the BONES-SEED dataset to build the full benchmark.
All text prompts in the benchmark are derived from our [SEED timeline annotations](https://huggingface.co/datasets/nvidia/SEED-Timeline-Annotations).
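The per-case files described above can be read with a small loader such as the sketch below. It makes no assumption about the keys inside each JSON file and simply returns the parsed contents, with `None` standing in for an absent `seed_constraints.json`. The name `load_test_case` is hypothetical; the Kimodo documentation remains the reference for building the actual benchmark.

```python
import json
from pathlib import Path


def load_test_case(case_dir):
    """Load the JSON files of one test case folder (e.g. ".../0000")."""
    case_dir = Path(case_dir)

    def read_json(name, required=True):
        path = case_dir / name
        if not path.exists():
            if required:
                raise FileNotFoundError(f"missing {name} in {case_dir}")
            return None  # optional file not present
        with open(path, encoding="utf-8") as f:
            return json.load(f)

    return {
        "meta": read_json("meta.json"),
        "seed_motion": read_json("seed_motion.json"),
        # Only constraint-conditioned test cases are expected to have this file.
        "seed_constraints": read_json("seed_constraints.json", required=False),
    }
```

Raising on a missing `meta.json` or `seed_motion.json` while tolerating a missing `seed_constraints.json` mirrors the card's description of which files are always present and which are optional.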
## Dataset Quantification:
- Dataset file size: 116 MB
- Total number of test cases: 22,474
## References:
- [Kimodo Motion Generation Model](https://research.nvidia.com/labs/sil/projects/kimodo/)
- [BONES-SEED dataset](https://huggingface.co/datasets/bones-studio/seed)
## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal developer teams to ensure this dataset meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://app.intigriti.com/programs/nvidia/nvidiavdp/detail).
Provider:
nvidia
Collected Summary
Dataset Introduction

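Construction
The Kimodo-Motion-Gen-Benchmark dataset is derived from the BONES-SEED dataset, and its construction relies on detailed metadata. This metadata includes text prompts generated from the SEED timeline annotations, start and end frames that delimit each motion sequence, and pose constraint configurations. Combining these elements, the dataset assembles a library of test cases covering two tasks, text-driven motion generation and constraint-conditioned motion generation. The test cases are further partitioned into a content split and a repetition split, which evaluate generalization to new semantic motions and to new performances of known motion types, respectively.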
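Features
The dataset's core feature is its structured, multi-dimensional evaluation suite. It supports pure text-to-motion generation and also covers constraint-conditioned generation, both without and with text, across several constraint types: end-effectors, full-body poses, mixed constraints, and root trajectories. Each test case is self-contained, with metadata, seed motion information, and an optional constraint definition. In total there are more than 22,000 test cases, providing a comprehensive basis for standardized evaluation of motion generation models.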
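Usage
To use the dataset, researchers should first clone the repository with Git to obtain the metadata and the train/test split files. They then combine it with the BONES-SEED dataset and, using the `meta.json`, `seed_motion.json`, and optional `seed_constraints.json` files in each test case folder, process the data and build the full evaluation pipeline within the Kimodo benchmark framework. The official documentation provides detailed usage guidance to help users run these test cases efficiently and accurately for systematic model evaluation.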
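Background and Challenges
Background Overview
Kimodo-Motion-Gen-Benchmark is a landmark human motion generation benchmark created by NVIDIA in April 2026, designed to systematically evaluate motion generation models conditioned on text or constraints. The benchmark is built on the BONES-SEED dataset and incorporates text prompts from the SEED timeline annotations, motion start and end frames, and pose constraint configurations, forming a standardized test suite that covers text-to-motion, constraint-to-motion, and other task types. By dividing the test cases into two subsets, one reflecting new semantic content and one reflecting new motions of known content, the benchmark enables fine-grained assessment of model generalization and offers a credible comparison platform and a reproducible evaluation framework for academic research and industrial applications in motion generation.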
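Current Challenges
First, the core challenge in motion generation is accurately capturing the complex mapping between semantics and motion. In the text-to-motion task in particular, a model must generate natural, coherent human motion sequences that respect physical constraints from diverse text descriptions. Second, building the benchmark requires balancing annotation consistency against sufficient coverage: the text prompts must be semantically diverse and span a wide range of motion categories and constraint types, while data bias must not compromise the fairness of model evaluation. In addition, efficiently integrating the large volume of metadata from BONES-SEED without sacrificing test case quality, and designing test subsets that can distinguish semantic transfer from motion generalization, are the main technical bottlenecks in constructing the dataset.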
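Common Scenarios
Typical Use Cases
In human motion generation research, Kimodo-Motion-Gen-Benchmark serves as a standardized evaluation benchmark, and its most typical use is the systematic evaluation of the fidelity and generalization of text-driven motion generation models. Built on the BONES-SEED dataset, the benchmark uses its test cases to examine how well a model's output aligns semantically with the text in the text-to-motion task, as well as the quality of motion generated under constraints. Researchers can use its text prompts, start and end frame annotations, and pose constraint configurations to run multi-dimensional, fine-grained quantitative analyses of a model's instruction-following ability, providing a reference standard for comparing algorithms for high-fidelity human animation synthesis.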
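Practical Applications
In practice, the human motion generation technology that Kimodo-Motion-Gen-Benchmark helps advance has broad applications. In content production for virtual and augmented reality, the benchmark helps assess whether the digital human motions a model generates are natural, fluid, and consistent with user instructions, improving the realism of immersive experiences. In game development and visual effects, models evaluated with this benchmark can generate character animation automatically from script text or a director's intent, substantially reducing the cost and turnaround time of motion capture. In robot human-machine interaction and rehabilitation training, a motion generation model's constraint-following ability can directly support robot motion planning and the design of personalized rehabilitation programs, turning abstract descriptions into concrete physical motion.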
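Derived and Related Work
A body of related research has grown up around Kimodo-Motion-Gen-Benchmark. Its underlying BONES-SEED dataset and SEED timeline annotations provide rich annotation resources for motion-language alignment and have given rise to a number of works focused on fine-grained temporal motion description and generation. Building on the test paradigm the benchmark defines, researchers have proposed motion generation frameworks that fuse multimodal constraints, models that strengthen temporal consistency for long sequences, and adversarial training methods that balance text fidelity with physical plausibility. This work has improved results on the benchmark's own metrics and, by borrowing its two-dimensional view of generalization, has extended evaluation practice for motion generation to cross-disciplinary settings such as sketch-driven and music-guided generation.
The above content was collected and summarized by 遇见数据集.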



