
erickfm/mimic-melee-by-character

Source: Hugging Face. Updated 2026-03-23; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/erickfm/mimic-melee-by-character
Description:
---
language:
- en
license: cc0-1.0
tags:
- melee
- smash-bros
- slippi
- imitation-learning
- controller-inputs
- fighting-games
- pytorch
pretty_name: MIMIC Melee (By Character)
size_categories:
- 100K<n<1M
---

# MIMIC Melee: By Character

Per-character split of [erickfm/mimic-melee](https://huggingface.co/datasets/erickfm/mimic-melee). Each character has its own subdirectory with ready-to-train PyTorch tensor shards, enabling character-specific imitation-learning models for Super Smash Bros. Melee.

## Source

Built from [erickfm/mimic-melee](https://huggingface.co/datasets/erickfm/mimic-melee) using `tools/split_by_character.py` in the [MIMIC](https://github.com/erickfm/MIMIC) repo. Each game is assigned to a character based on the majority `self_character` value across frames.

All preprocessing (normalization, categorical encoding, stick discretization) is inherited from the source dataset; the data is identical, just partitioned.

## Dataset statistics

| Split | Games | Frames |
|-------|-------|--------|
| Train | 169,575 | 1,631,777,124 |
| Val | 18,841 | 180,971,668 |
| **Total** | **188,416** | **1,812,748,792** |

- **Characters:** 26 (all playable characters with tournament representation)
- **Total size:** ~2.59 TB
- **Shard size:** ~1-2 GB each
- **Val split:** 10% (seed 42)

## Character distribution

| Character | Games | Train Frames | % of Data |
|-----------|-------|-------------|-----------|
| FOX | 47,539 | 443,452,345 | 27.2% |
| FALCO | 32,007 | 291,899,230 | 17.8% |
| MARTH | 25,113 | 244,108,914 | 15.0% |
| CPTFALCON | 22,481 | 206,274,159 | 12.6% |
| SHEIK | 11,879 | 123,885,020 | 7.6% |
| PEACH | 6,137 | 66,668,337 | 4.1% |
| JIGGLYPUFF | 4,613 | 47,421,314 | 2.9% |
| SAMUS | 3,188 | 36,084,503 | 2.2% |
| ICE_CLIMBERS | 2,350 | 27,021,337 | 1.7% |
| GANONDORF | 2,665 | 24,107,666 | 1.5% |
| LUIGI | 2,236 | 22,938,883 | 1.4% |
| PIKACHU | 1,500 | 16,196,379 | 1.0% |
| YOSHI | 1,435 | 15,183,083 | 0.9% |
| DOC | 1,179 | 12,019,765 | 0.8% |
| DK | 960 | 10,328,792 | 0.6% |
| LINK | 706 | 7,530,259 | 0.5% |
| MARIO | 647 | 6,756,811 | 0.4% |
| GAMEANDWATCH | 564 | 5,227,122 | 0.3% |
| ZELDA | 502 | 4,972,844 | 0.3% |
| KIRBY | 366 | 4,021,788 | 0.2% |
| YLINK | 364 | 3,739,326 | 0.2% |
| ROY | 368 | 3,662,011 | 0.2% |
| NESS | 271 | 3,136,275 | 0.2% |
| MEWTWO | 228 | 2,434,987 | 0.2% |
| BOWSER | 166 | 1,678,142 | 0.1% |
| PICHU | 111 | 1,027,832 | 0.1% |

## Directory structure

```
├── characters.json          # Index of all characters with game/frame counts
├── FOX/
│   ├── tensor_manifest.json # Shard list and counts for this character
│   ├── norm_stats.json      # Per-column normalization (shared across characters)
│   ├── cat_maps.json        # Categorical mappings (shared)
│   ├── stick_clusters.json  # Stick cluster centers (shared)
│   ├── train_shard_000.pt
│   ├── train_shard_001.pt
│   ├── ...
│   └── val_shard_000.pt
├── FALCO/
│   └── ...
└── ...
```

## Shard format

Same as [erickfm/mimic-melee](https://huggingface.co/datasets/erickfm/mimic-melee). Each `.pt` file contains:

```python
{
    "states": {feature_name: Tensor},  # normalized game-state features
    "targets": {head_name: Tensor},    # controller-input targets
    "offsets": [int, ...],             # game boundary indices
    "n_games": int,
}
```

## Usage

Download a single character:

```python
from huggingface_hub import snapshot_download

snapshot_download(
    "erickfm/mimic-melee-by-character",
    repo_type="dataset",
    local_dir="data/fox",
    allow_patterns=["FOX/*"],
)
```

Then train:

```bash
python train.py --data-dir data/fox/FOX --model medium --seq-len 60
```

## Related

- [erickfm/mimic-melee](https://huggingface.co/datasets/erickfm/mimic-melee): Full dataset (all characters combined)
- [erickfm/mimic-melee-subset](https://huggingface.co/datasets/erickfm/mimic-melee-subset): Small subset for quick experiments
- [MIMIC](https://github.com/erickfm/MIMIC): Imitation-learning bot trained on this data
- [Slippi](https://slippi.gg/): Melee netplay client

## License

CC0 1.0 (public domain).
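## Example: iterating over games in a shard

The shard format above stores every game's frames in shared tensors, with `offsets` marking the game boundaries. The sketch below shows one way to split a loaded shard back into per-game slices. It rests on assumptions the dataset card does not confirm: that all tensors are concatenated along dimension 0, and that `offsets` holds each game's starting frame, optionally followed by the total frame count. The helper names `load_shard`, `game_slices`, and `iter_games` are hypothetical and not part of the MIMIC repo.

```python
def load_shard(path):
    """Load one .pt shard onto the CPU."""
    # Imported lazily so the slicing helpers below work without PyTorch.
    import torch

    return torch.load(path, map_location="cpu")


def game_slices(offsets, n_frames):
    """Turn game-boundary indices into (start, end) frame ranges.

    Assumption: `offsets` lists the first frame of each game and may or may
    not end with the total frame count as a closing boundary.
    """
    bounds = [int(o) for o in offsets]
    if not bounds or bounds[-1] != n_frames:
        bounds.append(n_frames)  # add the closing boundary if it is absent
    return list(zip(bounds[:-1], bounds[1:]))


def iter_games(shard):
    """Yield (states, targets) dicts sliced to a single game at a time."""
    # Assumption: every tensor shares the same length along dimension 0.
    n_frames = len(next(iter(shard["targets"].values())))
    for start, end in game_slices(shard["offsets"], n_frames):
        states = {name: t[start:end] for name, t in shard["states"].items()}
        targets = {name: t[start:end] for name, t in shard["targets"].items()}
        yield states, targets
```

With the download from the Usage section in place, `iter_games(load_shard("data/fox/FOX/train_shard_000.pt"))` would walk the first Fox training shard game by game. Comparing the number of yielded games against `shard["n_games"]` is a cheap way to check whether the assumed `offsets` layout matches the real files.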
Provider:
erickfm