milkkarten/pokemon-showdown-replays-merged
Source: Hugging Face. Updated 2025-12-31. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/milkkarten/pokemon-showdown-replays-merged
Resource description:
# Pokemon Showdown Replays - Merged Dataset
A merged dataset of Pokemon Showdown battle replays from multiple sources.
## Statistics
- **Total Replays**: 29,057,184
- **Sources**:
  - sethkarten: 3,909,792 replays
  - metamon: 1,978,010 replays
  - holidayougi: 23,169,382 replays
## Top Formats
| Format | Count |
|--------|-------|
| [Gen 9] OU | 3,200,571 |
| [Gen 6] OU | 2,977,289 |
| [Gen 7] OU | 2,578,070 |
| [Gen 7] RANDOMBATTLE | 2,005,704 |
| [Gen 6] RANDOMBATTLE | 1,598,899 |
| [Gen 9] VGC 2025 | 836,958 |
| [Gen 9] National Dex | 714,872 |
| [Gen 7] ANYTHINGGOES | 712,081 |
| [Gen 9] RANDOMBATTLE | 703,345 |
| [Gen 9] NATIONALDEX OU | 678,509 |
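
The table uses display-style names such as `[Gen 9] OU`, while the schema example gives the format as `"gen9ou"`. A tally like the one above could be sketched as follows. It assumes each replay is a mapping with a `format` key; `to_id` and `count_formats` are hypothetical helper names, and `to_id` follows the Pokemon Showdown ID convention of lowercasing and dropping non-alphanumeric characters. Whether the dataset stores display names or IDs is not stated here, so normalizing first is a precaution rather than a requirement.

```python
import re
from collections import Counter
from typing import Iterable, Mapping


def to_id(name: str) -> str:
    """Lowercase the name and drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def count_formats(replays: Iterable[Mapping]) -> Counter:
    """Tally replays per normalized format ID (hypothetical helper)."""
    return Counter(to_id(replay["format"]) for replay in replays)


# Tiny in-memory sample standing in for real rows.
sample = [
    {"format": "[Gen 9] OU"},
    {"format": "gen9ou"},
    {"format": "[Gen 7] RANDOMBATTLE"},
]
format_counts = count_formats(sample)
```

With normalization, both spellings of Gen 9 OU land in the same bucket.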
## Filters Applied
- Removed games with <= 4 turns
- Removed games with >= 96 turns
- Deduplicated across all sources using content-based hashing
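
The filters above can be approximated with a short sketch. This is not the author's pipeline: the card does not say which content is hashed or which hash function is used, so hashing the raw `log` text with SHA-256 is an assumption, and `content_hash` and `filter_and_dedupe` are hypothetical names.

```python
import hashlib
from typing import Iterable, Iterator, Mapping

MIN_TURNS_EXCLUSIVE = 4   # games with <= 4 turns are removed
MAX_TURNS_EXCLUSIVE = 96  # games with >= 96 turns are removed


def content_hash(log: str) -> str:
    """Assumption: identify a replay by the SHA-256 of its raw log text."""
    return hashlib.sha256(log.encode("utf-8")).hexdigest()


def filter_and_dedupe(replays: Iterable[Mapping]) -> Iterator[Mapping]:
    """Yield replays with 4 < turns < 96, keeping the first copy of each log."""
    seen = set()
    for replay in replays:
        if not (MIN_TURNS_EXCLUSIVE < replay["turns"] < MAX_TURNS_EXCLUSIVE):
            continue
        digest = content_hash(replay["log"])
        if digest in seen:
            continue
        seen.add(digest)
        yield replay


sample = [
    {"id": "a", "turns": 4, "log": "|turn|4"},    # too short
    {"id": "b", "turns": 5, "log": "|turn|5"},
    {"id": "c", "turns": 5, "log": "|turn|5"},    # same content as "b"
    {"id": "d", "turns": 95, "log": "|turn|95"},
    {"id": "e", "turns": 96, "log": "|turn|96"},  # too long
]
kept_ids = [replay["id"] for replay in filter_and_dedupe(sample)]
```

Hashing content rather than comparing `id` values catches the same battle appearing under different identifiers in different sources.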
## Schema
Each replay contains:
- `id`: Unique replay identifier
- `format`: Battle format (e.g., "gen9ou")
- `players`: Player usernames
- `rating`: Average rating of players (if available)
- `log`: Full battle log
- `turns`: Number of turns in the battle
- `source`: Origin dataset (sethkarten, metamon, holidayougi)
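
As an illustration of how the `log` and `turns` fields relate, here is a minimal sketch that reads the turn count and winner out of a log. It assumes `log` holds raw Pokemon Showdown protocol text, where lines are pipe-delimited (for example `|turn|3` and `|win|USERNAME`), and that the dataset's `turns` value matches the highest turn marker. Neither assumption is confirmed by the card, and `summarize_log` is a hypothetical helper.

```python
from typing import Optional, Tuple


def summarize_log(log: str) -> Tuple[int, Optional[str]]:
    """Return (highest turn number seen, winner name or None)."""
    turns = 0
    winner = None
    for line in log.splitlines():
        parts = line.split("|")
        # Protocol lines start with "|", so parts[0] is the empty string.
        if len(parts) < 3 or parts[0] != "":
            continue
        kind = parts[1]
        if kind == "turn" and parts[2].isdigit():
            turns = max(turns, int(parts[2]))
        elif kind == "win":
            winner = parts[2]
    return turns, winner


example_log = "\n".join([
    "|player|p1|Alice|1",
    "|player|p2|Bob|2",
    "|turn|1",
    "|move|p1a: Pikachu|Thunderbolt|p2a: Gyarados",
    "|turn|2",
    "|win|Alice",
])
summary = summarize_log(example_log)
```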
## Usage
```python
from datasets import load_dataset

# Load the full dataset (downloads every replay)
dataset = load_dataset("milkkarten/pokemon-showdown-replays-merged")

# Or stream it for memory efficiency
streamed = load_dataset("milkkarten/pokemon-showdown-replays-merged", streaming=True)

# Filter the streamed dataset by format
gen9ou = streamed.filter(lambda x: "gen9ou" in x["format"].lower())
```
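
Because `rating` is only present when available, a filter on it needs to tolerate missing values. The sketch below works on any iterable of replay mappings, so it could be fed one split of the streamed dataset; it assumes a missing rating is stored as `None` or as an absent key, which the card does not specify. `high_rated` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, List, Mapping


def high_rated(replays: Iterable[Mapping], min_rating: float, limit: int) -> List[Mapping]:
    """Collect up to `limit` replays whose rating is known and >= min_rating."""
    matches = (
        replay
        for replay in replays
        if replay.get("rating") is not None and replay["rating"] >= min_rating
    )
    # islice stops consuming the source once `limit` matches are found,
    # which matters when the source is a stream.
    return list(islice(matches, limit))


sample = [
    {"id": "a", "rating": 1500},
    {"id": "b", "rating": None},
    {"id": "c", "rating": 1720},
    {"id": "d"},
    {"id": "e", "rating": 1810},
]
top = high_rated(sample, min_rating=1700, limit=1)
```

Stopping early avoids iterating over all 29 million replays when only a small sample is needed.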
## License
Data collected from Pokemon Showdown public replays.
Provider:
milkkarten



