marco-costa-ml/balatro-imitation-learning

Name: marco-costa-ml/balatro-imitation-learning
Creator: marco-costa-ml
Published: 2026-04-29 18:16:01
License: not specified

Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-03-29.

Download link:

https://hf-mirror.com/datasets/marco-costa-ml/balatro-imitation-learning

Description:

The Balatro Imitation Learning Dataset is a large-scale gameplay dataset for imitation learning, state modeling, and downstream vision research on the game Balatro. This release covers 494.38 hours of gameplay video, processed into structured detections, OCR, and game-state representations derived from publicly available gameplay footage. It is intended for imitation learning, state reconstruction, and game understanding tasks. The dataset is organized into two components, clean and derived. The clean component contains cleaned object detections and OCR outputs. The derived component contains temporally consistent object tracks, stabilized OCR time series, and parent-child composed object state with attached attributes. The data was generated from publicly available gameplay streams using a custom computer vision pipeline and fine-tuned vision models, which are released separately. The released dataset does not include raw video or audio.
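
To give a feel for what a "stabilized OCR time series" could mean in practice, here is a minimal sketch that smooths noisy per-frame OCR readings with a sliding-window majority vote. This is an illustration under assumptions, not the dataset author's pipeline: the function name `stabilize_ocr`, the `window` parameter, and the sample readings are all hypothetical, and the listing does not say which stabilization method was actually used.

```python
from collections import Counter


def stabilize_ocr(readings, window=5):
    """Smooth a sequence of per-frame OCR strings (hypothetical helper).

    Each reading is replaced by the most frequent string inside a window
    centered on that frame, so isolated misreads are voted out.
    """
    half = window // 2
    stable = []
    for i in range(len(readings)):
        # Clamp the window at both ends of the sequence.
        lo = max(0, i - half)
        hi = min(len(readings), i + half + 1)
        counts = Counter(readings[lo:hi])
        stable.append(counts.most_common(1)[0][0])
    return stable


# Made-up readings of a score field: one frame misreads "300" as "3O0".
raw = ["300", "300", "3O0", "300", "300", "450", "450", "450"]
smoothed = stabilize_ocr(raw, window=3)
print(smoothed)
```

A majority vote keeps genuine value changes (the jump from 300 to 450 here) while dropping single-frame glitches. A wider window removes longer glitches but delays or blurs real transitions, so the right size depends on frame rate and on how fast the on-screen text changes.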

Provider:

marco-costa-ml
