AntimLabs/FlappyBird-SFT

Hugging Face · Updated: 2025-12-03 · Indexed: 2026-01-03
Download link:
https://hf-mirror.com/datasets/AntimLabs/FlappyBird-SFT
Description:
---
license: mit
task_categories:
- reinforcement-learning
- text-generation
tags:
- flappy-bird
- reinforcement-learning
- game-ai
- sft
---

# FlappyBird SFT Dataset

This dataset contains supervised fine-tuning (SFT) data for a Flappy Bird game control policy.

## Dataset Details

- **Source**: Evaluation results from `AntimLabs/Qwen2.5-1.5B-Instruct-FlappyBird-RL-71`
- **Format**: Standard SFT format with `prompt` and `completion` columns
- **Examples**: 710 game episodes
- **Total conversation turns**: ~147k

## Data Format

Each example has two columns:

- **prompt**: List of messages containing the system prompt and the initial user observation
- **completion**: List of messages containing the rest of the game conversation (alternating assistant actions and user observations)

```python
{
    "prompt": [
        {"role": "system", "content": "You are the Flappy Bird control policy..."},
        {"role": "user", "content": "<FLAPPY id=1>..."}
    ],
    "completion": [
        {"role": "assistant", "content": "<ACTIONS>[]</ACTIONS>"},
        {"role": "user", "content": "<FLAPPY id=2>..."},
        {"role": "assistant", "content": "<ACTIONS>[TAP]</ACTIONS>"},
        ...
    ]
}
```

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("AntimLabs/FlappyBird-SFT", split="train")
```
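As a rough illustration of working with the `prompt` and `completion` columns described above, the sketch below rebuilds one episode's full message list and tallies the assistant's actions. It runs on a hand-written placeholder episode rather than real rows. The helper names (`full_conversation`, `parse_actions`, `action_counts`), the `NOOP` label for an empty action list, and the assumption that multiple actions would be comma-separated are all illustrative; the card itself only shows `[]` and `[TAP]`.

```python
import re
from collections import Counter

# Hypothetical miniature episode in the documented prompt/completion layout.
# The message contents are placeholders, not real rows from the dataset.
example = {
    "prompt": [
        {"role": "system", "content": "You are the Flappy Bird control policy..."},
        {"role": "user", "content": "<FLAPPY id=1>..."},
    ],
    "completion": [
        {"role": "assistant", "content": "<ACTIONS>[]</ACTIONS>"},
        {"role": "user", "content": "<FLAPPY id=2>..."},
        {"role": "assistant", "content": "<ACTIONS>[TAP]</ACTIONS>"},
    ],
}

# Matches the <ACTIONS>[...]</ACTIONS> wrapper shown in the card's example.
ACTIONS_RE = re.compile(r"<ACTIONS>\[(.*?)\]</ACTIONS>", re.DOTALL)


def full_conversation(row):
    """Concatenate prompt and completion into one chronological message list."""
    return list(row["prompt"]) + list(row["completion"])


def parse_actions(content):
    """Return the action tokens inside the ACTIONS tag, or None if it is absent.

    Assumption: multiple actions, if they occur, are comma-separated.
    """
    match = ACTIONS_RE.search(content)
    if match is None:
        return None
    return [tok.strip() for tok in match.group(1).split(",") if tok.strip()]


def action_counts(row):
    """Tally action tokens over assistant turns; an empty list counts as 'NOOP'."""
    counts = Counter()
    for msg in row["completion"]:
        if msg["role"] != "assistant":
            continue
        actions = parse_actions(msg["content"])
        if actions is None:
            continue  # assistant turn without a recognizable ACTIONS tag
        if not actions:
            counts["NOOP"] += 1  # 'NOOP' is an illustrative label, not from the card
        counts.update(actions)
    return counts


print(len(full_conversation(example)), dict(action_counts(example)))
```

If the real rows follow the layout shown in the card, the same helpers should apply to a loaded row, for example `action_counts(dataset[0])`, though that has not been checked against the actual data.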

Provider:
AntimLabs