CounterStrike_Deathmatch

ModelScope community, updated 2025-12-13, indexed 2025-08-23
Download link:
https://modelscope.cn/datasets/Virgo-Internal/CounterStrike_Deathmatch
Description:
This dataset contains video, action labels, and metadata from the popular video game CS:GO. Past use cases include imitation learning, behavioral cloning, world modeling, and video generation.

The paper presenting the dataset:

__Counter-Strike Deathmatch with Large-Scale Behavioural Cloning__

[Tim Pearce](https://teapearce.github.io/), [Jun Zhu](https://ml.cs.tsinghua.edu.cn/~jun/index.shtml)

IEEE Conference on Games (CoG) 2022 [⭐️ Best Paper Award!]

- ArXiv paper: https://arxiv.org/abs/2104.04258 (contains some extra experiments not in the CoG version)
- CoG paper: https://ieee-cog.org/2022/assets/papers/paper_45.pdf
- Four minute introduction video: https://youtu.be/rnz3lmfSHv0
- Gameplay examples: https://youtu.be/KTY7UhjIMm4
- Code: https://github.com/TeaPearce/Counter-Strike_Behavioural_Cloning

The dataset comprises several different subsets of data as described below. You probably only care about the first one (if you want the largest dataset), or the second or third one (if you care about clean expert data).

- `hdf5_dm_july2021_*_to_*.tar`
    - each .tar file contains 200 .hdf5 files
    - total files when unzipped: 5500
    - approx size: 700 GB
    - map: dust2
    - gamemode: deathmatch
    - source: scraped from online servers
- `dataset_dm_expert_dust2/hdf5_dm_july2021_expert_*.hdf5`
    - total files when unzipped: 190
    - approx size: 24 GB
    - map: dust2
    - gamemode: deathmatch
    - source: manually created, clean actions
- `dataset_aim_expert/hdf5_aim_july2021_expert_*.hdf5`
    - total files when unzipped: 45
    - approx size: 6 GB
    - map: aim map
    - gamemode: aim mode
    - source: manually created, clean actions
- `dataset_dm_expert_othermaps/hdf5_dm_nuke_expert_*.hdf5`
    - total files when unzipped: 10
    - approx size: 1 GB
    - map: nuke
    - gamemode: deathmatch
    - source: manually created, clean actions
- `dataset_dm_expert_othermaps/hdf5_dm_mirage_expert_*.hdf5`
    - total files when unzipped: 10
    - approx size: 1 GB
    - map: mirage
    - gamemode: deathmatch
    - source: manually created, clean actions
- `dataset_dm_expert_othermaps/hdf5_dm_inferno_expert_*.hdf5`
    - total files when unzipped: 10
    - approx size: 1 GB
    - map: inferno
    - gamemode: deathmatch
    - source: manually created, clean actions
- `dataset_metadata/currvarsv2_dm_july2021_*_to_*.npy`, `currvarsv2_dm_july2021_expert_*_to_*.npy`, `currvarsv2_dm_mirage_expert_1_to_100.npy`, `currvarsv2_dm_inferno_expert_1_to_100.npy`, `currvarsv2_dm_nuke_expert_1_to_100.npy`, `currvarsv2_aim_july2021_expert_1_to_100.npy`
    - total files when unzipped: 55 + 2 + 1 + 1 + 1 + 1 = 61
    - approx size: 6 GB
    - map: as per filename
    - gamemode: as per filename
    - source: as per filename
- `location_trackings_backup/`
    - total files when unzipped: 305
    - approx size: 0.5 GB
    - map: dust2
    - gamemode: deathmatch
    - source: contains metadata used to compute map coverage analysis
        - **currvarsv2_agentj22** is the agent trained over the full online dataset
        - **currvarsv2_agentj22_dmexpert20** is the previous model finetuned on the clean expert dust2 dataset
        - **currvarsv2_bot_capture** is the medium-difficulty built-in bot

### Structure of .hdf5 files (image and action labels -- you probably care about this one):

Each file contains an ordered sequence of 1000 frames (~1 minute) of play, made up of screenshots and processed action labels. We chose the .hdf5 format for fast dataloading, since a subset of frames can be accessed without loading the full file. The lookup keys are as follows (where i is the frame number, 0-999):

- **frame_i_x**: the image
- **frame_i_xaux**: actions applied in previous timesteps, as well as health, ammo, and team. See dm_pretrain_preprocess.py for details. Note that this was not used in our final version of the agent
- **frame_i_y**: target actions in flattened vector form; [keys_pressed_onehot, Lclicks_onehot, Rclicks_onehot, mouse_x_onehot, mouse_y_onehot]
- **frame_i_helperarr**: in format [kill_flag, death_flag], each a binary variable, e.g. [1,0] means the player scored a kill and did not die in that timestep

### Structure of .npy files (scraped metadata -- you probably don't care about this):

Each .npy file contains metadata corresponding to 100 .hdf5 files (as indicated by the file name). Each is a dictionary with keys of the format file_numi_frame_j, for file number i and frame number j in 0-999. The values are of the format **[curr_vars, infer_a, frame_i_helperarr]**, where:

- **curr_vars**: a dictionary of the metadata originally scraped -- see dm_record_data.py for details
- **infer_a**: inferred actions, [keys_pressed,mouse_x,mouse_y,press_mouse_l,press_mouse_r], with mouse_x and mouse_y being continuous values and keys_pressed in string format
- **frame_i_helperarr**: a repeat of the same field in the .hdf5 file

## Trained Models

Four trained models are provided. There are 'non-stateful' (use during training) and 'stateful' (use at test time) versions of each. Models can be downloaded under `trained_models.zip`.

- `ak47_sub_55k_drop_d4` : Pretrained on AK47 sequences only.
- `ak47_sub_55k_drop_d4_dmexpert_28` : Finetuned on expert deathmatch data.
- `ak47_sub_55k_drop_d4_aimexpertv2_60` : Finetuned on aim mode expert data.
- `July_remoterun7_g9_4k_n32_recipe_ton96__e14` : Pretrained on full dataset.

## Other works using the dataset:

- __Imitating Human Behaviour with Diffusion Models, ICLR 2023__
  https://arxiv.org/abs/2301.10677
  Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin
- __Diffusion for World Modeling: Visual Details Matter in Atari, NeurIPS 2024__
  https://arxiv.org/pdf/2405.12399
  Eloi Alonso∗, Adam Jelley∗, Vincent Micheli, Anssi Kanervisto, Amos Storkey, Tim Pearce‡, François Fleuret‡
  Tweet here: https://twitter.com/EloiAlonso1/status/1844803606064611771
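
As an illustration of the .hdf5 lookup keys described in the structure section, here is a minimal sketch of reading one frame. It assumes the four keys are top-level datasets in each file and that `h5py` is installed; the helper names `frame_keys` and `load_frame` are hypothetical and are not taken from the dataset's repository.

```python
from typing import Dict

FRAMES_PER_FILE = 1000  # each .hdf5 file holds frames 0-999, per the description


def frame_keys(i: int) -> Dict[str, str]:
    """Return the four lookup keys for frame i of one .hdf5 file."""
    if not 0 <= i < FRAMES_PER_FILE:
        raise ValueError(f"frame index must be in 0-{FRAMES_PER_FILE - 1}, got {i}")
    return {
        "image": f"frame_{i}_x",
        "aux": f"frame_{i}_xaux",
        "target": f"frame_{i}_y",
        "helper": f"frame_{i}_helperarr",
    }


def load_frame(path: str, i: int) -> dict:
    """Read the arrays for a single frame, leaving the other frames on disk.

    Assumption: each key is a top-level dataset in the file.
    """
    import h5py  # third-party; imported lazily so frame_keys works without it

    keys = frame_keys(i)
    with h5py.File(path, "r") as f:
        # Indexing a dataset with [()] reads that dataset fully into memory.
        return {name: f[key][()] for name, key in keys.items()}
```

A call such as `load_frame(path, 42)["helper"]` would then return the `[kill_flag, death_flag]` pair for frame 42, provided the layout assumption holds for the file at `path`.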
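
The `frame_i_y` target is described as a flattened concatenation of five segments. The sketch below shows one way to split such a vector back into named parts. The segment order follows the description, but the segment sizes are not stated in this document, so they are passed in as a parameter, and the sizes used in the demo are made-up placeholders rather than the dataset's real dimensions.

```python
from typing import Dict, List, Sequence

# Segment order as listed for frame_i_y in the dataset description.
SEGMENT_NAMES = ["keys_pressed", "Lclicks", "Rclicks", "mouse_x", "mouse_y"]


def split_action_vector(y: Sequence[float], sizes: Sequence[int]) -> Dict[str, List[float]]:
    """Slice a flattened action vector into its five named segments.

    `sizes` gives the length of each segment in SEGMENT_NAMES order; the real
    values must be looked up in the dataset's preprocessing code.
    """
    if len(sizes) != len(SEGMENT_NAMES):
        raise ValueError("expected one size per segment")
    if sum(sizes) != len(y):
        raise ValueError("segment sizes do not add up to the vector length")
    parts: Dict[str, List[float]] = {}
    start = 0
    for name, size in zip(SEGMENT_NAMES, sizes):
        parts[name] = list(y[start:start + size])
        start += size
    return parts


def argmax(segment: Sequence[float]) -> int:
    """Index of the largest entry, i.e. the active bin of a one-hot segment."""
    return max(range(len(segment)), key=lambda k: segment[k])


# Demo with placeholder sizes (3, 2, 2, 4, 4); these are NOT the real sizes.
toy_y = [0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
toy_parts = split_action_vector(toy_y, sizes=(3, 2, 2, 4, 4))
toy_mouse_x_bin = argmax(toy_parts["mouse_x"])
```

Keeping the sizes as an argument avoids hard-coding values that this description does not specify.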
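
The .npy metadata dictionaries use keys of the form `file_numi_frame_j`. The sketch below builds and parses such keys, and loads a metadata file under the assumption that the dictionary was saved with `numpy.save` as a pickled object array. That storage detail is an assumption, not something confirmed here. The helper names are hypothetical.

```python
import re
from typing import Tuple

_KEY_RE = re.compile(r"^file_num(\d+)_frame_(\d+)$")


def metadata_key(file_num: int, frame: int) -> str:
    """Build the dictionary key for a given file number and frame number."""
    return f"file_num{file_num}_frame_{frame}"


def parse_metadata_key(key: str) -> Tuple[int, int]:
    """Recover (file_num, frame) from a key such as 'file_num12_frame_345'."""
    match = _KEY_RE.match(key)
    if match is None:
        raise ValueError(f"not a metadata key: {key!r}")
    return int(match.group(1)), int(match.group(2))


def load_metadata(path: str) -> dict:
    """Load one metadata .npy file as a plain dict.

    Assumption: the dict was written with numpy.save, which stores it as a
    0-d object array, so .item() unwraps it. allow_pickle=True executes pickle
    data, so only use it on files from a trusted source.
    """
    import numpy as np  # third-party; imported lazily

    return np.load(path, allow_pickle=True).item()
```

With a loaded dictionary `meta`, `meta[metadata_key(12, 345)]` would give the `[curr_vars, infer_a, frame_i_helperarr]` entry for frame 345 of file 12.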

Provider:
maas
Created:
2025-08-17