Robust Sequential Decision-Making in Adversarial Environments: Datasets and results

Name: Robust Sequential Decision-Making in Adversarial Environments: Datasets and results
Creator: figshare
Published: 2025-12-04 16:50:10
License: No description available

DataCite Commons. Updated 2025-12-04; indexed 2026-04-25.

Download link:

https://figshare.com/articles/dataset/Robust_Sequential_Decision-Making_in_Adversarial_Environments_Datasets_and_results/30656774/1

Resource description:

This repository contains the experimental datasets, results, and configuration files supporting the research article "Robust Sequential Decision-Making in Adversarial Environments". The study addresses reinforcement learning in non-stationary, adversarial environments where standard Markov Decision Process (MDP) assumptions are violated, and introduces a model-based framework for Threatened Markov Decision Processes (TMDPs) that uses Bayesian belief updates to compute robust policies. The repository holds the empirical data from 1,000 independent trials of 40,000 games each, recording the cumulative reward per episode and the terminal outcome ratios (win/loss/draw). These resources are provided to facilitate reproduction of the primary findings.

The source code used to generate this data is available at link.

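As a rough illustration of the two ideas the description mentions, the sketch below shows a generic Bayesian belief update over an opponent's actions (a Dirichlet-categorical count update) and a tally of win/loss/draw ratios. It is not taken from the article or its source code. The function names, the action labels `"left"` and `"right"`, the outcome labels, and the uniform prior are all hypothetical, and the dataset's actual file layout is not described in this record.

```python
from collections import Counter


def update_belief(counts, observed_action):
    """Return new Dirichlet pseudo-counts after observing one opponent action.

    `counts` maps each action label to a positive pseudo-count (assumed prior).
    """
    new_counts = dict(counts)
    new_counts[observed_action] = new_counts.get(observed_action, 0.0) + 1.0
    return new_counts


def posterior_mean(counts):
    """Posterior mean probability of each action under the Dirichlet belief."""
    total = sum(counts.values())
    return {action: c / total for action, c in counts.items()}


def outcome_ratios(outcomes):
    """Fraction of games ending in each terminal outcome (labels are assumed)."""
    tally = Counter(outcomes)
    n = len(outcomes)
    return {label: tally.get(label, 0) / n for label in ("win", "loss", "draw")}


# Hypothetical example: uniform prior over two opponent actions,
# then four observed opponent moves.
belief = {"left": 1.0, "right": 1.0}
for action in ["left", "left", "right", "left"]:
    belief = update_belief(belief, action)

probs = posterior_mean(belief)        # left: 4/6, right: 2/6
ratios = outcome_ratios(["win", "win", "draw", "loss"])
print(probs)
print(ratios)
```

A count-based update is used here only because it is the simplest conjugate example; the article's TMDP framework may model the adversary differently.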

Provider:

figshare

Created:

2025-12-04
