aditijc/snooker-testbed-legacy-ppo-v1
Source: Hugging Face | Updated: 2026-04-23 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/aditijc/snooker-testbed-legacy-ppo-v1
Resource description:
This is a legacy dataset from a snooker PPO (Proximal Policy Optimization) training run. It contains 321 rows and 8 columns of evaluation metrics recorded during training, including mean score, max score, mean shots, mean efficiency, and mean foul rate. The run took place between April 15 and 17, 2026, using the stable-baselines3 PPO implementation, but the policy failed to learn to play effectively: mean scores stayed extremely low and foul rates extremely high. The dataset is preserved for ablation comparison against the forthcoming Phase-2 refactor.
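As a rough illustration of how the description above could be sanity-checked, the sketch below parses a CSV export of the table, confirms the stated shape (321 rows, 8 columns), and averages one metric. It assumes the data has been exported to CSV text; the helper names and the column name `mean_score` are hypothetical, since the listing does not give the actual column names or file format.

```python
import csv
import io
from statistics import mean


def read_rows(csv_text):
    """Parse CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def check_shape(rows, expected_rows=321, expected_cols=8):
    """Return True if the table matches the row/column counts in the description."""
    if len(rows) != expected_rows:
        return False
    return all(len(row) == expected_cols for row in rows)


def column_mean(rows, column):
    """Average one numeric column, skipping empty cells.

    The column name is supplied by the caller; this listing does not
    document the real column names, so any name used here is an assumption.
    """
    values = [float(row[column]) for row in rows if row[column] not in ("", None)]
    return mean(values) if values else None
```

With a local export in hand, something like `rows = read_rows(open(path).read())` followed by `check_shape(rows)` would confirm the 321 x 8 shape before comparing metrics against a later run. If the `datasets` library is installed, `load_dataset("aditijc/snooker-testbed-legacy-ppo-v1")` is the usual way to fetch a Hub dataset, though the split and column names are not stated here.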
Provider:
aditijc



