Beat The Bookie: Odds Series Football Dataset. 500,000+ matches, with 11 years of odds data from 32 bookmakers across 1,005 leagues
Payititi · Indexed 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-11169.html
Resource description:
The online sports gambling industry employs teams of data analysts to build forecast models that turn the odds at sports games in its favour. Several betting strategies have been proposed to beat bookmakers, from expert prediction models and arbitrage strategies to odds bias exploitation, but their returns have been inconsistent, and it remains to be shown that a betting strategy can outperform the online sports betting market. We designed a strategy to beat football bookmakers with their own numbers, described in "Beating the bookies with their own numbers - and how the online sports betting market is rigged", by Lisandro Kaunitz, Shenjun Zhong and Javier Kreiner. Here, we make the full dataset publicly available to the Kaggle community. We also provide on GitHub the code, the raw SQL database and the online real-time dashboard that were used for our study. Our strategy proved profitable in a 10-year historical simulation using closing odds, a 6-month historical simulation using minute-to-minute odds, and a 5-month period during which we staked real money with the bookmakers. We would like to challenge the Kaggle community to improve our results on the two parts of the dataset: the 10-year historical closing odds and the 14-month time series odds. The dataset was assembled over months of scraping online sports portals. We hope you enjoy your sports betting simulations (but remember… the house always wins in the end).
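As a rough illustration of what a "historical simulation using closing odds" involves, the sketch below settles a list of flat-stake bets and reports profit and return on investment. It is not the authors' strategy or code: the function name `flat_stake_backtest`, the `(decimal_odds, won)` record layout, and the sample values are all hypothetical, and decimal odds are assumed.

```python
def flat_stake_backtest(bets, stake=1.0):
    """Settle flat-stake bets given as (decimal_odds, won) pairs.

    Hypothetical record layout; decimal odds are an assumption.
    Returns (total_profit, roi), where roi = profit / total amount staked.
    """
    profit = 0.0
    count = 0
    for odds, won in bets:
        # A winning bet returns stake * odds, so net gain is stake * (odds - 1).
        # A losing bet loses the stake.
        profit += stake * (odds - 1.0) if won else -stake
        count += 1
    staked = count * stake
    roi = profit / staked if staked else 0.0
    return profit, roi


# Made-up sample bets, for illustration only.
sample = [(2.0, True), (3.0, False), (1.5, True), (4.0, False)]
profit, roi = flat_stake_backtest(sample)
print(f"profit={profit:.2f} roi={roi:.3f}")
```

In a real simulation the bet list would come from a selection rule applied to the closing-odds records; the selection rule is deliberately left out here because the description above does not specify it.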
Provided by:
Payititi
Collected summary
Dataset introduction

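Background and challenges
Background overview
This is a comprehensive dataset focused on football match odds. It contains more than 500,000 matches, covers 1,005 leagues worldwide and 32 bookmakers, and spans 11 years. The dataset has two parts: the first is 10 years of historical closing odds (2005-2015), covering about 479,000 matches; the second is 14 months of time series odds (2015-2016), recording hourly odds changes for about 92,000 matches. The dataset is intended to support research and development of sports betting prediction models, and is suited to applications such as analyzing odds trends and building betting strategies.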
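A common first step when analyzing odds of this kind is to convert them into implied probabilities and measure the bookmaker's margin (the overround). The sketch below shows that arithmetic under stated assumptions: odds are decimal, each match has home/draw/away prices, and the function name and sample numbers are hypothetical rather than taken from the dataset.

```python
def implied_probabilities(odds_home, odds_draw, odds_away):
    """Convert decimal home/draw/away odds into normalized probabilities.

    Assumes decimal odds (hypothetical inputs, not dataset column names).
    Returns (fair_probs, overround), where overround is how far the raw
    inverse odds sum above 1, i.e. the bookmaker's built-in margin.
    """
    raw = [1.0 / odds_home, 1.0 / odds_draw, 1.0 / odds_away]
    total = sum(raw)
    overround = total - 1.0
    # Proportional normalization: one simple way to remove the margin.
    fair = [p / total for p in raw]
    return fair, overround


# Made-up prices for a single match, for illustration only.
fair, overround = implied_probabilities(2.10, 3.40, 3.60)
print([round(p, 3) for p in fair], round(overround, 3))
```

Proportional normalization is only one of several ways to strip the margin; it is used here because it is the simplest to read, not because the dataset or its authors prescribe it.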
The above content was collected and summarized by 遇见数据集.



