
European Soccer Database

Source: www.kaggle.com | Updated: 2016-10-23 | Indexed: 2025-01-08
Download link:
https://www.kaggle.com/hugomathien/soccer
Resource description:
The ultimate Soccer database for data analysis and machine learning
-------------------------------------------------------------------

**What you get:**

- +25,000 matches
- +10,000 players
- 11 European Countries with their lead championship
- Seasons 2008 to 2016
- Players and Teams' attributes* sourced from EA Sports' FIFA video game series, including the weekly updates
- Team line up with squad formation (X, Y coordinates)
- Betting odds from up to 10 providers
- Detailed match events (goal types, possession, corner, cross, fouls, cards etc...) for +10,000 matches

**16th Oct 2016: New table containing teams' attributes from FIFA!**

----------

**Original Data Source:**

Data about soccer matches is easy to find, but it is usually scattered across different websites. Thorough data collection and processing have been done to make your life easier. **I must insist that you do not make any commercial use of the data**. The data was sourced from:

- [http://football-data.mx-api.enetscores.com/][1]: scores, lineup, team formation and events
- [http://www.football-data.co.uk/][2]: betting odds. [Click here to understand the column naming system for betting odds][3]
- [http://sofifa.com/][4]: players and teams attributes from EA Sports FIFA games. *FIFA series and all FIFA assets property of EA Sports.*

> When you have a look at the database, you will notice foreign keys for
> players and matches are the same as the original data sources. I have
> called those foreign keys "api_id".

----------

**Improving the dataset:**

You will notice that some players are missing from the lineup (NULL values). This is because I have not been able to source their attributes from FIFA. This will be fixed over time as the crawling algorithm is improved. The dataset will also be expanded to include international games, national cups, Champions League and Europa League. Please ask me if you're after a specific tournament.
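The missing lineup entries and the "api_id" keys described above can be explored with a short query sketch. The example below is only an illustration: it builds a tiny in-memory SQLite database, and the table and column names (`Match`, `Player`, `match_api_id`, `player_api_id`, `home_player_1`, `home_player_2`) are assumptions modeled on the "api_id" naming, not a confirmed description of the real schema. Check the actual tables before adapting it.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Player (player_api_id INTEGER PRIMARY KEY, player_name TEXT);
        CREATE TABLE Match (
            match_api_id INTEGER PRIMARY KEY,
            home_player_1 INTEGER,  -- assumed lineup slot columns
            home_player_2 INTEGER
        );
        INSERT INTO Player VALUES (10, 'Player A'), (11, 'Player B');
        INSERT INTO Match VALUES (1, 10, 11), (2, 10, NULL), (3, NULL, NULL);
        """
    )
    return conn


def count_missing_slots(conn, slot_columns):
    """Return {column: number of matches where that lineup slot is NULL}."""
    missing = {}
    for col in slot_columns:
        # Column names come from a fixed list in this sketch, never from user input.
        query = f"SELECT COUNT(*) FROM Match WHERE {col} IS NULL"
        (n,) = conn.execute(query).fetchone()
        missing[col] = n
    return missing


def lineup_names(conn, match_api_id):
    """Resolve lineup slots to names via the api_id key; NULL slots stay None."""
    row = conn.execute(
        """
        SELECT p1.player_name, p2.player_name
        FROM Match m
        LEFT JOIN Player p1 ON p1.player_api_id = m.home_player_1
        LEFT JOIN Player p2 ON p2.player_api_id = m.home_player_2
        WHERE m.match_api_id = ?
        """,
        (match_api_id,),
    ).fetchone()
    return list(row)


conn = build_toy_db()
missing = count_missing_slots(conn, ["home_player_1", "home_player_2"])
print(missing)
print(lineup_names(conn, 2))
```

A `LEFT JOIN` is used so that matches with NULL lineup slots are kept rather than silently dropped, which matters when measuring how much of the lineup data is incomplete.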
> Please get in touch with me if you want to help improve this dataset.

[CLICK HERE TO ACCESS THE PROJECT GITHUB][5]

*Important note for people interested in using the crawlers:* since I first wrote the crawling scripts (in Python), it appears sofifa.com has changed its design, and with that come new requirements for the scripts. The existing script to crawl players ('Player Spider') will not work until I've updated it.

----------

**Exploring the data:**

Now that's the fun part: there is a lot you can do with this dataset. I will be adding visuals and insights to this overview page, but please have a look at the kernels and give it a try yourself! Here are some ideas for you:

**The Holy Grail...**

... is obviously to predict the outcome of the game. The bookies use 3 classes (Home Win, Draw, Away Win). They get it right about 53% of the time. This is also what I've achieved so far using my own SVM. Though it may sound high for such a random sport, you've got to know that the home team wins about 46% of the time. So the base case (constantly predicting Home Win) already has about 46% accuracy.

**Probabilities vs Odds**

When running a multi-class classifier like an SVM, you could also output a probability estimate and compare it to the betting odds. Look at how your estimates differ from the odds and see for which games your predictions were very different.

**Explore and visualize features**

With access to players and teams attributes, team formations and in-game events, you should be able to produce some interesting insights into [The Beautiful Game][6]. Who knows, Guardiola himself may hire one of you some day!

[1]: http://football-data.mx-api.enetscores.com/
[2]: http://www.football-data.co.uk/
[3]: http://www.football-data.co.uk/notes.txt
[4]: http://sofifa.com/
[5]: https://github.com/hugomathien/football-data-collection/tree/master/footballData
[6]: https://en.wikipedia.org/wiki/The_Beautiful_Game
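The two ideas above, the always-Home-Win base case and the comparison of model probabilities with betting odds, can be sketched in a few lines. This is a minimal illustration on made-up numbers, not the author's method. It assumes the odds are in decimal format and removes the bookmaker margin by simple proportional normalization, which is only one of several possible approaches. All function names and sample values are hypothetical.

```python
def outcome(home_goals, away_goals):
    """Map a final score to one of the 3 classes: 'H', 'D' or 'A'."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def baseline_accuracy(scores):
    """Accuracy of the base case that always predicts a home win."""
    labels = [outcome(h, a) for h, a in scores]
    return labels.count("H") / len(labels)


def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds (assumed format) to probabilities summing to 1."""
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)  # exceeds 1 when the bookmaker margin is included
    return [r / total for r in raw]


def largest_disagreement(model_probs, odds):
    """Largest absolute gap between model and bookmaker probabilities."""
    book = implied_probabilities(*odds)
    return max(abs(m - b) for m, b in zip(model_probs, book))


# Made-up sample data for illustration only.
scores = [(2, 1), (0, 0), (1, 3), (1, 0)]
print(baseline_accuracy(scores))

model_probs = [0.40, 0.30, 0.30]  # hypothetical classifier output (H, D, A)
odds = (1.8, 3.6, 4.5)            # hypothetical decimal odds (H, D, A)
print(implied_probabilities(*odds))
print(largest_disagreement(model_probs, odds))
```

Sorting matches by `largest_disagreement` is one simple way to surface the games where a model and the bookmakers diverge most.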

Provider:
www.kaggle.com
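Collected summary
Background and challenges
Background overview
European Soccer Database is a comprehensive soccer dataset covering more than 25,000 matches and more than 10,000 players across the top leagues of 11 European countries (2008-2016). It combines match data, betting odds, and player and team attributes (from the FIFA video games), and provides detailed match events and lineup information, making it suitable for data analysis and machine learning tasks.
The above content was collected and summarized by 遇见数据集.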