UFC-Fight historical data from 1993 to 2021
Source: www.kaggle.com. Updated: 2021-03-21. Indexed: 2025-01-21.
Download link:
https://www.kaggle.com/rajeevw/ufcdata
Description:
### UPDATE
This dataset got a lot of love from the community and I saw many people asking for an updated version, so I have uploaded the latest scraped and processed data (as of 21/03/2021). It is now easy for anyone to get the latest dataset with a single command, so if you need bleeding-edge data, or you want to see the code, you can look [here](https://github.com/WarrierRajeev/UFC-Predictions). Hope this solves all problems!
If there are any issues with the data, please forgive me and write about it in the comments or raise an issue on GitHub. I will pick it up 👍
Thank you everyone for the emails and messages. As usual, have fun! ❤️ 😁
### Context
This is a list of every UFC fight in the history of the organisation. Every row contains information about both fighters, the fight details and the winner. The data was scraped from the ufcstats website, which came into the picture after fightmetric ceased to exist. I saw that the website held a lot of information about every fight and every event, and that there was no existing way of capturing all of it. I used BeautifulSoup to scrape the data and pandas to process it. It was a long and arduous process, so please forgive any mistakes. I have provided the raw files in case anybody wants to process them differently. This is my first time creating a dataset, so any suggestions and corrections are welcome! In case anyone wants to check out the work, I have uploaded all the code files, including the scraping module, [here](https://github.com/WarrierRajeev/UFC-Predictions).
Have fun!
### Content
Each row is a compilation of both fighters' stats. Fighters are represented by 'red' and 'blue' (for red and blue corner). For instance, the red fighter has the compiled average stats of all of that fighter's fights except the current one. The stats include damage done by the red fighter to the opponent and damage done by the opponent to the fighter (represented by 'opp' in the columns) across all the fights this red fighter has had, excluding this one, as it has not occurred yet (in the data). The same information exists for the blue fighter. The target variable is 'Winner', which is the only column that tells you what happened.
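As a rough illustration of the row layout described above, the sketch below splits one row into red-corner stats, blue-corner stats, fight-level fields and the target. The function name `split_fight_row`, the sample column names, and every value in `sample_row` (including `"Red"` as a `Winner` value) are assumptions made up for this example and are not taken from the dataset files.

```python
def split_fight_row(row):
    """Split one fight row (a dict) into red, blue, fight-level fields, and the target."""
    # Columns prefixed with R_ belong to the red corner; strip the prefix.
    red = {name[2:]: value for name, value in row.items() if name.startswith("R_")}
    # Columns prefixed with B_ belong to the blue corner; strip the prefix.
    blue = {name[2:]: value for name, value in row.items() if name.startswith("B_")}
    # Everything else describes the fight itself.
    fight = {name: value for name, value in row.items()
             if not name.startswith(("R_", "B_"))}
    # 'Winner' is the target variable, so keep it apart from the features.
    winner = fight.pop("Winner")
    return red, blue, fight, winner


# Hypothetical row: names and values are illustrative only.
sample_row = {
    "R_fighter": "Fighter A", "B_fighter": "Fighter B",
    "R_KD": 0.5, "B_KD": 0.25,
    "R_opp_KD": 0.0, "B_opp_KD": 0.5,
    "title_bout": False, "Winner": "Red",
}

red, blue, fight, winner = split_fight_row(sample_row)
print(red, blue, fight, winner)
```

Keeping `Winner` out of the feature dictionaries mirrors the point that it is the only column describing the outcome; the same prefix-based split can be applied to DataFrame columns when working in pandas.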
Here are some column definitions:
### Column definitions:
- The `R_` and `B_` prefixes mark red and blue corner fighter stats respectively
- Columns containing `_opp_` hold the average damage done by the opponent to the fighter
- `KD` is the number of knockdowns
- `SIG_STR` is the number of significant strikes, as 'landed of attempted'
- `SIG_STR_pct` is the significant strike percentage
- `TOTAL_STR` is the total number of strikes, as 'landed of attempted'
- `TD` is the number of takedowns
- `TD_pct` is the takedown percentage
- `SUB_ATT` is the number of submission attempts
- `PASS` is the number of times the guard was passed?
- `REV` is the number of reversals landed
- `HEAD` is the number of significant strikes to the head, as 'landed of attempted'
- `BODY` is the number of significant strikes to the body, as 'landed of attempted'
- `CLINCH` is the number of significant strikes in the clinch, as 'landed of attempted'
- `GROUND` is the number of significant strikes on the ground, as 'landed of attempted'
- `win_by` is method of win
- `last_round` is the last round of the fight (e.g. if it was a KO in the 1st round, this will be 1)
- `last_round_time` is when the fight ended in the last round
- `Format` is the format of the fight (3 rounds, 5 rounds etc.)
- `Referee` is the name of the Ref
- `date` is the date of the fight
- `location` is the location in which the event took place
- `Fight_type` is which weight class and whether it's a title bout or not
- `Winner` is the winner of the fight
- `Stance` is the stance of the fighter (orthodox, southpaw, etc.)
- `Height_cms` is the height of the fighter in centimeters
- `Reach_cms` is the reach of the fighter (arm span) in centimeters
- `Weight_lbs` is the weight of the fighter in pounds (lbs)
- `age` is the age of the fighter
- `title_bout` is a Boolean value indicating whether it is a title fight or not
- `weight_class` is which weight class the fight is in (Bantamweight, heavyweight, Women's flyweight, etc.)
- `no_of_rounds` is the number of rounds the fight was scheduled for
- `current_lose_streak` is the fighter's current number of consecutive losses
- `current_win_streak` is the fighter's current number of consecutive wins
- `draw` is the number of draws in the fighter's UFC career
- `wins` is the number of wins in the fighter's UFC career
- `losses` is the number of losses in the fighter's UFC career
- `total_rounds_fought` is the average of total rounds fought by the fighter
- `total_time_fought(seconds)` is the count of total time spent fighting in seconds
- `total_title_bouts` is the total number of title bouts taken part in by the fighter
- `win_by_Decision_Majority` is the number of wins by majority decision in the fighter's UFC career
- `win_by_Decision_Split` is the number of wins by split decision in the fighter's UFC career
- `win_by_Decision_Unanimous` is the number of wins by unanimous decision in the fighter's UFC career
- `win_by_KO/TKO` is the number of wins by knockout or technical knockout in the fighter's UFC career
- `win_by_Submission` is the number of wins by submission in the fighter's UFC career
- `win_by_TKO_Doctor_Stoppage` is the number of wins by doctor stoppage in the fighter's UFC career
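
Two of the definitions above can be made concrete with a small sketch: reading a 'landed of attempted' value and counting a current streak. It assumes raw values are strings such as `"15 of 30"` and that a fighter's past results are available as an ordered list, oldest first. The helper names and the `"W"`/`"L"` codes are hypothetical, and this is not necessarily how the author's processing code works.

```python
def parse_landed_of_attempted(text):
    """Turn a string like '15 of 30' into the tuple (landed, attempted)."""
    landed, attempted = text.split(" of ")
    return int(landed), int(attempted)


def accuracy_pct(landed, attempted):
    """Percentage of attempts that landed; 0.0 when nothing was attempted."""
    if attempted == 0:
        return 0.0
    return 100.0 * landed / attempted


def current_streak(results, outcome):
    """Count how many of the most recent results equal `outcome` in a row."""
    streak = 0
    # Walk backwards from the latest fight until the run is broken.
    for result in reversed(results):
        if result != outcome:
            break
        streak += 1
    return streak


landed, attempted = parse_landed_of_attempted("15 of 30")
print(landed, attempted, accuracy_pct(landed, attempted))

history = ["W", "L", "W", "W"]  # hypothetical results, oldest first
print(current_streak(history, "W"), current_streak(history, "L"))
```

Under this reading, a fighter whose latest result is a win always has a lose streak of zero, and vice versa.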
### Acknowledgements
- Inspiration: https://github.com/Hitkul/UFC_Fight_Prediction
  Provided ideas on how to store per-fight data. Unfortunately, the entire UFC website and the fightmetric website changed, so I couldn't reuse any of the code.
- Print Progress Bar: https://gist.github.com/aubricus/f91fb55dc6ba5557fbab06119420dd6a
  Used to display download progress in the terminal.
### Info about me
You can check out who I am and what I do [here](https://rajeevwarrier.com/)
Provider:
Kaggle



