
jab13/mlb-statcast-dataset

Hugging Face | Updated 2025-11-29 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/jab13/mlb-statcast-dataset
Description:
---
dataset_info:
  features:
  - name: pitch_type
    dtype: string
  - name: game_date
    dtype: timestamp[ms]
  - name: release_speed
    dtype: float64
  - name: release_pos_x
    dtype: float64
  - name: release_pos_z
    dtype: float64
  - name: player_name
    dtype: string
  - name: batter
    dtype: int64
  - name: pitcher
    dtype: int64
  - name: events
    dtype: string
  - name: description
    dtype: string
  - name: spin_dir
    dtype: float64
  - name: spin_rate_deprecated
    dtype: float64
  - name: break_angle_deprecated
    dtype: float64
  - name: break_length_deprecated
    dtype: float64
  - name: zone
    dtype: int64
  - name: des
    dtype: string
  - name: game_type
    dtype: string
  - name: stand
    dtype: string
  - name: p_throws
    dtype: string
  - name: home_team
    dtype: string
  - name: away_team
    dtype: string
  - name: type
    dtype: string
  - name: hit_location
    dtype: float64
  - name: bb_type
    dtype: string
  - name: balls
    dtype: int64
  - name: strikes
    dtype: int64
  - name: game_year
    dtype: int64
  - name: pfx_x
    dtype: float64
  - name: pfx_z
    dtype: float64
  - name: plate_x
    dtype: float64
  - name: plate_z
    dtype: float64
  - name: on_3b
    dtype: float64
  - name: on_2b
    dtype: float64
  - name: on_1b
    dtype: float64
  - name: outs_when_up
    dtype: int64
  - name: inning
    dtype: int64
  - name: inning_topbot
    dtype: string
  - name: hc_x
    dtype: float64
  - name: hc_y
    dtype: float64
  - name: tfs_deprecated
    dtype: float64
  - name: tfs_zulu_deprecated
    dtype: float64
  - name: umpire
    dtype: float64
  - name: sv_id
    dtype: float64
  - name: vx0
    dtype: float64
  - name: vy0
    dtype: float64
  - name: vz0
    dtype: float64
  - name: ax
    dtype: float64
  - name: ay
    dtype: float64
  - name: az
    dtype: float64
  - name: sz_top
    dtype: float64
  - name: sz_bot
    dtype: float64
  - name: hit_distance_sc
    dtype: float64
  - name: launch_speed
    dtype: float64
  - name: launch_angle
    dtype: float64
  - name: effective_speed
    dtype: float64
  - name: release_spin_rate
    dtype: float64
  - name: release_extension
    dtype: float64
  - name: game_pk
    dtype: int64
  - name: fielder_2
    dtype: int64
  - name: fielder_3
    dtype: int64
  - name: fielder_4
    dtype: int64
  - name: fielder_5
    dtype: int64
  - name: fielder_6
    dtype: int64
  - name: fielder_7
    dtype: int64
  - name: fielder_8
    dtype: int64
  - name: fielder_9
    dtype: int64
  - name: release_pos_y
    dtype: float64
  - name: estimated_ba_using_speedangle
    dtype: float64
  - name: estimated_woba_using_speedangle
    dtype: float64
  - name: woba_value
    dtype: float64
  - name: woba_denom
    dtype: float64
  - name: babip_value
    dtype: float64
  - name: iso_value
    dtype: float64
  - name: launch_speed_angle
    dtype: float64
  - name: at_bat_number
    dtype: int64
  - name: pitch_number
    dtype: int64
  - name: pitch_name
    dtype: string
  - name: home_score
    dtype: int64
  - name: away_score
    dtype: int64
  - name: bat_score
    dtype: int64
  - name: fld_score
    dtype: int64
  - name: post_away_score
    dtype: int64
  - name: post_home_score
    dtype: int64
  - name: post_bat_score
    dtype: int64
  - name: post_fld_score
    dtype: int64
  - name: if_fielding_alignment
    dtype: string
  - name: of_fielding_alignment
    dtype: string
  - name: spin_axis
    dtype: float64
  - name: delta_home_win_exp
    dtype: float64
  - name: delta_run_exp
    dtype: float64
  - name: bat_speed
    dtype: float64
  - name: swing_length
    dtype: float64
  - name: estimated_slg_using_speedangle
    dtype: float64
  - name: delta_pitcher_run_exp
    dtype: float64
  - name: hyper_speed
    dtype: float64
  - name: home_score_diff
    dtype: int64
  - name: bat_score_diff
    dtype: int64
  - name: home_win_exp
    dtype: float64
  - name: bat_win_exp
    dtype: float64
  - name: age_pit_legacy
    dtype: int64
  - name: age_bat_legacy
    dtype: int64
  - name: age_pit
    dtype: int64
  - name: age_bat
    dtype: int64
  - name: n_thruorder_pitcher
    dtype: int64
  - name: n_priorpa_thisgame_player_at_bat
    dtype: int64
  - name: pitcher_days_since_prev_game
    dtype: float64
  - name: batter_days_since_prev_game
    dtype: float64
  - name: pitcher_days_until_next_game
    dtype: float64
  - name: batter_days_until_next_game
    dtype: float64
  - name: api_break_z_with_gravity
    dtype: float64
  - name: api_break_x_arm
    dtype: float64
  - name: api_break_x_batter_in
    dtype: float64
  - name: arm_angle
    dtype: float64
  - name: attack_angle
    dtype: float64
  - name: attack_direction
    dtype: float64
  - name: swing_path_tilt
    dtype: float64
  - name: intercept_ball_minus_batter_pos_x_inches
    dtype: float64
  - name: intercept_ball_minus_batter_pos_y_inches
    dtype: float64
  - name: VRA
    dtype: float64
  - name: HRA
    dtype: float64
  - name: PitchesThrown
    dtype: int32
  - name: IsStrike
    dtype: int32
  - name: IsGB
    dtype: float64
  - name: BatterName
    dtype: string
  splits:
  - name: train
    num_bytes: 3043941146
    num_examples: 2966881
  download_size: 441054634
  dataset_size: 3043941146
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
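
The size figures in the metadata above lend themselves to a quick sanity check before downloading. The following is a minimal sketch, not an official loader: the helper names `size_summary` and `stream_rows` are hypothetical, and `stream_rows` assumes the Hugging Face `datasets` package is installed and that the repository is reachable from your network. The arithmetic uses only the `num_bytes`, `num_examples`, and `download_size` values listed in the metadata.

```python
from itertools import islice
from typing import Any, Dict, Iterator

# Values copied from the dataset_info metadata.
DATASET_ID = "jab13/mlb-statcast-dataset"
NUM_BYTES = 3_043_941_146      # num_bytes / dataset_size
NUM_EXAMPLES = 2_966_881       # num_examples in the train split
DOWNLOAD_SIZE = 441_054_634    # download_size


def size_summary() -> Dict[str, float]:
    """Derive per-row and compression figures from the metadata."""
    return {
        # Average in-memory size of one pitch record, in bytes (about 1,026).
        "bytes_per_example": NUM_BYTES / NUM_EXAMPLES,
        # Ratio of dataset size to download size (about 6.9).
        "compression_ratio": NUM_BYTES / DOWNLOAD_SIZE,
        # Download size expressed in GiB.
        "download_gib": DOWNLOAD_SIZE / 2**30,
    }


def stream_rows(limit: int = 5) -> Iterator[Dict[str, Any]]:
    """Yield the first `limit` rows without downloading the full split.

    Assumption: requires network access and the `datasets` package.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    stream = load_dataset(DATASET_ID, split="train", streaming=True)
    yield from islice(stream, limit)


if __name__ == "__main__":
    for key, value in size_summary().items():
        print(f"{key}: {value:.3f}")
```

Streaming is used here because the split holds close to three million rows; inspecting a handful of records first is usually cheaper than materializing the whole table.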

Dataset information:

Features:
- pitch_type (string): pitch type code
- game_date (timestamp[ms]): game date
- release_speed (float64): pitch release speed
- release_pos_x (float64): release position, X coordinate
- release_pos_z (float64): release position, Z coordinate
- player_name (string): player name
- batter (int64): batter identifier
- pitcher (int64): pitcher identifier
- events (string): game event
- description (string): event description
- spin_dir (float64): spin direction
- spin_rate_deprecated (float64): spin rate (deprecated)
- break_angle_deprecated (float64): break angle (deprecated)
- break_length_deprecated (float64): break length (deprecated)
- zone (int64): strike-zone region
- des (string): detailed event text
- game_type (string): game type
- stand (string): batter's stance side
- p_throws (string): pitcher's throwing hand
- home_team (string): home team
- away_team (string): away team
- type (string): pitch result code (ball, strike, or in play)
- hit_location (float64): fielding position of the first fielder to touch the ball
- bb_type (string): batted-ball type
- balls (int64): balls in the count
- strikes (int64): strikes in the count
- game_year (int64): game year
- pfx_x (float64): pitch movement in the X direction
- pfx_z (float64): pitch movement in the Z direction
- plate_x (float64): X position of the ball as it crosses home plate
- plate_z (float64): Z position of the ball as it crosses home plate
- on_3b (float64): identifier of the runner on third base
- on_2b (float64): identifier of the runner on second base
- on_1b (float64): identifier of the runner on first base
- outs_when_up (int64): outs when the batter comes up
- inning (int64): inning number
- inning_topbot (string): top or bottom of the inning
- hc_x (float64): hit coordinate X of the batted ball
- hc_y (float64): hit coordinate Y of the batted ball
- tfs_deprecated (float64): timestamp (deprecated)
- tfs_zulu_deprecated (float64): Zulu (UTC) timestamp (deprecated)
- umpire (float64): umpire identifier
- sv_id (float64): play event ID
- vx0 (float64): initial velocity in the X direction
- vy0 (float64): initial velocity in the Y direction
- vz0 (float64): initial velocity in the Z direction
- ax (float64): acceleration in the X direction
- ay (float64): acceleration in the Y direction
- az (float64): acceleration in the Z direction
- sz_top (float64): height of the top of the strike zone
- sz_bot (float64): height of the bottom of the strike zone
- hit_distance_sc (float64): batted-ball distance
- launch_speed (float64): exit velocity of the batted ball
- launch_angle (float64): launch angle of the batted ball
- effective_speed (float64): effective pitch speed
- release_spin_rate (float64): spin rate at release
- release_extension (float64): release extension
- game_pk (int64): unique game identifier
- fielder_2 (int64): catcher identifier
- fielder_3 (int64): first baseman identifier
- fielder_4 (int64): second baseman identifier
- fielder_5 (int64): third baseman identifier
- fielder_6 (int64): shortstop identifier
- fielder_7 (int64): left fielder identifier
- fielder_8 (int64): center fielder identifier
- fielder_9 (int64): right fielder identifier
- release_pos_y (float64): release position, Y coordinate
- estimated_ba_using_speedangle (float64): estimated batting average from exit velocity and launch angle
- estimated_woba_using_speedangle (float64): estimated wOBA from exit velocity and launch angle
- woba_value (float64): wOBA value
- woba_denom (float64): wOBA denominator
- babip_value (float64): BABIP value
- iso_value (float64): ISO (isolated power) value
- launch_speed_angle (float64): combined exit velocity and launch angle category
- at_bat_number (int64): plate appearance number
- pitch_number (int64): pitch number within the plate appearance
- pitch_name (string): pitch name
- home_score (int64): home team score
- away_score (int64): away team score
- bat_score (int64): batting team score
- fld_score (int64): fielding team score
- post_away_score (int64): away team score after the pitch
- post_home_score (int64): home team score after the pitch
- post_bat_score (int64): batting team score after the pitch
- post_fld_score (int64): fielding team score after the pitch
- if_fielding_alignment (string): infield defensive alignment
- of_fielding_alignment (string): outfield defensive alignment
- spin_axis (float64): spin axis
- delta_home_win_exp (float64): change in home team win expectancy
- delta_run_exp (float64): change in run expectancy
- bat_speed (float64): bat speed
- swing_length (float64): swing length
- estimated_slg_using_speedangle (float64): estimated slugging percentage from exit velocity and launch angle
- delta_pitcher_run_exp (float64): change in run expectancy attributed to the pitcher
- hyper_speed (float64): hyper speed
- home_score_diff (int64): home team score differential
- bat_score_diff (int64): batting team score differential
- home_win_exp (float64): home team win expectancy
- bat_win_exp (float64): batting team win expectancy
- age_pit_legacy (int64): pitcher age (legacy definition)
- age_bat_legacy (int64): batter age (legacy definition)
- age_pit (int64): pitcher age
- age_bat (int64): batter age
- n_thruorder_pitcher (int64): pitcher's number of times through the batting order
- n_priorpa_thisgame_player_at_bat (int64): batter's prior plate appearances in this game
- pitcher_days_since_prev_game (float64): days since the pitcher's previous game
- batter_days_since_prev_game (float64): days since the batter's previous game
- pitcher_days_until_next_game (float64): days until the pitcher's next game
- batter_days_until_next_game (float64): days until the batter's next game
- api_break_z_with_gravity (float64): break in the Z direction, including gravity
- api_break_x_arm (float64): break in the X direction, relative to the pitcher's arm side
- api_break_x_batter_in (float64): break in the X direction, toward the batter
- arm_angle (float64): pitcher's arm angle
- attack_angle (float64): swing attack angle
- attack_direction (float64): swing attack direction
- swing_path_tilt (float64): swing path tilt
- intercept_ball_minus_batter_pos_x_inches (float64): X offset between the ball and the batter at contact, in inches
- intercept_ball_minus_batter_pos_y_inches (float64): Y offset between the ball and the batter at contact, in inches
- VRA (float64): vertical rotation angle (abbreviation not defined in the metadata)
- HRA (float64): horizontal rotation angle (abbreviation not defined in the metadata)
- PitchesThrown (int32): pitches thrown
- IsStrike (int32): whether the pitch is a strike
- IsGB (float64): whether the batted ball is a ground ball
- BatterName (string): batter name

Splits:
- train: 3043941146 bytes, 2966881 examples

Download size: 441054634 bytes. Total dataset size: 3043941146 bytes.

Configurations:
- default: data files for the train split at data/train-*
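
Some identifier-like columns in the feature list, such as `on_1b`, `on_2b`, and `on_3b`, are typed `float64` rather than `int64`. A plausible reason, stated here as an assumption rather than something the metadata confirms, is that these columns are empty when no runner occupies the base, and a float column can hold a missing value. The sketch below shows one way to normalize such a row; the helper names and the sample row are hypothetical, and the sample values are made up for illustration.

```python
import math
from typing import Any, Dict, Mapping, Optional

# Base-runner columns listed as float64 in the feature list.
BASE_COLUMNS = ("on_1b", "on_2b", "on_3b")


def _is_missing(value: Any) -> bool:
    """Treat both None (Arrow/streaming) and NaN (pandas) as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def runner_ids(row: Mapping[str, Any]) -> Dict[str, Optional[int]]:
    """Return an integer ID per base, or None where the base is empty."""
    return {
        col: None if _is_missing(row.get(col)) else int(row[col])
        for col in BASE_COLUMNS
    }


def count_label(row: Mapping[str, Any]) -> str:
    """Format the balls-strikes count, e.g. '3-2', from the int64 columns."""
    return f"{row['balls']}-{row['strikes']}"


if __name__ == "__main__":
    # Made-up row for illustration only; not taken from the dataset.
    sample = {"on_1b": 123456.0, "on_2b": float("nan"), "on_3b": None,
              "balls": 3, "strikes": 2}
    print(runner_ids(sample))
    print(count_label(sample))
```

Checking for both `None` and NaN keeps the helper usable whether rows come from a streaming iterator or from a pandas DataFrame.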
Provider:
jab13