ITRA-NATIVE v1.0.0: A gender-disaggregated dataset of trail running participation across 94 countries (2003-2025)
Source: DataCite Commons. Updated 2026-05-03; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19997352
Description:
ITRA-NATIVE (International Trail Running Association - Natured Aggregated Trail Index for Equity) is a gender-disaggregated research dataset derived from the ITRA public registry. It provides edition-level data on 14,801 trail running race editions across 94 countries (2003-2025), enabling quantitative analysis of gendered participation patterns in endurance sport at a global scale. The dataset was constructed through a six-layer computational pipeline (extraction, NLP, clustering, modelling, geomatics, visualisation) developed within the TRAILGENDER research programme (CNRS, UMR ESO 6590).
The dataset addresses a critical gap in sports science. A recent systematic scoping review (Espasa-Labrador et al., 2026) identified only 22 published studies specifically examining female trail running, with the vast majority focused on biomedical variables (physiology, nutrition, injuries) and none adopting a sociological or spatial perspective. ITRA-NATIVE fills this gap by providing the first large-scale, openly available, gender-disaggregated infrastructure for studying women's participation in trail running from social science, geographic, and computational perspectives.
The dataset includes: gender participation ratios (percentage of women finishers per edition), gendered performance gaps (median, winner, and top-10 finish time differentials), course characteristics (distance, elevation gain, technicality index), geographic coordinates and country-level aggregations, temporal trends spanning two decades, a four-cluster race typology derived from Gaussian Mixture Modelling and Ward hierarchical clustering, and event survival data tracking 21,177 race events. All variables are systematically disaggregated by sex, enabling five priority analytical axes: (1) ecology of gendered participation, (2) the ultra-distance paradox and female self-selection, (3) structural exclusion thresholds, (4) elite visibility and podium representation, and (5) temporal recomposition of gender gaps.
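The two headline indicators described above (women's share of finishers per edition, and the gendered gap in finish times) can be sketched as follows. This is a minimal illustration, not the TRAILGENDER pipeline. The column names (`finishers_f`, `finishers_m`, `median_time_f_s`, `median_time_m_s`) and the sample rows are hypothetical, and expressing the gap as a percentage of the men's time is an assumption. Check the CSV headers and the variable definitions shipped with the archive before reusing it.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample: column names and values are illustrative only and are
# NOT taken from the ITRA-NATIVE files.
SAMPLE = """edition_id,country,year,finishers_f,finishers_m,median_time_f_s,median_time_m_s
e1,FR,2019,120,480,30000,27000
e2,FR,2020,50,150,45000,40000
e3,IT,2019,30,270,20000,18000
"""


def pct_women(finishers_f, finishers_m):
    """Percentage of women among finishers; None when an edition has no finishers."""
    total = finishers_f + finishers_m
    return 100.0 * finishers_f / total if total else None


def relative_gap(time_f, time_m):
    """Women's time minus men's time, as a percentage of the men's time (assumed definition)."""
    return 100.0 * (time_f - time_m) / time_m


def summarise(csv_text):
    """Unweighted mean of edition-level women's share, grouped by country."""
    shares = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        share = pct_women(int(row["finishers_f"]), int(row["finishers_m"]))
        if share is not None:
            shares[row["country"]].append(share)
    return {country: mean(values) for country, values in shares.items()}


country_share = summarise(SAMPLE)
```

The country figure here is an unweighted mean of edition-level ratios, so a small race counts as much as a large one. Weighting by the number of finishers would answer a different question (the share of women among all finishers in a country).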
The sex variable used in this dataset refers to the administrative sex category recorded by race organisers through the ITRA registry. It operates as a binary classification (M/F) reflecting registration categories, not an ontological claim about gender identity. The dataset does not capture non-binary, transgender, or intersex participation, which constitutes a structural limitation inherited from the source data.
ITRA-NATIVE is distributed as a ZIP archive containing 16 CSV files, 60 PNG visualisations, and 51 analytical outputs. It is designed to be FAIR-compliant (Findable, Accessible, Interoperable, Reusable) and follows open science principles. The dataset is released under the CC BY 4.0 International licence.
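A quick way to check a downloaded copy against the inventory above is to count archive members by file extension. The sketch below uses only the Python standard library. The record does not state the archive's file name or internal layout, so the demo builds a small in-memory ZIP with hypothetical member names instead of opening the real file.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(archive):
    """Count the files in a ZIP archive (path or file-like object) by lowercase extension."""
    with zipfile.ZipFile(archive) as zf:
        return Counter(
            PurePosixPath(info.filename).suffix.lower()
            for info in zf.infolist()
            if not info.is_dir()
        )


def build_demo_archive():
    """Build a tiny in-memory ZIP; the member names are hypothetical placeholders."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("data/editions.csv", "edition_id\n")
        zf.writestr("data/events.csv", "event_id\n")
        zf.writestr("figures/map.png", b"")
    buffer.seek(0)
    return buffer


demo_counts = count_by_extension(build_demo_archive())
```

For the real download, pass the path to the ZIP file to `count_by_extension`. If the copy matches the description, the `.csv` and `.png` counts should be 16 and 60. The formats of the 51 analytical outputs are not specified in the record.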
Companion to: Plard, M. (2026). ITRA-NATIVE: a global gender-disaggregated dataset of trail running participation (2003-2025). Scientific Data [submitted].
Provider:
Zenodo
Created:
2026-05-03



