Given Name Prevalence for Cumulative Gender Analysis
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14025759
Description:
This dataset contains the 15,000 most prominent given names from Wikidata, each with a calculated prevalent gender based on the genders assigned to the people in Wikidata who have that name. It also includes the software components and an introduction on how to produce an updated version of the dataset at a later point.
The description of TETTRIs Task 3.2, "Automatic mapping of taxonomic expertise", states that gender balance should be one of the factors profiled for the various expert groups. Since the analysis of these groups is meant to be automatic, the gender balance of a group has to be estimated without manual curation. One approach under consideration is to base this estimate on the given names of the identified experts. This repository lays the groundwork for such an approach.
This is clearly a heuristic approach. The data from this repository is not to be used to assess the gender of any individual, but only to determine the gender balance among a group of people, with room for statistical error.
We are aware that this approach relies on many oversimplifications as well as on biases in the underlying data. Some of these are addressed in the file README.md, which is included in the dataset.
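The group-level estimate described above could be sketched roughly as follows. This is a minimal illustration, not the repository's own code: the lookup table, its gender labels, the example names, and the function `estimate_gender_balance` are all assumptions made for this sketch, and the actual file layout of the dataset should be taken from its README.md.

```python
from collections import Counter

# Hypothetical lookup: given name -> prevalent gender label.
# The real dataset's file name, columns, and labels may differ.
PREVALENT_GENDER = {
    "maria": "female",
    "anna": "female",
    "john": "male",
    "peter": "male",
}


def estimate_gender_balance(given_names, lookup):
    """Estimate the gender balance of a group from a list of given names.

    Returns (shares, coverage): the share of each gender label among the
    names found in `lookup`, and the fraction of names that were matched.
    Unmatched names are excluded from the shares, never guessed.
    """
    counts = Counter()
    for name in given_names:
        label = lookup.get(name.strip().lower())
        if label is not None:
            counts[label] += 1

    matched = sum(counts.values())
    total = len(given_names)
    shares = {label: n / matched for label, n in counts.items()} if matched else {}
    coverage = matched / total if total else 0.0
    return shares, coverage


# Illustrative group; "Xyz" stands in for a name missing from the lookup.
group = ["Maria", "John", "Peter", "Anna", "Xyz", "Peter"]
shares, coverage = estimate_gender_balance(group, PREVALENT_GENDER)
```

The sketch reports only aggregate shares together with a coverage figure and returns no per-person label, in keeping with the restriction that the data is not to be used to assess the gender of any individual. A low coverage value signals that the estimate rests on few matched names and should be treated with extra caution.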
Created:
2024-11-01



