Given Name Prevalence for Cumulative Gender Analysis

Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/14025759
Description:
This dataset contains the 15,000 most prominent given names from Wikidata, each with its prevalent gender, calculated from the genders assigned to the people in Wikidata who have that name. It also includes the software components and an introduction to producing an updated version of the dataset at a later point. The description of TETTRIs Task 3.2, "Automatic mapping of taxonomic expertise", states that gender balance should be one of the factors profiled for the various expert groups. Because the analysis of these groups should be done automatically, the gender balance of a group has to be estimated without manual curation. One approach under consideration is to base this estimate on the given names of the identified experts. This repository lays the groundwork for such an approach. This is clearly a heuristic approach. The data from this repository is not to be used to assess the gender of any individual, only to determine the gender balance among a group of people, with room for statistical error. We are aware that this approach relies on many oversimplifications as well as biases in the underlying data; some of these are addressed in the file README.md, included in the dataset.
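The group-level estimate described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual software: the table layout (given name mapped to the share of people recorded as female), the function name `estimate_balance`, and the numeric values are all hypothetical placeholders and are not taken from the dataset.

```python
# Hypothetical prevalence table: lowercase given name -> share of people with
# that name recorded as female. Placeholder values, NOT from the dataset.
PREVALENCE = {
    "maria": 0.98,
    "john": 0.01,
    "andrea": 0.55,
}


def estimate_balance(given_names, prevalence):
    """Estimate the female share of a group from given names.

    Returns (expected_share, matched, unmatched). expected_share is None
    when no name in the group is found in the table.
    """
    total = 0.0
    matched = 0
    unmatched = 0
    for name in given_names:
        share = prevalence.get(name.strip().lower())
        if share is None:
            # Names missing from the table are counted but not guessed.
            unmatched += 1
            continue
        total += share
        matched += 1
    expected = total / matched if matched else None
    return expected, matched, unmatched


# Aggregate use only: one estimate for the whole group, never per person.
share, matched, unmatched = estimate_balance(["Maria", "John", "Zed"], PREVALENCE)
```

Averaging per-name shares instead of assigning a label to each person keeps the output at the group level, in line with the restriction stated in the description. Reporting the unmatched count alongside the estimate gives a rough sense of how much of the group the figure actually covers.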
Created:
2024-11-01