Classifying Gender Biased Language in University of Edinburgh Heritage Collections Archival Metadata Descriptions
Source: Scottish Government Open Data Portal | Updated: 2023-11-10 | Indexed: 2026-03-28
Download link:
https://doi.org/10.7488/ds/7539
Description:
These datasets were used to create discriminative text classification models to identify potentially gender biased language. There are datasets for three types of classification models: multilabel document classifiers, multiclass sequence classifiers, and multilabel token classifiers. The data source is the Archives catalog of the University of Edinburgh's Heritage Collections. The archival metadata descriptions extracted from the catalog were labeled according to the Taxonomy of Gendered and Gender Biased Language (published in Havens et al., 2021, linked to as a related paper). Details of the datasets' creation and contents are documented in the Ph.D. thesis by Lucy Havens titled, 'Recalibrating Machine Learning for Social Biases: Demonstrating a New Methodology through a Case Study Classifying Gender Biases in Archival Documentation,' as well as the related papers and GitHub repositories linked to this record.
Created:
2023-11-10



