
Lexicon of endometriosis and menopause terms in Spanish and English for fine-tuning language models

Source: DataCite Commons. Updated: 2025-11-12. Indexed: 2025-04-10.
Download link:
https://edatos.consorciomadrono.es/citation?persistentId=doi:10.21950/SMVFJC
Description:
<p>The project 'DIGITENDER: Extracción terminológica automática y corpora de dominios específicos para la visibilización de los problemas de salud relacionados con la mujer' aims to develop an automatic term extractor focused on endometriosis and menopause, to raise the visibility of women's health issues that are often overlooked. To this end, a language model is trained on terms annotated in context. This publication is the dataset manually annotated by linguists. It comprises two lexicons, one of endometriosis terms and one of menopause terms, each available in Spanish and English (four lexicons in total). The terms were obtained from the gold-standard text annotation for these topics and are intended for training a language model to act as a specialized term extractor. The files are plain text but follow a CSV structure: the first column contains the lemmas, and the second lists the different forms and variants of each lemma.</p>
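The two-column layout described above could be read along the following lines. This is only a minimal sketch: the description does not state the column delimiter or how multiple variants are separated within the second column, so the comma delimiter, the `;` variant separator, the function names `load_lexicon` and `surface_to_lemma`, and the sample rows are all assumptions made for illustration. Check the actual files before relying on any of them.

```python
import csv
import io


def load_lexicon(text, delimiter=",", variant_sep=";"):
    """Parse lemma/variants rows into a dict: lemma -> list of variants.

    Both `delimiter` and `variant_sep` are assumptions; the dataset
    description only says the files follow a CSV structure.
    """
    lexicon = {}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Skip blank lines and rows without a lemma.
        if not row or not row[0].strip():
            continue
        lemma = row[0].strip()
        raw_variants = row[1] if len(row) > 1 else ""
        variants = [v.strip() for v in raw_variants.split(variant_sep) if v.strip()]
        lexicon.setdefault(lemma, []).extend(variants)
    return lexicon


def surface_to_lemma(lexicon):
    """Invert the lexicon: lower-cased surface form -> lemma."""
    index = {}
    for lemma, variants in lexicon.items():
        index[lemma.lower()] = lemma
        for variant in variants:
            index[variant.lower()] = lemma
    return index


# Hypothetical sample rows, not taken from the dataset.
sample = (
    "endometriosis,endometriosis;endometrioses\n"
    "menopause,menopause;menopausal\n"
)

lexicon = load_lexicon(sample)
index = surface_to_lemma(lexicon)
```

An inverted index of this kind is one simple way to map a form found in text back to its lemma, for example when preparing labels for a term extractor.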
Provider:
e-cienciaDatos
Created:
2025-03-07