ClinSpEn-CT Data: Parallel English-Spanish Biomedical Terminology
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6497372
Description:
UPDATE August 22nd 2022: The data in this repository has been merged with the rest of the ClinSpEn data; you can access it here: https://doi.org/10.5281/zenodo.6497350
This repository contains the sample, test and background data for the ClinSpEn-Clinical Terms sub-track. The direction of this sub-track is ES>EN.
ClinSpEn is part of the Biomedical WMT 2022 shared task. It aims to promote the development and evaluation of machine translation systems adapted to the medical domain through three highly relevant sub-tracks: clinical cases, medical controlled vocabularies/ontologies, and clinical terms and entities extracted from medical content.
The terms were extracted directly from medical literature and clinical records, with a particular focus on diseases, symptoms, findings, procedures, and professions, and were translated and revised by professional medical translators.
The sample set contains 7 000 terms as a tab-separated file (TSV), with the first column corresponding to English terms and the second column to Spanish terms.
The test and background data is made up of a TSV file with two columns: term number and Spanish term.
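The two file layouts described above can be read with the same small helper. The following is a minimal sketch, not the organizers' tooling: it assumes the files are UTF-8, have no header row, and use no CSV-style quoting, none of which is stated in this description. The function name `read_two_column_tsv` and the inline example rows are hypothetical and made up purely for illustration.

```python
import csv
import io
from typing import Iterable, List, Tuple


def read_two_column_tsv(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse tab-separated rows into (first, second) pairs.

    Rows that do not have exactly two columns (e.g. blank lines) are skipped.
    """
    # QUOTE_NONE is an assumption: quote characters are treated as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs: List[Tuple[str, str]] = []
    for row in reader:
        if len(row) != 2:
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


# Made-up stand-ins for the real files (illustrative rows only).
sample_text = "heart failure\tinsuficiencia cardíaca\nheadache\tcefalea\n"
test_text = "1\tdolor torácico\n2\tfiebre\n"

# Sample set: column 1 is the English term, column 2 the Spanish term.
sample_pairs = read_two_column_tsv(io.StringIO(sample_text))

# Test/background set: column 1 is the term number, column 2 the Spanish term.
test_terms = read_two_column_tsv(io.StringIO(test_text))

# The sub-track direction is ES>EN, so index the sample pairs by Spanish term.
es_to_en = {es: en for en, es in sample_pairs}
```

To read an actual file instead of the inline strings, pass an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` is wherever the downloaded TSV was saved.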
Related Links:
- Sub-track website with more information: https://temu.bsc.es/clinspen/
- WMT website: https://www.statmt.org/wmt22/
- CodaLab: https://codalab.lisn.upsaclay.fr/competitions/6696
- ClinSpEn-CC (Clinical Cases): https://doi.org/10.5281/zenodo.6497350
- ClinSpEn-CT (Clinical Terms): https://doi.org/10.5281/zenodo.6497372
- ClinSpEn-OC (Ontology Concepts): https://doi.org/10.5281/zenodo.6497388
Created:
2022-08-22



