proxectonos/Galician_NER
Hugging Face | Updated 2025-12-17 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/proxectonos/Galician_NER
Description:
Galician Named Entity Recognition (NER) test dataset created by combining four Galician NER datasets (corNER, LREC, PUD, and TreeGal), annotated according to the new standards for NER annotation. All datasets were manually annotated from random extractions of a Galician journalistic corpus. Each label marks its token as the first token of a proper noun (B), a token inside a proper noun (I), or any other grammatical element (O). Proper nouns are classified according to the ENAMEX standard notation: PER (person), ORG (organization), LOC (location), MISC (other). This work is funded by the Ministerio para la Transformación Digital y de la Función Pública and by the EU (NextGenerationEU), within the framework of the project Desarrollo de Modelos ALIA.
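To illustrate the B/I/O labelling described above, here is a minimal Python sketch that groups tagged tokens into entity spans. It assumes the tags are written in the common `B-PER` / `I-PER` / `O` form; the description only names the B, I, O positions and the four classes, so the exact tag strings, the helper name `bio_to_spans`, and the example sentence are all illustrative assumptions rather than details taken from the dataset.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumes tags look like "B-PER", "I-PER" or "O" (an assumption, see above).
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" (or an I- tag that does not continue the open entity)
            # closes the open entity; the token itself is not kept.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical example sentence, not taken from the dataset.
tokens = ["Rosalía", "de", "Castro", "naceu", "en", "Santiago", "de", "Compostela", "."]
tags = ["B-PER", "I-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC", "O"]

entities = bio_to_spans(tokens, tags)
print(entities)
```

Under these assumptions the sketch yields one PER span and one LOC span for the example sentence. If the dataset stores labels as integer ids or uses a different tag spelling, the tags would need to be mapped to strings first.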
Provided by:
proxectonos



