Ugiat/ner-cat

Name: Ugiat/ner-cat
Creator: Ugiat
Published: 2025-03-19 12:10:24
License: not specified

Source: Hugging Face. Updated 2025-03-19; indexed 2025-04-26.

Download link:

https://hf-mirror.com/datasets/Ugiat/ner-cat

Description:

The NERCat dataset is a manually annotated collection of Catalan-language television transcriptions designed to improve Named Entity Recognition (NER) performance for the Catalan language. The dataset includes 9,242 sentences and 13,732 named entities annotated across eight categories: Person, Facility, Organization, Location, Product, Event, Date, and Law. It aims to address the lack of high-quality annotated data for Catalan and supports the development of NLP applications in Catalan media, governance, and cultural domains. The dataset structure is compatible with the GLiNER framework, and JSON-formatted instance examples are provided.
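
To make the GLiNER compatibility concrete, here is a minimal sketch of how one JSON instance might be read. It assumes the layout commonly used for GLiNER training data: a `tokenized_text` list plus a `ner` list of `[start, end, label]` spans with an inclusive end index. The sample sentence, the exact label strings, and the helper `extract_entities` are hypothetical illustrations, not taken from the dataset; check the field names against the actual files before relying on them.

```python
import json

# Hypothetical instance, NOT taken from NERCat. It assumes the GLiNER-style
# layout: "tokenized_text" (list of tokens) plus "ner" (list of
# [start_token, end_token, label] spans with an inclusive end index).
sample_json = """
{
  "tokenized_text": ["La", "Generalitat", "de", "Catalunya", "es",
                     "reuneix", "a", "Barcelona", "."],
  "ner": [[1, 3, "Organization"], [7, 7, "Location"]]
}
"""


def extract_entities(instance):
    """Return (entity text, label) pairs from one GLiNER-style instance."""
    tokens = instance["tokenized_text"]
    entities = []
    for start, end, label in instance["ner"]:
        # The end index is treated as inclusive, hence the +1 in the slice.
        entities.append((" ".join(tokens[start:end + 1]), label))
    return entities


instance = json.loads(sample_json)
entities = extract_entities(instance)
for text, label in entities:
    print(f"{label}: {text}")
```

If the real files use different key names or an exclusive end index, only the dictionary lookups and the slice in `extract_entities` would need to change.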

Provided by:

Ugiat
