KIND

Name: KIND
Creator: Bruno Kessler Foundation, Via Sommarive 18, Trento, Italy
Published: 2022-06-14 16:03:48
License: No description available

Source: arXiv; updated 2022-06-14; indexed 2024-06-21

Download link:

https://github.com/dhfbk/KIND

Resource overview:

KIND is an Italian multi-domain named entity recognition (NER) dataset created by the Bruno Kessler Foundation. It contains over one million tokens, of which about 600,000 carry manual gold-standard annotations. The dataset covers three domains: news, literature, and political speeches. Its main strength is this multi-domain coverage, which spans different writing styles and patterns of language use, and it is described as the largest Italian NER dataset currently available. The texts and annotations can be downloaded freely from the GitHub repository and are suitable for training Italian NER systems.
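As a rough illustration of how token-level NER annotations like these are typically consumed before training, the sketch below parses a small in-memory sample. The file layout is an assumption, not something stated in this listing: one token and one BIO tag per line, separated by a tab, with blank lines between sentences. The label names (PER, LOC, ORG), the sample text, and the helper names `read_sentences` and `extract_entities` are likewise hypothetical; check the repository for the actual format.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Sentence = List[Tuple[str, str]]


def read_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Group 'token<TAB>tag' lines into sentences (assumed layout)."""
    sentence: Sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip malformed lines rather than guess a tag
        sentence.append((parts[0], parts[1]))
    if sentence:
        yield sentence


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Collect (entity_text, label) spans from BIO tags."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label: Optional[str] = None
    for token, tag in sentence:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-"):
            tokens.append(token)  # continue the current entity
        else:
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append((" ".join(tokens), label))
    return entities


# Hypothetical sample in the assumed format, not taken from the dataset.
SAMPLE = (
    "Mario\tB-PER\nRossi\tI-PER\nvive\tO\na\tO\nTrento\tB-LOC\n.\tO\n"
    "\n"
    "La\tO\nFondazione\tB-ORG\nBruno\tI-ORG\nKessler\tI-ORG\n"
)

sentences = list(read_sentences(SAMPLE.splitlines()))
entities = [extract_entities(s) for s in sentences]
```

To use this on a downloaded file, the same functions could be fed an open file handle instead of `SAMPLE.splitlines()`, provided the file really follows the assumed layout.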

Provider:

Bruno Kessler Foundation, Via Sommarive 18, Trento, Italy

Created:

2021-12-30
