sanganaka/NeCTIS-Dataset

Name: sanganaka/NeCTIS-Dataset
Creator: sanganaka
Published: 2025-07-29 15:43:26
License: not specified

Source: Hugging Face. Updated 2025-07-29; indexed 2025-08-09.

Download link:

https://hf-mirror.com/datasets/sanganaka/NeCTIS-Dataset
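The link above points at a Hugging Face mirror, so the repository can presumably be fetched with the `huggingface_hub` client. The following is a minimal sketch, not an official loader: the helper names `dataset_page_url` and `download_snapshot` are hypothetical, the `endpoint` argument assumes a reasonably recent `huggingface_hub` release, and nothing is assumed about the file layout inside the repository.

```python
MIRROR = "https://hf-mirror.com"       # mirror endpoint taken from the link above
REPO_ID = "sanganaka/NeCTIS-Dataset"   # repository id taken from the listing


def dataset_page_url(repo_id: str = REPO_ID, endpoint: str = MIRROR) -> str:
    """Build the dataset page URL on the given Hub endpoint (hypothetical helper)."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_snapshot(repo_id: str = REPO_ID, endpoint: str = MIRROR) -> str:
    """Download every file in the dataset repo and return the local folder path.

    Needs network access and `pip install huggingface_hub`.
    """
    # Imported lazily so the URL helper works without the dependency installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, repo_type="dataset", endpoint=endpoint)
```

Calling `download_snapshot()` returns a local directory; inspect the files there before deciding how to parse them, since the listing does not describe the file format.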


Description:

The DepNeCTI-LSTM dataset is a specialized dataset for nested compound type identification in Sanskrit. It comes in two versions: NeCTIS (in-domain, prose) and NeCTIS-OOD (out-of-domain, poetry). The data carries both coarse-grained and fine-grained semantic type annotations: the coarse-grained scheme has four broad compound types, and the fine-grained scheme has 86 detailed sub-types. Construction of the dataset was supported by DeitY, and the annotations were validated across institutions by several teams of linguistic experts.

Provider:

sanganaka
