sanganaka/NeCTIS-Dataset

Source: Hugging Face. Updated 2025-07-29; indexed 2025-08-09.
Download link:
https://hf-mirror.com/datasets/sanganaka/NeCTIS-Dataset
Description:
The DepNeCTI-LSTM dataset is a specialized dataset for nested compound type identification in Sanskrit. It comes in two versions: NeCTIS (in-domain, prose) and NeCTIS-OOD (out-of-domain, poetry). Both are carefully annotated with semantic types at two levels of granularity: the coarse-grained annotation covers four broad compound types, while the fine-grained annotation comprises 86 detailed sub-types. Construction of the dataset was supported by DeitY, and the annotations were validated across institutions by multiple teams of linguistic experts.
Provider:
sanganaka