Semantic Role Labeling Dataset for Indonesian Conversational Corpora
Source: arXiv | Updated: 2018-06-05 | Indexed: 2024-06-21
Download link:
https://kata.ai/case-studies/jemma
Overview:
The Semantic Role Labeling Dataset for Indonesian Conversational Corpora was created by Kata.ai and covers the Indonesian conversational domain. It contains 6057 unique sentences, each of which includes a predicate. The sentences come from conversations between users and a virtual-friend chatbot and were anonymized to protect user privacy. Three annotators with linguistics backgrounds labeled the data using the PropBank semantic role labeling framework, extended with one additional role, "GREET". The dataset is intended primarily for semantic role labeling (SRL) research, in particular SRL for low-resource languages in the conversational domain.
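To make the annotation scheme concrete, the following is a minimal sketch of how a PropBank-style labeled sentence with the extra "GREET" role could be represented and decoded. The BIO tag encoding, the tag names `ARG0`, `ARG1`, and `V`, the helper `bio_to_spans`, and the example sentence are all assumptions made for illustration; they are not taken from the dataset, whose actual file format is not described here.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (role, text) spans.

    Hypothetical helper: assumes one tag per token, with tags of the form
    "B-ROLE", "I-ROLE", or "O".
    """
    spans: List[Tuple[str, str]] = []
    label = None  # role of the span currently being built
    start = 0     # index of the first token of that span
    # A trailing "O" sentinel flushes the last open span.
    for i, tag in enumerate(tags + ["O"]):
        if tag.startswith("I-") and label == tag[2:]:
            continue  # token continues the current span
        if label is not None:
            spans.append((label, " ".join(tokens[start:i])))
            label = None
        if tag.startswith("B-") or tag.startswith("I-"):
            label, start = tag[2:], i
    return spans


# Invented example sentence ("hello, I like milk coffee"), not from the dataset.
tokens = ["halo", "aku", "suka", "kopi", "susu"]
tags = ["B-GREET", "B-ARG0", "B-V", "B-ARG1", "I-ARG1"]

for role, text in bio_to_spans(tokens, tags):
    print(f"{role}: {text}")
```

Here the greeting token receives its own role rather than being left unlabeled, which is the kind of conversational phenomenon the added "GREET" role is meant to capture.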
Provider:
Kata.ai
Created:
2018-06-05

The content above was collected and summarized by 遇见数据集.



