MDMIC - Indic Cross-Domain, Multi-Intent NLU Dataset

Name: MDMIC - Indic Cross-Domain, Multi-Intent NLU Dataset
Creator: Kathakali Mitra
License: No description available

Indexed from IEEE on 2026-04-17

Download link:

https://ieee-dataport.org/documents/mdmic-indic-cross-domain-multi-intent-nlu-dataset

Description:

We construct MDMIC, an Indic benchmark corpus, through data augmentation of the multilingual benchmark dataset MASSIVE. It targets three NLU tasks, namely intent detection (ID), domain classification (DC), and slot filling (SF), on complex utterances in low-resource languages. These utterances span multiple domains and intents and have complex, multi-sentence structures. Each class in this dataset is a combination of multiple intents and domains. Six dataset files are uploaded, one per language: Hindi, Bengali, Tamil, Telugu, Kannada, and Malayalam. Each file has the following fields:

Complex Sentence: the user utterance
Intents: the multiple intents in the user utterance
Scenarios: the domain information (contains cross-domain data)
annot_uts: slot/entity-level information in the specified language
num_scenarios: the number of domains
annot_utts_eng: slot/entity-level information in English
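
The slot/entity fields can be read with a small parser. The following is a minimal sketch, not the dataset's official tooling. It assumes that annot_uts and annot_utts_eng keep the bracketed "[slot_name : slot value]" markup used by MASSIVE, which the description above does not confirm. The function names and the sample string are hypothetical illustrations and are not taken from the dataset files.

```python
import re

# Assumption: slots are marked up MASSIVE-style as "[slot_name : slot value]".
# The MDMIC description does not state the markup, so check a real row first.
SLOT_PATTERN = re.compile(r"\[\s*([^\[\]:]+?)\s*:\s*([^\[\]]+?)\s*\]")


def parse_slots(annotated):
    """Return (slot_name, slot_value) pairs in order of appearance."""
    return [(name, value) for name, value in SLOT_PATTERN.findall(annotated)]


def strip_slots(annotated):
    """Replace each bracketed slot with its value, giving the plain utterance."""
    return SLOT_PATTERN.sub(lambda match: match.group(2), annotated)


# Hypothetical English example, written for illustration only.
sample = "wake me up at [time : nine am] on [date : friday]"
slots = parse_slots(sample)
plain = strip_slots(sample)
```

With the hypothetical sample, `slots` holds the pairs for `time` and `date`, and `plain` is the utterance with the brackets removed. If the real files use a different markup, only `SLOT_PATTERN` would need to change.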

Provided by:

Kathakali Mitra
