Yugrathee28/hinglish-dataset-1.4M
Source: Hugging Face. Updated: 2026-04-23. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Yugrathee28/hinglish-dataset-1.4M
Description:
The Hinglish Dataset is an industrial-grade NLP dataset of more than 1.46 million Hinglish (Hindi-English code-mixed) comments sourced from YouTube. The comments have been cleaned and labeled with 9 intent classes and 10 emotion classes, and each comment carries a quality score and a label confidence score. The dataset is intended for research and evaluation purposes, and suits work on conversational AI, sentiment analysis, and low-resource NLP.
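Because each comment carries a quality score and a label confidence score, a common first step is to keep only the rows that clear a threshold on both. The sketch below shows that filter on in-memory rows. The field names `quality_score` and `label_confidence`, the 0-to-1 score range, the threshold values, and the placeholder rows are all assumptions made for illustration; none of them come from the dataset itself, so check the dataset card for the real schema.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical field names and score range (0.0 to 1.0); the real
# schema may differ, so adjust these after reading the dataset card.
QUALITY_FIELD = "quality_score"
CONFIDENCE_FIELD = "label_confidence"


def filter_rows(
    rows: Iterable[Dict[str, Any]],
    min_quality: float = 0.8,
    min_confidence: float = 0.9,
) -> Iterator[Dict[str, Any]]:
    """Yield only rows whose two scores both meet their thresholds."""
    for row in rows:
        if (
            row[QUALITY_FIELD] >= min_quality
            and row[CONFIDENCE_FIELD] >= min_confidence
        ):
            yield row


# Invented placeholder rows, not taken from the dataset.
sample_rows = [
    {"text": "placeholder comment A", "quality_score": 0.95, "label_confidence": 0.97},
    {"text": "placeholder comment B", "quality_score": 0.40, "label_confidence": 0.99},
    {"text": "placeholder comment C", "quality_score": 0.90, "label_confidence": 0.50},
]

kept = list(filter_rows(sample_rows))
```

With the placeholder rows above, only the first row passes both thresholds. The function accepts any iterable of dict-like rows, so the same filter could be applied while streaming a large file rather than loading all rows into memory.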
Provider:
Yugrathee28



