Cerberose/vietnamese-classification-dataset
Source: Hugging Face. Updated 2026-02-04; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/Cerberose/vietnamese-classification-dataset
Description:
The Vietnamese Classification Dataset is a large-scale corpus of Vietnamese text samples, each annotated with one of three class labels. It was created to support research and development in Vietnamese Natural Language Processing (NLP), particularly text classification, topic modeling, and supervised learning. With over 11 million examples, the dataset offers the scale and linguistic diversity needed to train robust, data-driven language models and classification systems for Vietnamese.
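With more than 11 million examples, it may be more practical to stream the corpus than to download it in full. The sketch below shows one way to check the balance of the three classes on a prefix of the data using the Hugging Face `datasets` library. The split name `"train"` and the column name `"label"` are assumptions, since the listing does not document the schema; adjust both after inspecting the dataset.

```python
from collections import Counter
from itertools import islice
from typing import Any, Dict, Iterable

# Repository id taken from the listing; everything else below is illustrative.
DATASET_ID = "Cerberose/vietnamese-classification-dataset"


def label_distribution(
    records: Iterable[Dict[str, Any]],
    label_key: str = "label",  # ASSUMPTION: the label column name is not documented
    limit: int = 10_000,
) -> Counter:
    """Count label values in the first `limit` records of an iterable of dicts."""
    return Counter(rec[label_key] for rec in islice(records, limit))


def stream_train_split():
    """Open the dataset lazily so no full download is needed.

    ASSUMPTION: a split named "train" exists in the repository.
    """
    # Imported here so label_distribution works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="train", streaming=True)
```

Calling `label_distribution(stream_train_split())` would then tally labels over the first 10,000 streamed examples. A prefix is not a random sample, so the counts are only a rough indication of class balance.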
Provided by:
Cerberose



