midnightGlow/BanglaSumXL_Categories
Hugging Face | Updated 2024-12-19 | Indexed 2024-12-21
Download link:
https://hf-mirror.com/datasets/midnightGlow/BanglaSumXL_Categories
Description:
This upgraded version of the dataset adds a category column that indicates the category each text belongs to. The texts were manually annotated across 7 categories, including International, State, Entertainment, Economy, Education, and Technology. The dataset is useful for NLP tasks such as Bangla text classification. Proper text classification datasets are scarce for low-resource languages like Bangla: some Bangla sentiment classification datasets exist, but few label the topical category of a text, and this dataset helps fill that gap.
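As a rough illustration of how the category column might be used, the sketch below counts examples per category, which is a common first check before training a classifier. The column name `category`, the split name `train`, and the sample rows are assumptions made for illustration and are not confirmed by the dataset description; `load_rows` is a hypothetical helper that relies on the Hugging Face `datasets` package.

```python
from collections import Counter

# Category names given in the dataset description
# (it reports 7 categories; only these six are named).
NAMED_CATEGORIES = [
    "International", "State", "Entertainment",
    "Economy", "Education", "Technology",
]


def category_counts(rows, column="category"):
    """Count rows per category.

    `column` is an assumed column name; check the dataset's actual schema.
    """
    return Counter(row[column] for row in rows)


def load_rows(repo_id="midnightGlow/BanglaSumXL_Categories", split="train"):
    """Hypothetical helper: load the dataset from the Hugging Face Hub.

    Requires `pip install datasets` and network access.
    The split name "train" is an assumption.
    """
    from datasets import load_dataset  # lazy import so the rest runs offline
    return load_dataset(repo_id, split=split)


# Made-up rows that only show the assumed shape of the data.
sample = [
    {"text": "example text 1", "category": "Economy"},
    {"text": "example text 2", "category": "Technology"},
    {"text": "example text 3", "category": "Economy"},
    {"text": "example text 4", "category": "State"},
]

counts = category_counts(sample)
for name in NAMED_CATEGORIES:
    print(f"{name}: {counts.get(name, 0)}")
```

With the real data, `category_counts(load_rows())` would give the same kind of summary, provided the assumed column and split names match the published schema.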
Provided by:
midnightGlow



