Karavet/ILUR-news-text-classification-corpus
Source: Hugging Face | Updated: 2022-10-21 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/Karavet/ILUR-news-text-classification-corpus
Resource description:
---
language:
- hy
task_categories: [news-classification, text-classification]
multilinguality: [monolingual]
task_ids: [news-classification, text-classification]
license:
- apache-2.0
---
## Table of Contents
- [Table of Contents](#table-of-contents)
- [News Texts Dataset](#news-texts-dataset)
## News Texts Dataset
We release a dataset of over 12,000 Armenian-language news articles from [iLur.am](http://www.ilur.am/), categorized into 7 classes: sport, politics, weather, economy, accidents, art, and society. The articles are split into a training set (2242k tokens) and a test set (425k tokens).
For more details, refer to the [paper](https://arxiv.org/ftp/arxiv/papers/1906/1906.03134.pdf).
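
As a rough illustration of the figures above, the following sketch lists the seven classes and computes each split's share of the total token count. The integer ids and the names `CLASSES`, `LABEL2ID`, `ID2LABEL`, `SPLIT_KTOKENS`, and `split_shares` are assumptions made for this example; they are not taken from the dataset files or the paper.

```python
# Class names as listed in the dataset description. The integer ids are an
# illustrative assumption, not the dataset's official label encoding.
CLASSES = ["sport", "politics", "weather", "economy", "accidents", "art", "society"]
LABEL2ID = {name: idx for idx, name in enumerate(CLASSES)}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}

# Token counts reported for each split, in thousands of tokens.
SPLIT_KTOKENS = {"train": 2242, "test": 425}


def split_shares(ktokens):
    """Return each split's fraction of the total token count."""
    total = sum(ktokens.values())
    return {name: count / total for name, count in ktokens.items()}


shares = split_shares(SPLIT_KTOKENS)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of tokens")
```

By token count, the split works out to roughly 84% training data and 16% test data. The shares are computed over tokens, not articles, because the description reports only token counts per split.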
Provided by:
Karavet
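Summary of original information
Dataset overview
Basic information
- Language: Armenian (hy)
- Task categories: news classification, text classification
- Multilinguality: monolingual
- Task IDs: news classification, text classification
- License: Apache-2.0
Dataset details
- Name: News Texts Dataset
- Source: iLur.am
- Size: over 12,000 news articles
- Classes: 7 (sport, politics, weather, economy, accidents, art, society)
- Data split: training set (2242k tokens) and test set (425k tokens)
Additional information
- For more details, refer to the accompanying paper.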



