
Evaluation of the preprocessing and training stages in text classification algorithms in the context of information retrieval

Indexed by NIAID Data Ecosystem on 2026-03-11
Download link:
https://figshare.com/articles/dataset/Evaluation_of_the_preprocessing_and_training_stages_in_text_classification_algorithms_in_the_context_of_information_retrieval/8162216
Resource description:
Abstract: The volume of unstructured data grows as Internet use spreads. Natural-language texts are a relevant and significant source for analysis and knowledge production. This work presents a quantitative analysis of the preprocessing and training stages of a text classifier that uses the sentiment expressed by users as an attribute. The experiments used an Artificial Neural Network as the classification algorithm and texts from the Amazon, IMDB and Yelp sites. The database supports analysis of the positive and negative sentiment that users express in unstructured reviews of products and services. Two distinct preprocessing procedures and different training regimes for the Artificial Neural Networks were applied to classify the text set. The results quantitatively confirm the importance of the classifier's preprocessing and training stages, and in particular of the vocabulary selected for text representation and classification. The available classification techniques achieve satisfactory results. However, even with two distinct preprocessing procedures and the best training process identified, the model's difficulties in learning and understanding could not be fully eliminated for classifications that involve subjective aspects of how people express sentiment.
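The comparison described in the abstract (two preprocessing procedures feeding a neural classifier) can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy reviews, the stop-word list, the function names, and the use of a single sigmoid neuron in place of a full Artificial Neural Network are all assumptions made for illustration. It only shows how the preprocessing choice changes the vocabulary that the classifier sees.

```python
import math
import re

# Hypothetical toy reviews (NOT taken from the dataset); 1 = positive, 0 = negative.
TOY_REVIEWS = [
    ("The phone works great and the battery is excellent", 1),
    ("Great movie, I loved the acting", 1),
    ("Excellent food and the service was great", 1),
    ("The battery is terrible and the phone broke", 0),
    ("Terrible movie, I hated the acting", 0),
    ("The food was awful and the service was terrible", 0),
]

# Assumed stop-word list, for illustration only.
STOP_WORDS = {"the", "and", "is", "was", "i", "a"}


def preprocess_basic(text):
    """Pipeline A: lowercase and split on whitespace (punctuation stays attached)."""
    return text.lower().split()


def preprocess_filtered(text):
    """Pipeline B: lowercase, keep alphabetic tokens only, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Map every distinct token to a column index."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(tokens, vocab):
    """Binary bag-of-words vector over the given vocabulary."""
    vec = [0.0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] = 1.0
    return vec


def train_neuron(X, y, epochs=200, lr=0.5):
    """Train a single sigmoid neuron with stochastic gradient descent on log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - target  # gradient of log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(x, weights, bias):
    """Predict 1 when the neuron's pre-activation is non-negative, else 0."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if z >= 0 else 0


def run_pipeline(preprocess):
    """Return (vocabulary size, training accuracy) for one preprocessing choice."""
    token_lists = [preprocess(text) for text, _ in TOY_REVIEWS]
    labels = [label for _, label in TOY_REVIEWS]
    vocab = build_vocabulary(token_lists)
    X = [vectorize(tokens, vocab) for tokens in token_lists]
    weights, bias = train_neuron(X, labels)
    correct = sum(predict(x, weights, bias) == y for x, y in zip(X, labels))
    return len(vocab), correct / len(labels)


if __name__ == "__main__":
    for name, fn in [("basic", preprocess_basic), ("filtered", preprocess_filtered)]:
        size, acc = run_pipeline(fn)
        print(f"{name}: vocabulary size = {size}, training accuracy = {acc:.2f}")
```

On real data the accuracy would be measured on a held-out split rather than on the training set. The sketch reports training accuracy only to keep the example short.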
Created:
2019-03-01