5210- Geeta.docx

Name: 5210- Geeta.docx
Creator: figshare
Published: 2025-08-06 15:45:46
License: No description available

DataCite Commons | Updated 2025-08-06 | Indexed 2026-02-09

Download link:

https://figshare.com/articles/dataset/5210-_Geeta_docx/29842058

Description:

As the Internet evolves and the volume of information grows rapidly, consumers face data overload and the dilemma of choice. Personalized recommendation systems (RSs) reduce this burden by helping users filter and select data according to their tastes and needs. These systems not only improve user experience and satisfaction but also give businesses and platforms opportunities to increase user engagement, sales, and advertising effectiveness. This work measures and compares the computational efficiency of five methods, namely Term Frequency-Inverse Document Frequency (TF-IDF), K-Nearest Neighbours (KNN), Jaccard Similarity (JS), Autoencoder, and Bernoulli Restricted Boltzmann Machine (RBM), across multiple text processing tasks. It assesses each method by analysing the time needed for each task, offering insight into its suitability for practical text similarity and retrieval applications. The dataset used in the study was simulated to reflect standard text-processing workflows, emphasizing vectorization, similarity computations, and attention mechanisms. Various similarity measures are used to produce recommendations. The findings highlight the trade-offs between the methods in computing time and precision. TF-IDF was faster than Jaccard, especially in vectorization and similarity score extraction, making it the best option for situations requiring rapid processing. The results support the choice of TF-IDF, KNN, and Jaccard as the principal approaches for similarity computation and clarify their respective strengths and limitations. The integration of these methods with attention mechanisms and optimizations is novel, providing fresh insights into improving similarity calculations for both precision and computing efficiency.
These results will facilitate the development of more efficient, scalable, and precise similarity models in diverse natural language processing (NLP) and machine learning (ML) applications.
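
To make the comparison described above concrete, the following is a minimal sketch of how TF-IDF cosine similarity and Jaccard similarity could be computed and timed on a few short texts. It is not the code used in the study. The function names (`tfidf_vectors`, `cosine`, `jaccard`, `timed`), the whitespace tokenizer, the smoothed IDF variant, and the sample texts are all assumptions made for illustration, and timings from such a tiny sample say nothing about the results reported in the description.

```python
import math
import time
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF; one common variant, chosen here as an assumption.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard(text_a, text_b):
    """Jaccard similarity between the token sets of two texts."""
    a, b = set(tokenize(text_a)), set(tokenize(text_b))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical sample texts, not taken from the dataset.
sample_docs = [
    "red running shoes",
    "blue running shoes",
    "wireless noise cancelling headphones",
]

vectors, vectorize_time = timed(tfidf_vectors, sample_docs)
tfidf_score, tfidf_time = timed(cosine, vectors[0], vectors[1])
jaccard_score, jaccard_time = timed(jaccard, sample_docs[0], sample_docs[1])

print(f"TF-IDF vectorization: {vectorize_time:.6f} s")
print(f"TF-IDF cosine: {tfidf_score:.3f} in {tfidf_time:.6f} s")
print(f"Jaccard: {jaccard_score:.3f} in {jaccard_time:.6f} s")
```

For a recommendation-style use, each item text would be scored against a query text and the highest-scoring items returned. A meaningful timing comparison would need a much larger corpus and repeated runs.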

Provider:

figshare

Created:

2025-08-06
