
Supplemental Files for the article "Backtranslation Effects on Static and Contextual Word Embeddings for Topic Classification"

Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-01-06.
Download link:
https://figshare.com/articles/dataset/Supplemental_Files_for_the_article_Backtranslation_Effects_on_Static_and_Contextual_Word_Embeddings_for_Topic_Classification_/27156099/1
Description:
These supplementary files were used in a study of how backtranslation affects topic classification with two types of word embeddings: static word vectors (FastText) and contextual word embeddings (RoBERTa). The study's primary aim was to evaluate whether backtranslation improves classification performance for both embedding types across multiple languages, including Slovak. The dataset includes both original and backtranslated data, which were used to train several classifiers, including Logistic Regression, Support Vector Machine (SVM), Random Forest, and RNN-LSTM.
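
As a rough illustration of how original and backtranslated data can be combined before training a classifier on static word vectors, here is a minimal sketch. It is not the authors' pipeline: the function names, the toy sentences, the two-dimensional toy vectors, and the choice to skip backtranslations identical to their source are all assumptions made for this example, and the actual file layout of the dataset is not reflected here.

```python
from typing import Dict, List, Sequence, Tuple

Example = Tuple[str, str]  # (text, topic label)


def augment_with_backtranslation(
    originals: Sequence[Example],
    backtranslated_texts: Sequence[str],
) -> List[Example]:
    """Return the originals plus backtranslated copies that keep the same label."""
    if len(originals) != len(backtranslated_texts):
        raise ValueError("expected one backtranslated text per original example")
    augmented = list(originals)
    for (text, label), bt_text in zip(originals, backtranslated_texts):
        # Assumption of this sketch: skip backtranslations identical to the
        # source text, since they would only duplicate a training example.
        if bt_text.strip().lower() != text.strip().lower():
            augmented.append((bt_text, label))
    return augmented


def mean_vector(text: str, vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Average static word vectors over whitespace tokens, ignoring unknown words."""
    known = [vectors[tok] for tok in text.lower().split() if tok in vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy data, not taken from the dataset.
originals = [
    ("the team won the match", "sport"),
    ("the bank raised rates", "economy"),
]
backtranslated = ["the team won the game", "the bank raised rates"]
toy_vectors = {
    "team": [1.0, 0.0],
    "match": [0.8, 0.2],
    "game": [0.9, 0.1],
    "bank": [0.0, 1.0],
    "rates": [0.2, 0.8],
}

train = augment_with_backtranslation(originals, backtranslated)
features = [mean_vector(text, toy_vectors, dim=2) for text, _ in train]
labels = [label for _, label in train]
```

The resulting `features` and `labels` lists could then be passed to any of the classifier families named in the description. With a contextual model such as RoBERTa, the averaging step would be replaced by the model's own sentence representation.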

Provider:
figshare
Created:
2024-10-03