
Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7181681
Description:
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022).

The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:

sg = True
min_count = 1
window = 3
vector_size = 200
epochs = 5

The embeddings are divided into periods of ten years each, with the vectors from each decade aligned to the ones from the most recent decade (1910s) using Orthogonal Procrustes.

See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData
Project webpage (Living with Machines): https://livingwithmachines.ac.uk/
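
As an illustration of the alignment step mentioned in the description, here is a minimal NumPy sketch of Orthogonal Procrustes. It is not taken from the project's repository: the function name `procrustes_align`, the toy matrices, and the assumption that both matrices already hold the same words in the same row order are all hypothetical. The sketch finds the orthogonal matrix that best maps one decade's vectors onto the reference decade's vectors in the least-squares sense.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target`.

    Returns (aligned, rotation), where `rotation` is the orthogonal matrix R
    minimising the Frobenius norm of (source @ R - target). Both inputs are
    assumed to have one row per shared word, in the same order.
    """
    # The SVD of the cross-covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt
    return source @ rotation, rotation


# Toy data standing in for real embeddings (hypothetical, not the dataset):
# 300 shared words, 200 dimensions as in the description's vector_size.
rng = np.random.default_rng(0)
target = rng.normal(size=(300, 200))        # stand-in for the 1910s vectors
true_rotation, _ = np.linalg.qr(rng.normal(size=(200, 200)))
source = target @ true_rotation.T           # an earlier decade, arbitrarily rotated

aligned, rotation = procrustes_align(source, target)
```

With real data, the two matrices would first have to be restricted to the vocabulary shared by both decades. Whether the authors also normalise the vectors before alignment is not stated in the description; the linked repository documents the actual procedure.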
Created:
2023-05-24