
Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919)

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7887304
Description:
Word embeddings trained with Word2Vec on a 4.2-billion-word corpus of 19th-century British newspapers, using the following parameters:

sg = True
min_count = 5
window = 5
vector_size = 100
epochs = 5

The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned, and OCR errors were not skimmed from the vocabulary.

See the related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData
Project website (Living with Machines): https://livingwithmachines.ac.uk/
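Because the decade models are not aligned, raw vectors from two different decades do not share a coordinate space, so a direct cosine between them is not meaningful. One common workaround, sketched below, is to compare a word's nearest-neighbour sets within each decade instead. This is an illustrative sketch only, not the project's documented method: the toy 3-dimensional vectors, the decade names, and the helper functions (`cosine`, `nearest_neighbours`, `neighbour_overlap`) are all hypothetical stand-ins for vectors that would be loaded from the real 100-dimensional models.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbours(vectors, word, k):
    """Return the k words closest to `word` within a single decade's space."""
    target = vectors[word]
    scored = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != word]
    # Sort by descending similarity; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]

def neighbour_overlap(vectors_a, vectors_b, word, k=2):
    """Jaccard overlap of a word's k nearest neighbours in two decades.

    Only the vocabulary shared by both decades is considered, so the score
    reflects changes in usage rather than changes in vocabulary coverage.
    """
    shared = set(vectors_a) & set(vectors_b)
    a = {w: vectors_a[w] for w in shared}
    b = {w: vectors_b[w] for w in shared}
    na = set(nearest_neighbours(a, word, k))
    nb = set(nearest_neighbours(b, word, k))
    return len(na & nb) / len(na | nb)

# Hypothetical toy vectors (NOT taken from the dataset).
decade_early = {
    "train":   [1.0, 0.0, 0.0],
    "dress":   [0.9, 0.1, 0.0],
    "gown":    [0.8, 0.2, 0.0],
    "railway": [0.0, 1.0, 0.0],
    "engine":  [0.0, 0.9, 0.1],
}
decade_late = {
    "train":   [0.0, 0.0, 1.0],
    "dress":   [1.0, 0.0, 0.0],
    "gown":    [0.9, 0.1, 0.0],
    "railway": [0.1, 0.0, 0.9],
    "engine":  [0.0, 0.1, 0.9],
}

train_overlap = neighbour_overlap(decade_early, decade_late, "train")
dress_overlap = neighbour_overlap(decade_early, decade_late, "dress")
print(train_overlap, dress_overlap)
```

A low overlap suggests that a word's neighbourhood has shifted between the two periods; a high overlap suggests stable usage. With unskimmed vocabularies, OCR variants may appear among the neighbours, so restricting the comparison to a cleaned word list may be worthwhile.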
Created:
2023-05-24