DBpedia RDF2Vec Graph Embeddings
Indexed by NIAID Data Ecosystem on 2026-03-13
Download link:
https://zenodo.org/record/6376305
Description:
DBpedia graph embeddings using RDF2Vec. RDF2Vec embedding generation code can be found here and is based on a publication by Portisch et al. [1].
The embeddings dataset consists of 200-dimensional vectors of DBpedia entities (from 1/9/2021).
Generating Embeddings
The code for generating these embeddings can be found here.
Run the run.sh script, which wraps all the commands necessary to generate the embeddings:
bash run.sh
The script downloads a set of DBpedia files, which are listed in dbpedia_files.txt. It then builds a Docker image and runs a container of that image that generates the embeddings for the DBpedia graph defined by the DBpedia files.
A folder named files is created containing all the downloaded DBpedia files, and a folder embeddings/dbpedia is created containing the embeddings in vectors.txt along with a set of random walk files.
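Once the run finishes, the vectors can be read back for downstream use. The following is a minimal sketch, assuming vectors.txt stores one entity per line as the entity URI followed by space-separated float components. This layout is an assumption and is not confirmed by the description above, so inspect a few lines of your own file first. The helper names load_vectors and cosine are hypothetical, and the sample values are made up.

```python
import io
import math


def load_vectors(lines):
    """Parse lines of the assumed form '<entity> v1 v2 ... vn' into a dict."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Tiny in-memory stand-in for embeddings/dbpedia/vectors.txt (3 dimensions
# instead of 200, made-up values) so the sketch runs without the real file.
sample = io.StringIO(
    "http://dbpedia.org/resource/Berlin 1.0 0.0 0.0\n"
    "http://dbpedia.org/resource/Paris 1.0 1.0 0.0\n"
)
vectors = load_vectors(sample)
similarity = cosine(
    vectors["http://dbpedia.org/resource/Berlin"],
    vectors["http://dbpedia.org/resource/Paris"],
)
```

For the real file, pass an open file handle instead of the sample, for example open("embeddings/dbpedia/vectors.txt", encoding="utf-8"). Holding every DBpedia entity in plain Python lists may need a large amount of memory.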
Run Time of Embeddings Generation
Generating the embeddings can take more than a day, depending on how many DBpedia files are downloaded. The following run time statistics were measured on a machine with 64 GB RAM, 8 cores (AMD EPYC, 1996.221 MHz), and a 1 TB SSD.
Total: 1 day, 8 hours, 52 minutes, 41 seconds
Walk generation: 0 days, 7 hours, 24 minutes, 36 seconds
Training: 1 day, 1 hour, 28 minutes, 5 seconds
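As a quick consistency check on these figures, the two phases can be added up with Python's standard datetime module. This sketch assumes the walk generation phase took 7 hours, 24 minutes, 36 seconds; under that reading, walk generation plus training equals the reported total exactly.

```python
from datetime import timedelta

# Assumed reading of the walk generation time: 7 h 24 min 36 s.
walk_generation = timedelta(hours=7, minutes=24, seconds=36)
training = timedelta(days=1, hours=1, minutes=28, seconds=5)
reported_total = timedelta(days=1, hours=8, minutes=52, seconds=41)

# Sum of the two phases, and the share of total time spent generating walks.
combined = walk_generation + training
walk_share = walk_generation / reported_total
```

With these values, combined equals reported_total, and walk generation accounts for a bit under a quarter of the run, so training dominates the overall time.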
Parameters Used
The following parameters were used to generate the embeddings provided here:
Number of walks per entity: 100
Depth (hops) per walk: 4
Walk generation mode: RANDOM_WALKS_DUPLICATE_FREE
Threads: number of processors / 2
Training mode: sg
Embeddings vector dimension: 200
Minimum word2vec word count: 1
Sample rate: 0.0
Training window size: 5
Training epochs: 5
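To make the walk parameters concrete, here is a toy sketch of duplicate-free random walk generation: up to 100 distinct walks of at most 4 hops per start entity. It is an illustration under assumptions, not the generator used to produce these embeddings. The toy graph, the function name duplicate_free_walks, and the retry limit are hypothetical, and the actual tool may serialize predicates or handle dead ends differently.

```python
import random


def duplicate_free_walks(graph, start, walks_per_entity=100, depth=4, seed=0):
    """Return up to `walks_per_entity` distinct random walks from `start`.

    `graph` maps a node to a list of (predicate, object) edges. Each walk
    follows at most `depth` edges and is stored as a tuple of alternating
    nodes and predicates, so duplicates can be dropped with a set.
    """
    rng = random.Random(seed)
    walks = set()
    attempts = 0
    max_attempts = walks_per_entity * 10  # hypothetical retry limit
    while len(walks) < walks_per_entity and attempts < max_attempts:
        attempts += 1
        walk, node = [start], start
        for _ in range(depth):
            edges = graph.get(node, [])
            if not edges:
                break  # dead end: keep the shorter walk
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.add(tuple(walk))
    return sorted(walks)


# Made-up toy graph; a real run would walk the full DBpedia graph.
toy_graph = {
    "dbr:Berlin": [("dbo:country", "dbr:Germany"), ("rdf:type", "dbo:City")],
    "dbr:Germany": [("dbo:capital", "dbr:Berlin")],
}
walks = duplicate_free_walks(toy_graph, "dbr:Berlin")
```

In RDF2Vec, each walk is then treated as a sentence for word2vec training. The toy graph admits far fewer than 100 distinct walks, so the function returns only the ones that exist.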
Created:
2022-03-25



