
DBpedia RDF2Vec Graph Embeddings

Indexed from NIAID Data Ecosystem on 2026-03-13
Download link:
https://zenodo.org/record/6376305
Description:
DBpedia graph embeddings generated with RDF2Vec. The RDF2Vec embedding generation code can be found here and is based on a publication by Portisch et al. [1]. The embeddings dataset consists of 200-dimensional vectors of DBpedia entities (from 1/9/2021).

Generating Embeddings

The code for generating these embeddings can be found here. Run the run.sh script, which wraps all the commands needed to generate the embeddings:

    bash run.sh

The script downloads a set of DBpedia files, which are listed in dbpedia_files.txt. It then builds a Docker image and runs a container from that image, which generates the embeddings for the DBpedia graph defined by those files. A folder named files is created containing all the downloaded DBpedia files, and a folder named embeddings/dbpedia is created containing the embeddings in vectors.txt along with a set of random walk files.

Run Time of Embeddings Generation

Generating the embeddings can take more than a day, depending on the number of DBpedia files chosen for download. The following run times were measured on a machine with 64 GB RAM, 8 cores (AMD EPYC, 1996.221 MHz), and a 1 TB SSD.

Total: 1 day, 8 hours, 52 minutes, 41 seconds
Walk generation: 0 days, 7 hours, 24 minutes, 36 seconds
Training: 1 day, 1 hour, 28 minutes, 5 seconds

Parameters Used

The following parameters were used to generate the embeddings provided here:

Number of walks per entity: 100
Depth (hops) per walk: 4
Walk generation mode: RANDOM_WALKS_DUPLICATE_FREE
Threads: number of processors / 2
Training mode: sg
Embeddings vector dimension: 200
Minimum word2vec word count: 1
Sample rate: 0.0
Training window size: 5
Training epochs: 5
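Once vectors.txt is available, the embeddings can be read and compared with a few lines of Python. The following is a minimal sketch, not the authors' code. It assumes that each line of vectors.txt holds an entity URI followed by space-separated floats; the description does not state the file layout, so check it against the actual file. The function names load_vectors and cosine_similarity and the sample entities are hypothetical, chosen only for illustration.

```python
import io
import math


def load_vectors(lines, expected_dim=None):
    """Parse lines of the ASSUMED form '<entity> <v1> <v2> ... <vN>'."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entity = parts[0]
        values = [float(x) for x in parts[1:]]
        if expected_dim is not None and len(values) != expected_dim:
            continue  # skip lines whose length does not match the expected dimension
        vectors[entity] = values
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional sample, for illustration only (the real vectors have 200 dimensions).
SAMPLE = io.StringIO(
    "http://dbpedia.org/resource/A 1.0 0.0 0.0\n"
    "http://dbpedia.org/resource/B 0.0 1.0 0.0\n"
    "http://dbpedia.org/resource/C 2.0 0.0 0.0\n"
)
sample_vectors = load_vectors(SAMPLE, expected_dim=3)
```

For the real file, pass an open file handle instead of the sample, for example open("embeddings/dbpedia/vectors.txt", encoding="utf-8"), with expected_dim=200. Holding every DBpedia entity in a Python dictionary may need a lot of memory, so filtering for the entities of interest while reading is likely the more practical choice.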
Created:
2022-03-25