
CLARA Knowledge Graph of licensed educational resources

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/8403141
Resource description:
CLARA

This deposit is part of the CLARA project, which aims to help teachers create new educational resources and, in particular, handle the licenses of the educational resources they reuse. The deposit contains the RDF files created using an RDF mapping (RML) and a mapper (Morph-KGC), together with the JSON files used as input. The corresponding pipeline can be found on GitLab. The data used in that pipeline originate from X5GON, a European project that aims to generate and gather open educational resources.

Knowledge graph content

The Knowledge Graph contains information about 45K Educational Resources (ERs) and 135K subjects (extracted from DBpedia). For each ER, that information comprises the author, the title and description, the license, a URL to the resource itself, the language of the ER, its mimetype, and finally which subjects it covers and to what extent. That extent is given by two scores: a PageRank score and a cosine score.

A particularity of the knowledge graph is its heavy use of RDF reification across large multi-valued properties. Thus four versions of the knowledge graph exist, using standard reification, singleton properties, named graphs, and RDF-star.

The Knowledge Graph also contains categories originating from DBpedia. They help refine the subjects, which are also extracted from DBpedia.

The KG.zip files contain five types of files:

Authors_[X].nt - These contain the authors' nodes, their type, and name.
ER_[X].nt/nq/ttl - These contain the ERs and their information, using the respective RDF reification model.
categories_skos_[X].ttl - These contain the hierarchy of DBpedia categories.
categories_labels.ttl - This file contains additional information about the categories.
categories_article.ttl - This file contains the RDF triples that link the DBpedia subjects to the DBpedia categories.

JSON content

The original dataset was split into multiple JSON files to make it easier to process. DBpedia categories were extracted as RDF and are not present in the JSON files.

There are two types of files in the input-json.zip file:

authors_[X].json - These list the authors' names.
ER_[X].json - These list the ERs and their related information.

For each ER, that information contains:

- its title
- its description
- its language (and language_detected; only the former is used in the pipeline here)
- its license
- its mimetype
- its authors
- the date of creation of the resource
- a URL linking to the resource itself
- the subjects (named concepts) associated with the resource, with the corresponding scores

If you use this dataset, you can cite this repository:

Kieffer, M., Fakih, G., & Serrano Alvarado, P. (2023). CLARA Knowledge Graph of licensed educational resources [Data set]. Semantics, Leipzig, Germany. Zenodo. https://doi.org/10.5281/zenodo.8403142

Or the corresponding paper:

Kieffer, M., Fakih, G. & Serrano-Alvarado, P. (2023). Evaluating Reification with Multi-valued Properties in a Knowledge Graph of Licensed Educational Resources. Semantics, Leipzig, Germany.
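The four reification models named in the knowledge graph description can be hard to tell apart in the abstract. The following is a minimal sketch, not the project's actual mapping, of how a single "ER covers subject" statement annotated with a PageRank score and a cosine score might be serialized under each model. The `http://example.org/` namespace, the property names `hasSubject`, `pageRank`, and `cosine`, the identifiers, and the score values are all hypothetical; the real vocabulary is defined by the RML mapping in the CLARA pipeline. The singleton property variant uses `rdf:singletonPropertyOf`, which comes from the singleton property proposal and is not part of the standard RDF vocabulary.

```python
# All IRIs under EX are hypothetical placeholders, not the actual CLARA vocabulary.
EX = "http://example.org/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def iri(value):
    """Wrap a string in angle brackets, as N-Triples requires for IRIs."""
    return f"<{value}>"


def score_triples(node, pagerank, cosine):
    """Attach both scores to `node` (a statement, property, graph, or quoted triple)."""
    return [
        f'{node} {iri(EX + "pageRank")} "{pagerank}"^^{iri(XSD_DOUBLE)} .',
        f'{node} {iri(EX + "cosine")} "{cosine}"^^{iri(XSD_DOUBLE)} .',
    ]


def standard_reification(s, p, o, pagerank, cosine):
    # A dedicated rdf:Statement node describes the triple and carries the scores.
    stmt = iri(EX + "statement/1")
    return [
        f'{stmt} {iri(RDF + "type")} {iri(RDF + "Statement")} .',
        f'{stmt} {iri(RDF + "subject")} {s} .',
        f'{stmt} {iri(RDF + "predicate")} {p} .',
        f'{stmt} {iri(RDF + "object")} {o} .',
    ] + score_triples(stmt, pagerank, cosine)


def singleton_property(s, p, o, pagerank, cosine):
    # A property unique to this one statement carries the scores.
    sp = iri(EX + "hasSubject#1")
    return [
        f"{s} {sp} {o} .",
        f'{sp} {iri(RDF + "singletonPropertyOf")} {p} .',
    ] + score_triples(sp, pagerank, cosine)


def named_graphs(s, p, o, pagerank, cosine):
    # The triple sits alone in its own graph (N-Quads); the graph IRI carries the scores.
    g = iri(EX + "graph/1")
    return [f"{s} {p} {o} {g} ."] + score_triples(g, pagerank, cosine)


def rdf_star(s, p, o, pagerank, cosine):
    # The quoted triple is itself the subject of the score triples.
    # Quoting does not assert the triple, so it is also stated on its own.
    return [f"{s} {p} {o} ."] + score_triples(f"<< {s} {p} {o} >>", pagerank, cosine)


# Hypothetical example statement and scores.
s = iri(EX + "er/1")
p = iri(EX + "hasSubject")
o = iri("http://dbpedia.org/resource/Machine_learning")

models = {
    "standard reification": standard_reification(s, p, o, 0.42, 0.87),
    "singleton property": singleton_property(s, p, o, 0.42, 0.87),
    "named graphs": named_graphs(s, p, o, 0.42, 0.87),
    "RDF-star": rdf_star(s, p, o, 0.42, 0.87),
}

for name, lines in models.items():
    print(f"# {name}: {len(lines)} statements")
    print("\n".join(lines))
```

In this sketch the models differ in how many statements one annotated link costs, and that cost is paid once per value of a multi-valued property such as the subjects of an ER.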
Created:
2023-10-20