Dataset for Analysing Model-Driven Engineering Research
Download link:
https://zenodo.org/record/4745146
Description:
Model-driven engineering (MDE), like any other recent discipline, is continuously evolving as new topics, techniques and application areas emerge. The research interests of the community also evolve in response to the challenges posed by industry as it begins to embrace MDE, which helps to drive the discipline forward. With this dataset, we analysed the evolution of the MDE discipline over the past years, as evidenced by its associated scholarly data, to gain insights into the research landscape of the domain, the topics being researched, its main contributors and application areas, and the main trends in the field. We use Natural Language Processing and Machine Learning techniques to extract information from the MDE research literature and conduct a data-driven analysis of the main features of the domain in general, and of the MoDELS conference in particular.
List of Files
mdeo.ttl : Model-Driven Engineering Ontology
dataset.json : Model-Driven Engineering Dataset
MBSE@analysis_on_models.xlsx : Spreadsheet describing the MoDELS conference
MBSE@analysis_all_dataset.xslx : Spreadsheet describing the Model-Driven Engineering domain
README.md : File you are currently reading
Model-Driven Engineering Ontology (mdeo.ttl)
The Model-Driven Engineering Ontology (MDEO) is a taxonomy of research areas focusing on the Model-Driven Engineering field. It includes 91 concepts arranged in a three-layer mono-hierarchic structure with 573 relationships.
At the top level, there are nine broad concepts including "Model Foundations", "Model Quality", "Modeling Languages".
MDEO has been formalized as an OWL ontology following Semantic Web standards. Its data model builds on SKOS (Simple Knowledge Organization System) and includes five semantic relationships:
superTopicOf, which indicates that a topic is a super-area of another one (e.g., "Modeling Languages" is a super-area of "Language definition").
rdfs:label, which provides a human-readable version of a resource’s name.
skos:altLabel, which indicates additional labels (e.g., "abstract syntax" and "model abstract syntax" for "metamodel").
skos:prefLabel, which states the main label of a concept when it is associated with multiple labels.
rdf:type, which states that a resource is an instance of a topic class.
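As a minimal sketch of how the superTopicOf relationship can be traversed, the Python snippet below stores a toy fragment of the hierarchy as a child-to-parent map and walks it upward. Only the "Modeling Languages" / "Language definition" pair comes from the description above. The "Example subtopic" entry, the `PARENT` dictionary and the `super_areas` helper are hypothetical, and the snippet does not reflect how mdeo.ttl is actually serialised.

```python
# Toy fragment of a mono-hierarchic taxonomy: each topic has at most one parent.
# Only the first pair is taken from the README; the second is a made-up placeholder.
PARENT = {
    "Language definition": "Modeling Languages",  # example given in the README
    "Example subtopic": "Language definition",    # hypothetical third-layer topic
}


def super_areas(topic, parent=PARENT):
    """Return the chain of super-areas of `topic`, nearest first."""
    chain = []
    seen = {topic}
    while topic in parent:
        topic = parent[topic]
        if topic in seen:  # guard against accidental cycles in the map
            break
        seen.add(topic)
        chain.append(topic)
    return chain


print(super_areas("Example subtopic"))
```

Because the structure is mono-hierarchic, a single parent per topic is enough to recover every super-area of a concept up to its top-level area.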
Model-Driven Engineering Dataset (dataset.json)
This file contains all the papers we gathered to perform this analysis. First, we extracted a dump of the Microsoft Academic Graph (MAG) via Microsoft Azure Storage and processed it using our local Big Data infrastructure.
Next, we selected all papers from journals and conferences that focus entirely on the field:
Journal of Software and Systems Modeling (SoSyM),
Model Driven Engineering Languages and Systems (MoDELS),
European Conference on Modelling Foundations and Applications (ECMFA),
International Conference on Model Transformation (ICMT),
Software Language Engineering (SLE),
Model-Driven Engineering and Software Development (MODELSWARD).
Then, we also included papers whose title or abstract contains any of the following strings: "domain specific model", "domain specific modeling", "domain specific modelling", "metamodel", "metamodelling", "metamodeling", "model analysis", "model debugging", "model difference", "model differencing", "model evolution", "model execution", "model maintenance", "model merge", "model quality", "model migration", "model synchronization", "model transformation", "model transformations", "model versioning", "model views", "model viewpoint", "model viewpoints", "model weaving", "model testing", "multiview model", "multiview modeling", "multiview modelling", "OCL", "software model", "system model", "SysML", "UML", "view consistency", "viewpoint consistency", "view integration", "viewpoint integration". As a result, we gathered 43,700 papers.
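The two selection steps can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the pipeline that was actually run: the README does not say how the strings were matched, so case-insensitive whole-word matching is assumed here, only a handful of the strings are included, and the names `CORE_VENUES`, `KEYWORDS` and `is_mde_paper` are hypothetical.

```python
import re

# Acronyms of the venues listed above as focusing entirely on MDE.
CORE_VENUES = {"SoSyM", "MoDELS", "ECMFA", "ICMT", "SLE", "MODELSWARD"}

# A small subset of the strings listed above, for illustration only.
KEYWORDS = ["metamodel", "model transformation", "OCL", "SysML", "UML"]

# Assumption: case-insensitive, whole-word matching (the README does not specify this).
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def is_mde_paper(venue, title, abstract):
    """Keep a paper if it comes from a core venue or mentions a keyword."""
    if venue in CORE_VENUES:
        return True
    return bool(KEYWORD_RE.search(f"{title} {abstract}"))


print(is_mde_paper("MoDELS", "any title", ""))
print(is_mde_paper("ICSE", "A UML profile for embedded systems", ""))
print(is_mde_paper("ICSE", "We proclaim a new result", ""))
```

Whole-word matching is chosen here so that short strings such as "OCL" do not fire inside unrelated words like "proclaim". A plain substring match would behave differently.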
The dataset is a JSON-formatted file containing a dictionary whose keys are paper identifiers. The values are themselves dictionaries that describe each paper through several features. Here is an example of a paper in the dataset:
{
  "2116587399": {
    "citationcount": 32,
    "confname": "models 2012",
    "references": [197998272, 2013363798, 1523334793, 1577544661, 2109445551, 2120437191, 2087918852, 1533999404, 2054150958, 2122246939, 2399834472, 2013840728, 2026586559, 2155708393, 2026049208, 2974365732],
    "year": "2012-01-01",
    "topics": ["theoretical computer science", "modeling language", "semantics", "domain specific modeling", "domain model", "programming language", "computer science", "unified modeling language", "domain knowledge", "metamodeling", "abstract syntax"],
    "papertitle": "creating visual domain specific modeling languages from end user demonstration",
    "confseries": "MoDELS",
    "language": ["", "en"],
    "abstract": "Domain-Specific Modeling Languages (DSMLs) have received recent interest due to their conciseness and rich expressiveness for modeling a specific domain. However, DSML adoption has several challenges because development of a new DSML requires both domain knowledge and language development expertise (e.g., defining abstract/concrete syntax and specifying semantics). Abstract syntax is generally defined in the form of a metamodel, with semantics associated to the metamodel. Thus, designing a metamodel is a core DSML development activity. Furthermore, DSMLs are often developed incrementally by iterating across complex language development tasks. An iterative and incremental approach is often preferred because the approach encourages end-user involvement to assist with verifying the DSML correctness and feedback on new requirements. However, if there is no tool support, iterative and incremental DSML development can be mundane and error-prone work. To resolve issues related to DSML development, we introduce a new approach to create DSMLs from a set of domain model examples provided by an end-user. The approach focuses on (1) the identification of concrete syntax, (2) inducing abstract syntax in the form of a metamodel, and (3) inferring static semantics from a set of domain model examples. In order to generate a DSML from user-supplied examples, our approach uses graph theory and metamodel design patterns.",
    "conferenceseriesid": 1191550517,
    "confplace": "Innsbruck/AUSTRIA",
    "urls": ["http://yadda.icm.edu.pl/yadda/element/bwmeta1.element.ieee-000006226010", "http://gray.cs.ua.edu/pubs/mise-2012.pdf", "http://ieeexplore.ieee.org/document/6226010/", "https://ieeexplore.ieee.org/document/6226010/"],
    "confseriesname": "Model Driven Engineering Languages and Systems",
    "id": 2116587399,
    "doi": "10.1109/MISE.2012.6226010",
    "authors": [
      {
        "country": "United States",
        "affiliation": "University of Alabama, Tuscaloosa",
        "name": "eugene syriani",
        "id": 578966534,
        "gridid": "grid.411015.0",
        "affiliationid": 17301866,
        "order": 3
      },
      {
        "country": "United States",
        "affiliation": "University of Alabama, Tuscaloosa",
        "name": "jeff gray",
        "id": 2155833130,
        "gridid": "grid.411015.0",
        "affiliationid": 17301866,
        "order": 2
      },
      {
        "country": "United States",
        "affiliation": "University of Alabama, Tuscaloosa",
        "name": "hyun cho",
        "id": 2505758318,
        "gridid": "grid.411015.0",
        "affiliationid": 17301866,
        "order": 1
      }
    ],
    "mbse_syntactic_topics": ["domain-specific modeling language", "concrete syntax", "metamodel", "modeling language"],
    "mbse_annotated": true,
    "mbse_semantic_topics": ["modeling language", "metamodel", "concrete syntax"],
    "mbse_enhanced_topics": ["concrete syntax", "metamodel", "domain-specific modeling language", "modeling language", "model representation", "syntax", "language definition", "model foundations"]
  }
}
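A record of this shape can be read with Python's standard `json` module. The sketch below is illustrative only: it parses a trimmed copy of the example above instead of the real file, and the helpers `authors_in_order` and `papers_per_year` are hypothetical names. It also assumes that every record carries the "authors", "order" and "year" fields in the form shown, which has not been checked against the full dataset.

```python
import json
from collections import Counter

# Trimmed copy of the example record; most fields are omitted for brevity.
SAMPLE = """
{
  "2116587399": {
    "year": "2012-01-01",
    "confseries": "MoDELS",
    "citationcount": 32,
    "authors": [
      {"name": "eugene syriani", "order": 3},
      {"name": "jeff gray", "order": 2},
      {"name": "hyun cho", "order": 1}
    ]
  }
}
"""


def authors_in_order(paper):
    """Author names sorted by the "order" field, which is not the list order."""
    return [a["name"] for a in sorted(paper["authors"], key=lambda a: a["order"])]


def papers_per_year(dataset):
    """Count papers per publication year; "year" is a string such as "2012-01-01"."""
    return Counter(int(p["year"][:4]) for p in dataset.values())


# For the real file, json.load() on an open handle to dataset.json would replace this.
dataset = json.loads(SAMPLE)
paper = dataset["2116587399"]
print(authors_in_order(paper))
print(papers_per_year(dataset))
```

Note that the top-level keys are strings, while the "id" field inside each record is a number, and that the authors list is not stored in byline order.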
Spreadsheet describing the MoDELS conference (MBSE@analysis_on_models.xlsx)
This spreadsheet describes the MoDELS conference over time. It consists of 25 tabs, grouped into five main categories: i) metrics, ii) publications, iii) NORM-publications, iv) citations, and v) NORM-citations. The publications and citations tabs report absolute values, whereas the NORM-publications and NORM-citations tabs report normalised values. Each category consists of 5 tabs, one per class of entities: i) organizations, ii) topics, iii) authors, iv) countries, and v) conferences.
Here is the full list of tabs with their description:
metrics-of-organizations: General metrics of organizations (universities and companies)
metrics-of-topics: General metrics of topics
metrics-of-authors: General metrics of authors
metrics-of-countries: General metrics of countries
metrics-of-conferences: General metrics of conferences
publications-of-organizations: Papers published by organizations over time
publications-of-topics: Papers published in topics over time
publications-of-authors: Papers published by authors over time
publications-of-countries: Papers published by countries over time
publications-of-conferences: Papers published in conferences over time
NORM-publications-organizations: Normalised number of papers published by organizations over time
NORM-publications-topics: Normalised number of papers published in topics over time
NORM-publications-authors: Normalised number of papers published by authors over time
NORM-publications-countries: Normalised number of papers published by countries over time
NORM-publications-conferences: Normalised number of papers published in conferences over time
citations-of-organizations: Citations that organizations received over time
citations-of-topics: Citations that topics received over time
citations-of-authors: Citations that authors received over time
citations-of-countries: Citations that countries received over time
citations-of-conferences: Citations that conferences received over time
NORM-citations-organizations: Normalised citations that organizations received over time
NORM-citations-topics: Normalised citations that topics received over time
NORM-citations-authors: Normalised citations that authors received over time
NORM-citations-countries: Normalised citations that countries received over time
NORM-citations-conferences: Normalised citations that conferences received over time
Spreadsheet describing the Model-Driven Engineering domain (MBSE@analysis_all_dataset.xslx)
This spreadsheet describes the Model-Driven Engineering field over time, based on more than 40K papers. It consists of 30 tabs, grouped into five main categories: i) metrics, ii) publications, iii) NORM-publications, iv) citations, and v) NORM-citations. The publications and citations tabs report absolute values, whereas the NORM-publications and NORM-citations tabs report normalised values. Each category consists of 6 tabs, one per class of entities: i) organizations, ii) topics, iii) authors, iv) countries, v) conferences, and vi) journals.
Here is the full list of tabs with their description:
metrics-of-organizations: General metrics of organizations (universities and companies)
metrics-of-topics: General metrics of topics
metrics-of-authors: General metrics of authors
metrics-of-countries: General metrics of countries
metrics-of-conferences: General metrics of conferences
metrics-of-journals: General metrics of journals
publications-of-organizations: Papers published by organizations over time
publications-of-topics: Papers published in topics over time
publications-of-authors: Papers published by authors over time
publications-of-countries: Papers published by countries over time
publications-of-conferences: Papers published in conferences over time
publications-of-journals: Papers published in journals over time
NORM-publications-organizations: Normalised number of papers published by organizations over time
NORM-publications-topics: Normalised number of papers published in topics over time
NORM-publications-authors: Normalised number of papers published by authors over time
NORM-publications-countries: Normalised number of papers published by countries over time
NORM-publications-conferences: Normalised number of papers published in conferences over time
NORM-publications-journals: Normalised number of papers published in journals over time
citations-of-organizations: Citations that organizations received over time
citations-of-topics: Citations that topics received over time
citations-of-authors: Citations that authors received over time
citations-of-countries: Citations that countries received over time
citations-of-conferences: Citations that conferences received over time
citations-of-journals: Citations that journals received over time
NORM-citations-organizations: Normalised citations that organizations received over time
NORM-citations-topics: Normalised citations that topics received over time
NORM-citations-authors: Normalised citations that authors received over time
NORM-citations-countries: Normalised citations that countries received over time
NORM-citations-conferences: Normalised citations that conferences received over time
NORM-citations-journals: Normalised citations that journals received over time
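The tab names of both spreadsheets follow a regular category-by-entity pattern, which the short Python sketch below reproduces. It can serve as a checklist when opening the files with a spreadsheet library. The names `CATEGORY_PATTERNS`, `MODELS_ENTITIES`, `DOMAIN_ENTITIES` and `tab_names` are hypothetical, and the patterns are inferred from the two lists above, not read from the files themselves.

```python
# Name patterns inferred from the tab lists: the two NORM categories drop "-of-".
CATEGORY_PATTERNS = [
    "metrics-of-{e}",
    "publications-of-{e}",
    "NORM-publications-{e}",
    "citations-of-{e}",
    "NORM-citations-{e}",
]

# Entity classes of the MoDELS spreadsheet; the domain spreadsheet adds journals.
MODELS_ENTITIES = ["organizations", "topics", "authors", "countries", "conferences"]
DOMAIN_ENTITIES = MODELS_ENTITIES + ["journals"]


def tab_names(entities):
    """Build one tab name per (category, entity) pair, category by category."""
    return [p.format(e=e) for p in CATEGORY_PATTERNS for e in entities]


models_tabs = tab_names(MODELS_ENTITIES)
domain_tabs = tab_names(DOMAIN_ENTITIES)
print(len(models_tabs), len(domain_tabs))
```

The counts match the descriptions above: 5 categories by 5 entity classes for the MoDELS spreadsheet and 5 by 6 for the domain spreadsheet.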
Created:
2021-05-11



