Beyond Code: Is there a difference between comments in visual and textual languages?
Source: DataCite Commons | Updated: 2023-11-24 | Indexed: 2024-08-18
Download link:
https://figshare.com/articles/dataset/Beyond_Code_Is_there_a_difference_between_comments_in_visual_and_textual_languages_/24631350/1
Description:
<b>Context</b>: Code comments are crucial to understanding, maintaining, and extending source code. To better understand the information embedded in comments and how to leverage it for various tasks, previous work has proposed several taxonomies of information types. However, these taxonomies are limited to comments in textual languages, especially object-oriented ones. Since Simulink, a visual modeling language, allows visual information to be embedded and offers a variety of ways to comment a model, previous taxonomies are not directly transferable to it. In this work, we extend a multi-language comment taxonomy to new types of comments and to two new languages: Simulink and MATLAB. Furthermore, we give a qualitative and quantitative overview of Simulink commenting practices and compare them with commenting practices in textual languages.
<b>Methods</b>: In this study, we analyze a set of 59,267 comments from 2,833 Simulink open-source projects consisting of 9,095 Simulink models and 17,792 MATLAB source code files. We identify the types of comments used in Simulink and MATLAB, measure how often each type is used, classify the information they contain, and analyze their correlations with model metrics. We also analyze the commenting guidelines of Simulink and MATLAB and explore whether developers follow these guidelines in their comments.<br>We manually analyze 757 comments from this set to extend the multi-language comment taxonomy of textual languages and to compare commenting practices in textual and visual languages.
<b>Results</b>: We create a taxonomy named SCoT (Simulink Comment Taxonomy) for classifying comments, which contains 25 categories.<br>We find that Simulink comments show a high degree of duplication and are used at all levels of the subsystem hierarchy of Simulink models, especially at the root level. Of all types of comments, Annotations are used most often, while Notes are hardly used at all. Our results indicate that Simulink developers prefer adding new comments to extending existing ones. They also rarely follow the standard guidelines (although only a few guidelines exist).<br>Overall, we find that the diversity and distribution of information in Simulink comments are comparable to those in textual languages.
<b>Impact</b>: Our study highlights how little comment information varies across programming languages. Such findings can help researchers and developers build documentation tools based on commenting practices for polyglot environments.
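The quantitative part of the analysis described above (how often each comment type is used, and how much duplication occurs) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the function name `summarize_comments`, the `(comment_type, text)` input format, the whitespace and case normalization used to detect duplicates, the type label "Description", and the sample comments are all assumptions made for illustration. Only the type names "Annotation" and "Note" come from the abstract.

```python
from collections import Counter


def summarize_comments(comments):
    """Summarize a list of (comment_type, text) pairs (hypothetical format).

    Returns a Counter of comment types and the share of comments whose
    normalized text repeats an earlier comment.
    """
    type_counts = Counter(ctype for ctype, _ in comments)

    # Assumption: two comments are duplicates if they match after
    # collapsing whitespace and lowercasing.
    normalized = [" ".join(text.split()).lower() for _, text in comments]
    text_counts = Counter(normalized)

    # Every occurrence beyond the first counts as a duplicate.
    duplicates = sum(n - 1 for n in text_counts.values() if n > 1)
    duplication_rate = duplicates / len(comments) if comments else 0.0
    return type_counts, duplication_rate


# Made-up sample data, not taken from the dataset.
sample = [
    ("Annotation", "Copyright 2020 Example Corp."),
    ("Annotation", "copyright 2020  Example Corp."),
    ("Annotation", "Saturate the throttle input"),
    ("Description", "PID controller for the motor loop"),
    ("Note", "Check the solver step size"),
]

type_counts, duplication_rate = summarize_comments(sample)
print(type_counts.most_common())
print(f"duplication rate: {duplication_rate:.2f}")
```

A stricter or looser duplicate criterion (exact match, or fuzzy matching) would change the reported rate, so the normalization step is the main design choice to revisit when applying such a sketch to real comments.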
Provider:
figshare
Created:
2023-11-24



