Understandability in Java Decompilation

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/11474285
Description:
Data and tools of the paper "Demystifying and Assessing Code Understandability in Java Decompilation".

Data

Our data in the directory `data/` comprises two directories, `data/original` and `data/testset`, which hold the original data set and the test set. Each directory contains three parts:

1. Experimental data in the directory `code`: the source code and the corresponding code decompiled by CFR, Fernflower and Jadx.
2. Calculation results in the directory `results`.
3. The annotated dataset `relative_understandability_.csv`, which records the relative understandability of each file produced by a decompiler compared with the original file: -1 means the decompiled file is less understandable than the original, 0 means the two are equally understandable, and 1 means the decompiled file is more understandable.

Tools

Our tools in the directory `tool/` assess the understandability of decompiled code using perplexity, Cognitive Complexity and Cognitive Complexity for Decompilation.

Environment

- System: Ubuntu 20.04
- Python: python 3.10, with `pip install kenlm` and `pip install javalang`
- Java: JDK >= 11

Perplexity Calculator

`perplexity_calculator.py` calculates the perplexity of a Java file under an n-gram model. `5-gram.binary` is our 5-gram language model.

Usage: `python perplexity_calculator.py <5-gram.binary>`

Here `<5-gram.binary>` is the path to the n-gram model; the command also takes the Java file to be evaluated as an argument.

Cognitive Complexity Calculator and Cognitive Complexity for Decompilation Calculator

`CognitiveComplexityCalculator-1.0.jar` calculates the Cognitive Complexity for Java files. `CognitiveComplexityforDecompilationCalculator-1.1.jar` calculates the Cognitive Complexity for Decompilation for Java files.

Usage: `java -jar CognitiveComplexityCalculator-1.0.jar` or `java -jar CognitiveComplexityforDecCalculator-1.1.jar`

Each command takes the directory containing the Java files to analyze (files in subdirectories are included) and the location where the output file is created.
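For the Perplexity Calculator described above, the following minimal sketch shows how a perplexity value is obtained from per-token log10 probabilities, which is the form in which KenLM reports n-gram scores. It illustrates the general formula only and is not the contents of `perplexity_calculator.py`; the function name and the sample values are hypothetical.

```python
def perplexity(log10_probs):
    """Perplexity of a token sequence from per-token log10 probabilities.

    Uses the standard definition PP = 10 ** (-(1/N) * sum(log10 p_i)).
    """
    if not log10_probs:
        raise ValueError("need at least one token probability")
    average = sum(log10_probs) / len(log10_probs)
    return 10.0 ** (-average)


# Hypothetical scores: four tokens, each assigned probability 0.1 (log10 = -1).
print(perplexity([-1.0] * 4))  # 10.0
```

A lower value means the language model finds the token sequence more predictable, so two versions of the same code can be compared by scoring both with the same model.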
The output file of either Cognitive Complexity calculator is a .csv file that contains the Cognitive Complexity or Cognitive Complexity for Decompilation value for each method. Specifically, it contains:

- Absolute Module Path: the path of the class containing the method
- Module Position: the line in the .java file where the method starts
- Module declaration: the method signature and return type or pattern type
- Max Nesting: the maximum level of nesting reached by the method (the starting level counts as 1)
- Cognitive Complexity or Cognitive Complexity for Decompilation

Reference

- Cognitive Complexity Calculator: https://github.com/BruhZul/cognitive-complexity-calculator
- MetricsReloaded: https://github.com/BasLeijdekkers/MetricsReloaded/
- DepDigger: A Tool for Detecting Complex Low-Level Dependencies: https://www.sosy-lab.org/~dbeyer/DepDigger/
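The per-method output columns described above lend themselves to simple post-processing. The sketch below sums the complexity values per class path using Python's `csv` module. The exact header strings, the comma delimiter, and the sample rows are assumptions made for illustration; check them against an actual output file before relying on this.

```python
import csv
import io
from collections import defaultdict

# Assumed header names, taken from the column descriptions; the real file may differ.
PATH_COL = "Absolute Module Path"
SCORE_COL = "Cognitive Complexity"


def complexity_per_file(csv_text):
    """Sum the per-method complexity values for each class path."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[PATH_COL]] += int(row[SCORE_COL])
    return dict(totals)


# Hypothetical sample rows, not taken from the dataset.
SAMPLE = """Absolute Module Path,Module Position,Module declaration,Max Nesting,Cognitive Complexity
/src/A.java,10,int f(int),2,3
/src/A.java,25,void g(),1,0
/src/B.java,7,String h(),3,5
"""

print(complexity_per_file(SAMPLE))  # {'/src/A.java': 3, '/src/B.java': 5}
```

Real method declarations with several parameters contain commas, so this only works if the tool quotes such fields or uses a different delimiter; pass the appropriate `delimiter` to `csv.DictReader` if needed.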
Created: 2025-02-14