Understandability in Java Decompilation
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11474285
Description:
Data and tools of the paper "Demystifying and Assessing Code Understandability in Java Decompilation".
Data
Our data in the directory `data/` consists of two directories, `data/original` and `data/testset`, which hold the original data set and the test set. Both directories include three parts:
1. Experimental data in directory `code`: the source code and the corresponding code decompiled by CFR, Fernflower and Jadx.
2. Calculation results in directory `results`.
3. The annotated dataset `relative_understandability_.csv`, which records the relative understandability of each decompiled file compared to the original file: -1 means the decompiled file is less understandable than the original, 0 means the two are equivalent, and 1 means the decompiled file is more understandable.
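The label encoding above can be sketched in a few lines of Python. This is only an illustration: the column layout of the annotated CSV is not described here, so the helper below (the name `summarize_labels` is hypothetical) takes a plain sequence of label values rather than reading the file.

```python
from collections import Counter

# Label meanings as stated in the dataset description.
LABEL_MEANING = {
    -1: "less understandable",
    0: "equivalent",
    1: "more understandable",
}


def summarize_labels(labels):
    """Count how often each relative-understandability label occurs."""
    counts = Counter(int(v) for v in labels)
    unknown = set(counts) - set(LABEL_MEANING)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Always report all three categories, even when a count is zero.
    return {LABEL_MEANING[k]: counts.get(k, 0) for k in (-1, 0, 1)}


if __name__ == "__main__":
    # Made-up labels for illustration only, not taken from the dataset.
    print(summarize_labels(["-1", "0", "1", "-1"]))
```

Values read from a CSV arrive as strings, which is why the helper converts each one with `int` before counting.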
Tools
Our tools in the directory `tool/` assess the understandability of decompiled code with three metrics: perplexity, Cognitive Complexity, and Cognitive Complexity for Decompilation.
Environment
- System: Ubuntu 20.04
- Python: python 3.10, with `pip install kenlm` and `pip install javalang`
- Java: JDK >= 11
Perplexity Calculator
`perplexity_calculator.py` calculates the perplexity of a Java file under an n-gram model. `5-gram.binary` is our 5-gram language model.
python perplexity_calculator.py <5-gram.binary>
Here `<5-gram.binary>` is the path to the n-gram model, and the remaining argument is the path to the Java file to be evaluated.
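For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity follows from per-token log probabilities. It is not the code of `perplexity_calculator.py`: the function name `perplexity` is hypothetical, and how the script tokenizes Java source or calls the model is an assumption not covered here.

```python
import math


def perplexity(log10_probs):
    """Perplexity from per-token base-10 log probabilities.

    Perplexity is 10 raised to the negative mean log10 probability,
    so lower values mean the model finds the token sequence less surprising.
    """
    if not log10_probs:
        raise ValueError("need at least one token")
    mean_log10 = sum(log10_probs) / len(log10_probs)
    return 10.0 ** (-mean_log10)


if __name__ == "__main__":
    # Three tokens, each assigned probability 0.1 (log10 = -1).
    # Assumption: with kenlm, such values could come from
    # kenlm.Model(path).full_scores(text), which yields log10 scores.
    value = perplexity([-1.0, -1.0, -1.0])
    print(math.isclose(value, 10.0))
```

A sequence in which every token has probability 0.1 has perplexity 10, which matches the intuition of the model choosing uniformly among ten options at each step.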
Cognitive Complexity Calculator and Cognitive Complexity for Decompilation Calculator
`CognitiveComplexityCalculator-1.0.jar` calculates the Cognitive Complexity for Java files. `CognitiveComplexityforDecompilationCalculator-1.1.jar` calculates the Cognitive Complexity for Decompilation for Java files.
java -jar CognitiveComplexityCalculator-1.0.jar
java -jar CognitiveComplexityforDecCalculator-1.1.jar
Each command takes the directory of the Java files to analyze (files in subdirectories are included) and the location where the output file is created.
The output file is a .csv file which contains the Cognitive Complexity or Cognitive Complexity for Decompilation value for each method. Specifically it contains:
- Absolute Module Path: The path of the class containing the method
- Module Position: The line in the .java file where the method starts
- Module declaration: The method signature and return type or pattern type
- Max Nesting: The maximum level of nesting reached by the method (considering as 1 the starting level)
- Cognitive Complexity or Cognitive Complexity for Decompilation
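The per-method rows described above can be rolled up per class file. The sketch below assumes the CSV is comma-separated and that its header row uses exactly the field names listed above. Neither is confirmed here, so check the real output first. The function name `complexity_per_class` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def complexity_per_class(csv_text, value_column="Cognitive Complexity"):
    """Sum the per-method complexity values for each class path."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Absolute Module Path"]] += int(row[value_column])
    return dict(totals)


# Made-up rows for illustration only; the header names are assumed.
SAMPLE = """Absolute Module Path,Module Position,Module declaration,Max Nesting,Cognitive Complexity
/tmp/A.java,3,void f(),1,2
/tmp/A.java,10,int g(int),2,5
/tmp/B.java,4,void h(),1,0
"""

if __name__ == "__main__":
    print(complexity_per_class(SAMPLE))
```

For output from the Cognitive Complexity for Decompilation calculator, pass the matching column name as `value_column`.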
Reference
Cognitive Complexity Calculator: https://github.com/BruhZul/cognitive-complexity-calculator
MetricsReloaded: https://github.com/BasLeijdekkers/MetricsReloaded/
DepDigger: A Tool for Detecting Complex Low-Level Dependencies: https://www.sosy-lab.org/~dbeyer/DepDigger/
Created:
2025-02-14



