Understandability in Decompilation
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10058721
Resource description:
Code Understandability in Java Decompilation
Data and tools of the paper "Demystifying and Assessing Code Understandability in Java Decompilation".
Data
Our data in the directory data/ includes three parts:
Experimental data in directory data/code/: the original source code and the corresponding code decompiled by CFR, Fernflower and Jadx.
Calculation results in directory data/results/.
The annotated dataset data/relative_understandability.csv, which records the understandability of each decompiled file relative to its original file: -1 means the decompiled file is less understandable than the original, 0 means the two are equally understandable, and 1 means the decompiled file is more understandable.
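As a minimal sketch of how these annotations might be summarised, the snippet below tallies the three labels. The column name `label`, the other column names and the sample rows are assumptions made for illustration; the actual header of data/relative_understandability.csv is not documented here and should be checked first.

```python
import csv
import io
from collections import Counter

# Meaning of each annotation value, as described above.
LABEL_MEANING = {
    -1: "less understandable than the original",
    0: "equivalent to the original",
    1: "more understandable than the original",
}


def tally_labels(csv_file, label_column="label"):
    """Count how often each relative-understandability label occurs.

    `label_column` is a hypothetical column name; adjust it to the real header.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = int(row[label_column])
        if value not in LABEL_MEANING:
            raise ValueError(f"unexpected label: {value}")
        counts[value] += 1
    return counts


# Made-up rows standing in for the real file.
sample = io.StringIO(
    "file,decompiler,label\n"
    "A.java,CFR,-1\n"
    "A.java,Jadx,0\n"
    "B.java,CFR,-1\n"
)
counts = tally_labels(sample)
for value, meaning in LABEL_MEANING.items():
    print(f"{value:>2} ({meaning}): {counts[value]}")
```

To run it on the real annotations, open the file with `open("data/relative_understandability.csv", newline="")` and pass the handle together with the correct column name.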
Tools
The directory tool/ contains tools for assessing the understandability of decompiled code with three metrics: perplexity, Cognitive Complexity and Cognitive Complexity for Decompilation.
Environment
System: Ubuntu 20.04
Python: 3.10, with the following packages installed:
pip install kenlm
pip install javalang
Java: JDK >= 11
Perplexity Calculator
perplexity_calculator.py calculates the perplexity of a Java file under an n-gram language model. 5-gram.binary is our 5-gram language model.
python perplexity_calculator.py <5_gram.binary>
Here <5_gram.binary> is the path to the n-gram model; the second argument is the path to the Java file to be evaluated.
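For orientation, perplexity under an n-gram model is typically derived from the total log10 probability the model assigns to a token sequence. The sketch below shows that relationship and how it might be combined with kenlm and javalang. It is not the authors' perplexity_calculator.py: the function names, the space-joined tokenization and the end-of-sentence handling are assumptions.

```python
def perplexity_from_log10(total_log10_prob, token_count):
    """Perplexity = 10 ** (-log10 P(sequence) / number of scored tokens)."""
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    return 10.0 ** (-total_log10_prob / token_count)


def java_file_perplexity(model_path, java_path):
    """Score one Java file with a KenLM model (assumed workflow, see lead-in)."""
    # Imported lazily so the formula above can be used without these packages.
    import javalang
    import kenlm

    with open(java_path, encoding="utf-8") as handle:
        source = handle.read()
    # Lex the file into Java tokens and join their text with single spaces.
    sentence = " ".join(
        token.value for token in javalang.tokenizer.tokenize(source)
    )
    model = kenlm.Model(model_path)
    # score() returns the log10 probability of the whitespace-separated words;
    # with eos=True the end-of-sentence symbol is scored as one extra word.
    log10_prob = model.score(sentence, bos=True, eos=True)
    word_count = len(sentence.split()) + 1
    return perplexity_from_log10(log10_prob, word_count)


# Four scored tokens with a total log10 probability of -8
# give a perplexity of 10 ** (8 / 4) = 100.
print(perplexity_from_log10(-8.0, 4))
```

Lower perplexity means the model finds the token sequence more predictable. String literals that contain spaces are split into several words by the model, which is why the word count is taken from the joined sentence rather than from the token list.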
Cognitive Complexity Calculator and Cognitive Complexity for Decompilation Calculator
CognitiveComplexityCalculator-1.0.jar calculates the Cognitive Complexity for Java files. CognitiveComplexityforDecompilationCalculator-1.0.jar calculates the Cognitive Complexity for Decompilation for Java files.
java -jar CognitiveComplexityCalculator-1.0.jar
java -jar CognitiveComplexityforDecompilationCalculator-1.0.jar
Both commands take two arguments: the directory containing the Java files to analyze (files in subdirectories are included) and the location where the output file is created.
The output file is a .csv file that contains the Cognitive Complexity or Cognitive Complexity for Decompilation value of each method. Specifically, it contains the following fields:
Absolute Module Path: The path of the class containing the method
Module Position: The line in the .java file where the method starts
Module declaration: The method signature and return type or pattern type
Max Nesting: The maximum level of nesting reached by the method (counting the starting level as 1)
Cognitive Complexity or Cognitive Complexity for Decompilation
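A short sketch of how the per-method output might be aggregated per file follows. It assumes that the CSV header matches the field names listed above verbatim and that the metric column is named `Cognitive Complexity`; both are assumptions to verify against an actual output file, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def complexity_per_file(csv_file, metric_column="Cognitive Complexity"):
    """Sum the per-method metric for each class file in the calculator output."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["Absolute Module Path"]] += int(row[metric_column])
    return dict(totals)


# Invented rows using the field names listed above as the header.
sample = io.StringIO(
    "Absolute Module Path,Module Position,Module declaration,"
    "Max Nesting,Cognitive Complexity\n"
    "/src/A.java,10,int f(),2,3\n"
    "/src/A.java,40,void g(),1,0\n"
    "/src/B.java,5,void h(),3,7\n"
)
totals = complexity_per_file(sample)
for path, total in sorted(totals.items()):
    print(path, total)
```

Running the same function on the outputs for an original file and its decompiled counterpart gives a simple file-level comparison; pass a different `metric_column` if the Cognitive Complexity for Decompilation output uses another header.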
Reference
Cognitive Complexity Calculator: https://github.com/BruhZul/cognitive-complexity-calculator
Created: 2024-05-14



