Supplementary Material for "Detecting Automatic Software Plagiarism via Token Sequence Normalization"
Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/10430321
Description:
This repository contains additional material supporting the paper titled "Detecting Automatic Software Plagiarism via Token Sequence Normalization", presented at ICSE 2024 (research track).
The paper presents a defense mechanism, based on program dependence graphs, against automated plagiarism generators and demonstrates its effectiveness in countering insertion-based and reordering-based obfuscation attacks.
The defense mechanism was also integrated into the software plagiarism detector JPlag, thus providing a widely accessible solution.
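To make the idea of normalizing a token sequence against insertion and reordering more concrete, here is a minimal toy sketch. It is not the implementation shipped in this repository or in JPlag. It assumes that each statement is described by its token types plus the variables it reads and writes, and it uses these read/write sets as a stand-in for a real program dependence graph. The names `Stmt` and `normalize` and the token labels are hypothetical.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Stmt:
    """Hypothetical statement model: token types plus read/write sets."""
    tokens: tuple
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()
    critical: bool = False  # has an observable effect, e.g. prints or returns


def normalize(stmts):
    """Drop statements no critical statement depends on, then emit the
    remaining tokens in a canonical, dependence-respecting order."""
    n = len(stmts)
    # deps[j] holds the earlier statements that statement j must stay behind.
    deps = [set() for _ in range(n)]
    for j in range(n):
        for i in range(j):
            a, b = stmts[i], stmts[j]
            conflict = (a.writes & b.reads) or (a.reads & b.writes) or (a.writes & b.writes)
            if conflict or (a.critical and b.critical):
                deps[j].add(i)

    # Liveness: keep only what critical statements transitively depend on.
    live, stack = set(), [i for i, s in enumerate(stmts) if s.critical]
    while stack:
        i = stack.pop()
        if i not in live:
            live.add(i)
            stack.extend(deps[i])

    # Topological sort; ties are broken by token types, not by source position.
    pending = {i: set(deps[i]) for i in live}
    ready = [(stmts[i].tokens, i) for i in live if not pending[i]]
    heapq.heapify(ready)
    out = []
    while ready:
        tokens, i = heapq.heappop(ready)
        out.extend(tokens)
        for j in live:
            if i in pending[j]:
                pending[j].discard(i)
                if not pending[j]:
                    heapq.heappush(ready, (stmts[j].tokens, j))
    return out


# Original: int a = 1; int b = f(); print(a + b);
original = [
    Stmt(("VARDEF",), writes=frozenset({"a"})),
    Stmt(("VARDEF", "CALL"), writes=frozenset({"b"})),
    Stmt(("CALL",), reads=frozenset({"a", "b"}), critical=True),
]
# Obfuscated copy: first two statements swapped, one dead statement inserted.
plagiarized = [
    Stmt(("VARDEF", "CALL"), writes=frozenset({"b"})),
    Stmt(("VARDEF",), writes=frozenset({"a"})),
    Stmt(("VARDEF",), writes=frozenset({"unused"})),
    Stmt(("CALL",), reads=frozenset({"a", "b"}), critical=True),
]
```

In this toy setting the raw token sequences of the two programs differ, while `normalize(original)` and `normalize(plagiarized)` produce the same sequence. The inserted statement is dropped because nothing observable depends on it, and the swapped statements fall back into the same canonical order.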
Contents Overview:
Datasets: Two datasets from the PROGpedia collection and two internal datasets. For the latter, only the metadata is available due to the sensitive nature of the data.
Plagiarized Submissions: Generated plagiarism instances illustrating various obfuscation methods such as insertion, reordering, and insert-after-reordering.
Evaluation Data: JSON files detailing calculated similarities and runtime measurements for all datasets.
Source Code: The implementation of our defense mechanism based on the software plagiarism detector JPlag (v4.0.0). Note that JPlag is licensed under the GPL-3.0 license.
Evaluation Code: Python code for runtime measurements.
Interactive Plots: HTML visualizations of the paper's plots, allowing interactive exploration of the results on detecting automatic software plagiarism through token sequence normalization.
Demo: A packaged JAR of our implementation, along with instructions on how to run it.
Created:
2023-12-29



