Efficient Enumeration of the Optimal Solutions to the Correlation Clustering problem
收藏NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://zenodo.org/record/13894063
下载链接
链接失效反馈官方服务:
资源简介:
Warning: the main file is accidentally missing from this version, use v1.0.2 instead.
Description. This is the data used in the experiments presented in the following paper:
N. Arınık, R. Figueiredo, and V. Labatut, “Efficient Enumeration of the Optimal Solutions to the Correlation Clustering problem,” Journal of Global Optimization, vol. 86, pp. 355–391, 2023. DOI: 10.1007/s10898-023-01270-3 ⟨hal-03935831⟩
Source code. The related source code is available on GitHub:
https://github.com/CompNet/Sosocc
https://github.com/CompNet/EnumCC
Citation. If you use these data, please cite the above reference:
@Article{Arinik2021, author = {Arınık, Nejat and Figueiredo, Rosa and Labatut, Vincent}, title = {Efficient Enumeration of the Optimal Solutions to the Correlation Clustering problem}, journal = {Journal of Global Optimization}, year = {2023}, volume = {86}, pages = {355-391}, doi = {10.1007/s10898-023-01270-3},}
Funding. This research benefited from the support of Agorantic FR 3621, as well as the FMJH Program PGMO and from the support to this program from EDF-THALES-ORANGE-CRITEO.
Further details. We describe below the structure of the zip file `article_materials.zip`:
Experiments for Dataset 1
delay_exec_time: all the results and plots regarding the difference of execution times between EnumCC(3) and OneTreeCC() (i.e., EnumCC(3) minus OneTreeCC()), represented on the log-scaled y-axis of the plots. When such difference takes a negative value, this means our proposed method EnumCC(3) runs faster than OneTreeCC().
EnumCC_nb-jumps: all the results regarding the number of jumps related to EnumCC(3), i.e. njump(EnumCC(3))
exec_time: all the results regarding the execution times of EnumCC(3) and OneTreeCC().
nb-sols: all the results regarding the number of optimals solutions based on EnumCC(3). Note that we show the results of OneTreeCC() only for those with n=50, since both methods run out of the time limit of 12h for several networks with n=50.
networks: We generate these complete and incomplete networks through our random signed network generator, which is publicly available online. For complete unweighted signed networks, this model relies on only three parameters: n (number of vertices), l0 (initial number of modules) and qm (proportion of misplaced edges, i.e. edges meant to be frustrated by construction). Moreover, we make the assumption that the proportion of misplaced edges is the same inside and between the modules. When it comes to incomplete unweighted signed networks, we introduce two more parameters, which are the density d of the graph and the proportion q neg of the negative edges. The last parameter qneg allows to control the ratio of positive to negative edges. For complete unweighted signed networks with d = 1, we generate 20 replications for parameter values l0 = 3, n ∈ {32, 36, 40, 45, 50} and qm ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}. In these networks, the value of qneg with the considered parameters is approximately equal to 0.7. For incomplete unweighted signed networks with d ∈ {0.25, 0.50}, we generate 20 replications for parameter values l0 = 3, n ∈ {32, 36, 40}, qm ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6} and qneg ∈ {0.3, 0.5, 0.7}. In total, we produce 600 and 1,080 instances for complete and incomplete networks, respectively, which makes a total of 1,680 instances.
partitions: folder containing the partitioning results of two methods: EnumCC(3) vs. OneTreeCC(). Note that the results of OneTreeCC() are not shown for space considerations, except for those with n=50.
Results
Experiments for Dataset 2
benchmark netwoks: We generate these complete and incomplete networks through our random signed network generator, which is publicly available online. The optimal solution for a generated network is known by construction. For a given n, d and l0, we first create a perfectly structurally balanced (i.e., internally positive and externally negative) signed network with a built-in module structure. The underlying module structure constitutes the optimal partition. Then, in order to take into account different positive to negative ratio values for internal and external edges we generate several signed networks by perturbing the initial signed network without affecting its underlying optimal partition, thanks to its definition of stability range. We generate signed networks with parameter values n ∈ {30, 40, 50, 60, 70, 90}, d ∈ {0.25, 1.00} and l0 ∈ {2, 4, 6}. In total, we produce 214 and 184 instances for complete and incomplete networks, respectively, which makes a total of 398 instances.
benchmark partitions: folder containing the partitioning results of two methods: CoNS(rmax) with vs. without MVMO pruning, where rmax ∈ {3,4}.
results: two `csv` files containing benchmark results between CoNS(rmax) with vs. without MVMO pruning, where rmax ∈ {3,4}.
创建时间:
2024-10-05



