The paper applies multi-objective search-based software remodularization to the program Kate, showing how this can improve cohesion and
coupling, and investigating differences between weighted and unweighted approaches and between the equal-size cluster and maximising cluster
approaches. The effects of considering omnipresent modules are also investigated. Overall, the paper provides evidence that search-based
modularization can benefit Kate developers.
Kate Modularization Datasets
Data Extraction
Kate’s source code is organized in only two folders, src and session, each of which contains a number of classes. First, the call graph of
each function of Kate was extracted directly from the source code using Doxygen. Then, Doxygen was also used to extract the inheritance graph
between classes. Finally, Kate’s unweighted and weighted module dependency graphs (MDGs) were created from the call and inheritance graphs:
each class is considered a module, and a function call or an inheritance relation from one class to another represents a dependency between
the respective modules. The weight of an edge in the weighted MDG is the number of function calls from one class to another. The clusters are
taken to be the folders in which the classes reside.
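The construction described above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name build_mdgs, the input format (one pair per function call, one pair per inheritance relation) and the class names are hypothetical. Two choices are assumptions that the text does not settle: calls within a single class are ignored, and an edge that exists only because of inheritance is given weight 1.

```python
from collections import Counter


def build_mdgs(calls, inheritance):
    """Build unweighted and weighted MDGs at class level.

    calls: iterable of (caller_class, callee_class), one entry per function call.
    inheritance: iterable of (subclass, superclass).
    """
    # Count function calls between distinct classes (assumption: self-calls are ignored).
    call_counts = Counter((a, b) for a, b in calls if a != b)

    # A dependency exists if there is at least one call or an inheritance relation.
    unweighted = set(call_counts) | {(a, b) for a, b in inheritance if a != b}

    # Edge weight = number of function calls.
    # Assumption: an inheritance-only edge (zero calls) gets weight 1.
    weighted = {edge: max(call_counts[edge], 1) for edge in unweighted}
    return unweighted, weighted


# Hypothetical input: class names are placeholders, not real Kate classes.
calls = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "A")]
inheritance = [("C", "A"), ("A", "B")]
unweighted, weighted = build_mdgs(calls, inheritance)
```

Here the two calls from A to B collapse into one edge of weight 2, while the inheritance of C from A yields an edge of weight 1 under the stated assumption.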
There are usually some modules that have more dependencies than the average. Such modules are called omnipresent because they do not seem to
belong to any particular cluster, but to the system as a whole. The omnipresent modules were identified using thresholds. By choosing an
omnipresent threshold o_t = 3, for example, all modules that have 3 times more dependencies than the average are considered to be omnipresent.
The smaller the threshold, the more modules are identified as omnipresent. Two different thresholds were used in this work, o_t = 3 and
o_t = 2. A threshold of o_t = 4 did not identify any omnipresent module.
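The threshold rule can be illustrated with a short sketch. The function name omnipresent_modules and the toy graph are hypothetical. The sketch assumes that a module's number of dependencies is its count of incoming plus outgoing edges in the unweighted MDG, that the average is taken over modules with at least one edge, and that a module is omnipresent when its count is at least o_t times that average. The source does not specify these details.

```python
from collections import Counter


def omnipresent_modules(edges, o_t):
    """Return modules whose dependency count is at least o_t times the average.

    edges: iterable of (source_module, target_module) from the unweighted MDG.
    """
    # Assumption: dependencies = incoming + outgoing edges of a module.
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    if not degree:
        return set()
    average = sum(degree.values()) / len(degree)
    return {m for m, d in degree.items() if d >= o_t * average}


# Toy graph (placeholder names): "hub" depends on ten modules, "mid" on six of them.
edges = [("hub", f"m{i}") for i in range(10)] + [("mid", f"m{i}") for i in range(6)]
at_2 = omnipresent_modules(edges, 2)
at_3 = omnipresent_modules(edges, 3)
at_4 = omnipresent_modules(edges, 4)
```

In this toy graph the average is 32 / 12, so a threshold of 2 selects both "hub" and "mid", a threshold of 3 selects only "hub", and a threshold of 4 selects nothing, which mirrors the pattern that a smaller threshold yields more omnipresent modules.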