For the management of a virtual P2P supercomputer one is interested in subgroups
of processors that can communicate with each other efficiently. The task of
finding these subgroups can be formulated as a graph clustering problem,
where clusters are vertex subsets that are densely connected within themselves,
but sparsely connected to each other. Due to resource constraints, clustering
using global knowledge (i.e. knowing (nearly) the whole input graph) might not be
permissible in a P2P scenario, e.g. because collecting the data is not possible
or would consume a high amount of resources. That is why we present a distributed
heuristic using only limited local knowledge for clustering static and dynamic
graphs.
Based on disturbed diffusion, our algorithm DiDiC implicitly optimizes cut-related
quality measures such as modularity. It thus settles between distributed clustering
algorithms for other quality measures (e.g. energy efficiency in the field of
ad-hoc-networking) and graph clustering algorithms optimizing cut-related measures
with global knowledge. Our experiments with graphs resembling a virtual P2P
supercomputer show the promising potential of our new approach: Although each node
starts with a random cluster number, may communicate only with its direct neighbors
within the graph, and requires only a small amount of additional memory space, the
solutions computed by DiDiC converge to clusterings that are comparable in quality
to those computed by the established non-distributed graph clustering library mcl,
whose main algorithm uses global knowledge.