simulated annealing energy function

Hi Tiago,

First of all, let me express my appreciation for your great work, and for
sharing it with others!

This is somewhat related to issues #62
<https://git.skewed.de/count0/graph-tool/issues/62> and #29
<https://git.skewed.de/count0/graph-tool/issues/29>, which are about
adding more algorithms for community detection, something I think is very
important.

However, I would like to use the simulated annealing (SA) algorithm with a
different quality (modularity) function than Newman's. You do give some
freedom by letting the user define gamma; however, if I want to use another
function with SA for some reason, there is apparently no way to do it.

Looking at the code (graph_community.hh), I saw that the SA mechanism is
not separated from the energy function itself (the Newman one with gamma=1
that you give here
<https://graph-tool.skewed.de/static/doc/community.html#graph_tool.community.community_structure>).

I see no reason for this, as the SA algorithm could be used with many
energy functions.
I hope you can fix this in the future, and maybe even give the user an
API to set the energy function from the Python code.
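
Conceptually, I am imagining something like the following generic loop,
where the energy function is just a parameter. This is only a minimal
sketch in Python; all the names are mine, purely illustrative, and not
part of graph-tool's API:

    import math
    import random

    def anneal(state, energy, propose, t0=1.0, t_min=1e-3, cooling=0.999):
        """Minimise energy(state) by Metropolis updates under a
        geometric cooling schedule.

        energy(state)  -- user-supplied quality function to minimise
        propose(state) -- returns a candidate state derived from state
        """
        e = energy(state)
        t = t0
        while t > t_min:
            cand = propose(state)
            e_new = energy(cand)
            # Metropolis criterion: always accept downhill moves,
            # accept uphill moves with probability exp(-dE / t).
            if e_new <= e or random.random() < math.exp((e - e_new) / t):
                state, e = cand, e_new
            t *= cooling
        return state, e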

Thanks.

> Looking at the code (graph_community.hh), I saw that the SA mechanism is
> not separated from the energy function itself (the Newman one with
> gamma=1 that you give here
> <https://graph-tool.skewed.de/static/doc/community.html#graph_tool.community.community_structure>).
>
> I see no reason for this, as the SA algorithm could be used with many
> energy functions.

This is true, but the speed of the updates will depend crucially on the
function used. Modularity has the convenient property that the update of
a single node can be done with a complexity O(k), where k is the degree
of the node. This feature is exploited in the code to make it fast, and
a "general" version of the code would be much slower.

> I hope you can fix this in the future, and maybe even give the user an
> API to set the energy function from the Python code.

Although something like this could be convenient, it would be horribly
slow, beyond what I mentioned above, since it would involve a Python
function call for every node update. It would essentially nullify the
advantage of implementing it in C++.
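
The order of magnitude is easy to check. Even a trivial Python function
costs on the order of a hundred nanoseconds per call (a rough,
illustrative measurement, not a graph-tool benchmark), whereas the
equivalent inlined C++ arithmetic costs a few nanoseconds at most:

    import timeit

    def energy_callback(k, s_old, s_new):
        # Stand-in for a user-supplied energy function: the arithmetic is
        # trivial, so what is being measured is mostly the call overhead.
        return (s_new - s_old) * k

    n = 10**6
    t = timeit.timeit(lambda: energy_callback(5, 1, 2), number=n)
    print("~%.0f ns per Python-level call" % (t / n * 1e9))

Multiply that by one call per node per sweep, and the overhead dominates.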

Furthermore, I don't really have a desire to support every community
detection method out there. Not only would this be a lot of work, but
most of these methods are also really very bad; this includes modularity
maximization itself, as a matter of fact.

Having said that, I do intend at some point to include some (one or two)
of the most used ones.

However, what I would recommend for most people are the methods based on
statistical inference of generative models, such as the stochastic block
model. This approach avoids most of the problems that plague modularity
and other ad hoc methods (lack of statistical significance, the
'resolution limit', etc.). The library currently includes very fast code
for this. The only situation I can imagine where something else should be
used is when doing a comparison of algorithms. If you just want to
model some network data, this method should be preferred.
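
For example, fitting a stochastic block model takes only a couple of
lines (a minimal sketch; it assumes a recent graph-tool version, where
this functionality lives in the graph_tool.inference module):

    import graph_tool.all as gt

    # Fit a stochastic block model by minimising the description length;
    # the number of groups is selected automatically.
    g = gt.collection.data["football"]
    state = gt.minimize_blockmodel_dl(g)

    b = state.get_blocks()    # inferred group membership (property map)
    print(state.entropy())    # description length of the fit
    state.draw(output="football-sbm.png")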

Best,
Tiago