Hi again, 

I recently noticed the graph-tool update, and I am now a little confused about some of the changes. Sorry, I know my questions are very basic. I am not familiar with this language and I am having some difficulty getting results.

I am running the inference algorithms to find the best model using different model selection options. I want to set pclabel in the inference algorithms because I know a priori that my network is bipartite, and then I want to get the description length. Before the update I did it this way:

vprop_double = g.new_vertex_property("int")  # g is my network
for i in range(0, 11772):
    vprop_double[g.vertex(i)] = 1
for i in range(11773, 214221):
    vprop_double[g.vertex(i)] = 2

state = gt.minimize_blockmodel_dl(g, pclabel=True)

state.entropy(dl=True) # I am not sure this is the right way to get the description length.
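A side note on the labelling loops above: range(a, b) stops at b - 1, so range(0, 11772) followed by range(11773, 214221) never assigns a label to vertex 11772. Below is a minimal sketch of a gap-free labelling in plain Python, with no graph-tool calls. The helper name bipartite_labels and the small sizes in the example are assumptions made for illustration; the real split point depends on how the vertices are ordered in the network.

```python
def bipartite_labels(n_first, n_total, first_label=1, second_label=2):
    """Return one label per vertex index.

    Vertices 0 .. n_first - 1 get first_label and vertices
    n_first .. n_total - 1 get second_label, so no index is skipped.
    (Hypothetical helper, not part of graph-tool.)
    """
    if not 0 <= n_first <= n_total:
        raise ValueError("n_first must be between 0 and n_total")
    return [first_label] * n_first + [second_label] * (n_total - n_first)


# Small illustrative sizes (assumed): 3 vertices of one type, 7 in total.
labels = bipartite_labels(3, 7)
print(labels)
```

The resulting list has exactly one entry per vertex, so it can be copied into the vertex property map with a single loop over all vertex indices.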

But now I have some problems. First of all, minimize_blockmodel_dl no longer has a pclabel argument, so I don't know how to specify it in the inference algorithm. I have tried this:

state.pclabel = vprop_double

But I get the same result from "state.entropy(dl=True)" as before. I also get the same result from "state.entropy(dl=True)" and from "state.entropy()", and I don't understand why either.

Finally, for NestedBlockState objects I don't know how to get the description length, because entropy doesn't have a "dl" argument. For these objects, are the entropy and the description length the same?

In summary, I don't know how to set pclabel or how to get the description length in hierarchical models, and I am not sure whether I am getting it correctly in non-hierarchical ones.

Sorry again for my basic questions, but I can't go on because of these problems.

Thank you very much!

Best regards, 




Andrea




2016-05-10 11:41 GMT+02:00 Andrea Briega <annbrial@gmail.com>:
Thank you very much! Your answer has been really helpful; now I understand this much better. I'll think about the options you mentioned.

Thanks again,


Andrea

2016-05-09 16:33 GMT+02:00 Andrea Briega <annbrial@gmail.com>:

Dear Dr Peixoto,


I would like to ask some questions about the inference algorithms for identifying large-scale network structure via the statistical inference of generative models.

The minimize_blockmodel_dl algorithm takes an hour to finish on my network with 21000 nodes (as does the hierarchical version), and it takes two and a half days with overlap. However, I have also started a hierarchical analysis with overlap, and it has now been running for 14 days. So my first question is: is this running time normal, or might there be a problem? Do you know how long it usually takes?

Secondly, I have repeated some of these analyses with exactly the same options, but I get different solutions (similar, but not identical), so I wonder whether the algorithm is heuristic (I thought it was exact).

My last question concerns bipartite analysis. I have two types of nodes in my network, and I wonder whether there is any analytical difference between running these algorithms with the bipartite option (clabel=True, with a different label for each group of nodes) and without it, because the program seems to “know” that my network is bipartite in either case. If there are differences between the bipartite and “unipartite” (clabel=False) analyses, is it possible to compare their description lengths for model selection?

Thank you very much for your help!


Best regards,



Andrea