Hi Jean,
I did a hierarchical SBM on a Twitter follow network (pretty sparse actually in comparison to other kinds of networks) with ca. 250k accounts. It took about "11 days for one run, with 60 vCPUs and a peak usage of ca. 400 GB RAM" (my PhD thesis, https://eprints.qut.edu.au/125543/, p. 251). How useful it is theoretically depends on your goals and discipline. In my case, feel free to read the referenced chapter and decide for yourself.
I also did runs on smaller networks with ca 100k -150k nodes. Running time was still about in the same ball park, or even longer … I guess because the network structure was more complex.
The amount of RAM is pretty mandatory. Swap memory won't help you much, as it slows the algorithm down to a degree that makes the running time infeasible. Number of CPUs could be lower I guess, because much of the algo seemed to run serially and those parts made up most of the calculation time.
I had a university/QRIScloud (https://www.qriscloud.org.au/) provided VM with 60 cores and 900 GB RAM. On a Google VM this would have been pretty costly. I adjusted epsilon for less accuracy and greater speed:
state = gt.inference.minimize_nested_blockmodel_dl(core, verbose=True, mcmc_equilibrate_args={'epsilon': 1e-2}, )
(PhD-code/1000+ nested SBM.ipynb at master · FlxVctr/PhD-code · GitHub)
If I wouldn't have had the computing ressources for free I wouldn't have done it.
I'd recommend to test infomap if you're looking for a more efficient alternative (MapEquation) that also works with entropy minimization (even though it's more flow oriented). Just did a 181k network community detection (non-hierarchical) in a matter of seconds on a last-year's Macbook Pro yesterday. I don't know how long it takes for a hierarchical structure, but it can do this, so it's worth a try.
Also efficient, but with all the drawbacks that Tiago Peixoto elaborates on in his papers (which I also refer to in the chapter linked above), and not hierarchical, is the parallelised modularity maximisation (Parallelised Louvain Method) PLM in NetworKit (https://networkit.github.io/). Despite it's theoretical and statistical drawbacks it delivers good heuristical evidence for communities in networks imho. But that depends a lot on what you want to do
Cheers,
Felix
Dr. Felix Victor Münch Postdoc Researcher
Leibniz Institut for Media Research | Hans-Bredow-Institut (HBI), Hamburg
https://leibniz-hbi.de/
https://felixvictor.net/)
attachment.html (10.1 KB)