Details about get_marginal()

Hi Tiago,

I would like to thank you for graph-tool. I have a question about the details of get_marginal(). Here is my code:

bs = []

def collect_partitions(s):
    global bs
    bs.append(s.get_bs())

gt.mcmc_equilibrate(state, force_niter=20, mcmc_args=dict(niter=10),
                                 callback=collect_partitions)

# Disambiguate partitions and obtain marginals
pmode = gt.PartitionModeState(bs, nested=True, converge=True)
pv = pmode.get_marginal(g)

# Get consensus estimate
bsp = pmode.get_max_nested()

state = state.copy(bs=bsp)

In my case, there are 18186 nodes in the graph. The state.print_summary() is:

l: 0, N: 18186, B: 7
l: 1, N: 7, B: 2
l: 2, N: 2, B: 1

So I tried pv[43] and it gives me:
array([ 0, 0, 0, 0, 2, 16, 1], dtype=int32)
Does this mean that, during the MCMC run, node ‘43’ appears in the sixth community 16 times, in the fifth community 2 times, and in the seventh community 1 time?

When I try pv[2], it gives me:
array([ 0, 16, 3], dtype=int32)
Does this mean that, during the MCMC run, node ‘2’ appears in the second community 16 times and in the third community 3 times?

When I try model.state.get_levels()[0].get_blocks()[43], it returns 5. I think this suggests that, for the state obtained from get_max_nested(), node ‘43’ is in community 5 (counting communities from 0), which agrees with pv[43]. Are the community indices in get_marginal() the same as those in get_blocks() if I use the state from get_max_nested()?

How can I get the probabilities for nodes at higher levels of the hierarchy? For example, node ‘43’ is in community 5; how do I get the marginal index of community 5 as a node in the second layer?

Yes. The answer to all your other questions is also yes.
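As an illustration of the reading confirmed above, here is a minimal sketch that turns the count vector quoted for pv[43] into empirical probabilities. It uses plain NumPy on the numbers from the post rather than calling graph-tool, and the helper name marginal_probabilities is made up for this example.

```python
import numpy as np

# Counts quoted in the post for pv[43]: one entry per community label.
counts = np.array([0, 0, 0, 0, 2, 16, 1], dtype=np.int32)

def marginal_probabilities(counts):
    """Normalise per-community counts into empirical probabilities.

    Hypothetical helper for illustration; not part of graph-tool.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        raise ValueError("no samples recorded for this node")
    return counts / total

probs = marginal_probabilities(counts)
best = int(probs.argmax())  # label of the most frequently sampled community
print(best, probs[best])
```

With these counts the most frequent label is 5 (counting from 0), and its empirical probability is 16/19.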

How can I get the probabilities for nodes at higher levels of the hierarchy? For example, node ‘43’ is in community 5; how do I get the marginal index of community 5 as a node in the second layer?

The node corresponding to community 5 in the second layer is the node with index 5.

Hi Tiago,

Thank you very much for the quick reply.

In the second layer, the node with index 5 corresponds to community 5 in the first layer. I was wondering whether the node with index 5 in the second layer also has marginals (like the pv vector in the first layer), since in the nested MCMC every layer seems to be sampled. If so, how can I get that pv vector?

Currently it’s not possible to get marginals for the upper levels from PartitionModeState, but you can always project the upper layers into the lowest levels, and construct a new PartitionModeState.
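One possible way to do the projection is sketched below, assuming each collected sample is a list of integer arrays as returned by get_bs(), where level 0 maps nodes to groups and each higher level maps the groups of the level below to groups. The helper project_to_nodes and the toy data are hypothetical and use only NumPy.

```python
import numpy as np

def project_to_nodes(bs, level):
    """Return, for every node, its group label at `level` of the hierarchy.

    Hypothetical helper: composes the level-0 node partition with the
    partitions of the levels above it, up to and including `level`.
    """
    b = np.asarray(bs[0])
    for upper in bs[1:level + 1]:
        b = np.asarray(upper)[b]  # relabel each node by its group's group
    return b

# Toy nested partition: 4 nodes -> 3 groups -> 2 groups -> 1 group.
toy_bs = [np.array([0, 1, 1, 2]), np.array([0, 0, 1]), np.array([0, 0])]
level1 = project_to_nodes(toy_bs, 1)
level2 = project_to_nodes(toy_bs, 2)
print(level1, level2)
```

Applied to every sample in bs, for example [project_to_nodes(b, 1) for b in bs], this gives a list of flat node partitions that could presumably be passed to gt.PartitionModeState without nested=True, after which get_marginal(g) would describe the second layer. That last step is not run here and should be checked against your graph-tool version.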

Hi Tiago,

Thank you so much for your answers.

I have two more questions:

As I understand it, if I want the probability of a node belonging to a community, I should run the MCMC sampling (like gt.mcmc_equilibrate(state, force_niter=20000, mcmc_args=dict(niter=10), callback=collect_partitions)) and compute the probability from pv (like 16/19 for node 43 in community 5 in the example above), right? I was wondering if there is any way to calculate the probability directly instead of sampling.

In my case, the network is very large, containing around 20 million nodes, so the MCMC sampling takes a long time. Is there any function to distribute the computation or run it on a GPU?

Sorry to keep disturbing you. It is a great package and I really appreciate your time.

No, you have to use MCMC.

I’m afraid not. If there were, I would have described it in the documentation.

Thank you very much! Have a good weekend!

Hi Tiago,

Sorry to disturb you again. I have another question about PartitionModeState(). If I run

bs = []

def collect_partitions(s):
    global bs
    bs.append(s.get_bs())

gt.mcmc_equilibrate(state, force_niter=30, mcmc_args=dict(niter=10), callback=collect_partitions)

pmode1 = gt.PartitionModeState(bs, nested=True, converge=True)
pv1 = pmode1.get_marginal(g)

the communities are relabelled.

If I run it another time, but without the mcmc_equilibrate step, and pass the list [state] (which contains only one element, state) as input like this:

pmode2 = gt.PartitionModeState([state], nested=True, converge=True)

will pmode2 and pmode1 keep the same community labels? I know that the number of communities might differ between pmode2 and pmode1. I want to compare the communities in both cases, so I think I need to understand how the relabelling works.

Unfortunately, it’s not possible to provide a solution to your problem or an answer to your question with the information provided.

To enable us to understand the situation, you need to provide all the items below:

  1. Your exact graph-tool version.
  2. Your operating system.
  3. A minimal working example that shows the problem.

Item 3 above is very important! If you provide us only the part of the code that you believe causes the problem, then it is not possible to understand the context that may have contributed to it.

You need to provide us a complete example that runs and reproduces the problem you are observing.

Hi Tiago,

Thank you for your patience and reply. I need to think my problem through and organize the code and my description of it. I will post here again if there are further issues.

Thank you very much for your time!

Best,
Xiangnan