> Please read section IX of this book chapter:
Will do. Or, at least, try

> That is exactly what this does, but in a numerically stable manner.
Not seeing how the function you linked is incremental. It seems to require collecting every conditional probability in memory for every vertex pair. I'm collecting for ~6m pairs on my laptop, that would blow my RAM. (If I'm understanding correctly)

> If you don't supply anything, a flat prior will be used, which means the
inference will be guided by the network structure alone.

When I use the conditional probability method it too only uses the network structure, correct?

This smaller graph I'm testing on has 225,150 possible edges. There are 20,911 actual edges. So, there are 204,239 possible additional edges. Finding 13 potential edges out of 204,239 seems extremely off (even using network structure alone; indeed all the other link prediction methods I'm using only use network structure). I could test this by removing 2000 and then seeing how many more it predicts, but I'm fairly certain already it will be similarly low.

In the context of exploratory link prediction, recommender systems for social networks, etc I don't think there is a principled way to give a "deleted fraction". I don't know how many links are "missing" but it's certainly not 13. I assume I could just assert something more aggressive arbitrarily and get a better result. Would it not be more principled to have a single 'non-edge aggressiveness parameter' and then make a uniform prior (hyperprior?) for it over some reasonable range, perhaps that f ratio you mention? Say, uniformly between [0.01, 0.2]. Is that doable with graph-tool?

Thanks for your help, as always

On Wed, Apr 8, 2020 at 4:44 PM Tiago de Paula Peixoto <tiago@skewed.de> wrote:
Am 08.04.20 um 00:23 schrieb Deklan Webster:
> Thanks for the quick reply
>
>> Note that link prediction itself is a valid criterion for model
> selection, so you might just want to focus on that.
>
> Not sure how to interpret this. Are you saying that even if the
> non-Poisson model has lower entropy, the Poisson model may still be
> 'better' for community detection if it's better at link prediction
> (which you suggest is most likely the case)?

Yes, it may have better *evidence* (which corresponds to the sum over
many partitions), i.e. despite not yielding the best description length
(which is the property of a single partition).

Please read section IX of this book chapter:

https://arxiv.org/abs/1705.10225

>> You should use logsumexp():
> Wouldn't that be non-incremental? Wouldn't the incremental version of
> this be summing the exp(a) of the results at each step, and then taking
> log at the end?

That is exactly what this does, but in a numerically stable manner.

> I'm assuming that `u = s.collect_marginal(u)` is taking care of counting
> the non-edge appearances over every sample (and not just the last one).
> I assume if u.edge(x, y) is None that means the non-edge was never
> observed, correct?

Yes.

> Well, I ran it just now on a smaller graph to test. The graph is
> directed with 475 nodes and 20,911 edges. Intuitively I would expect a
> reasonable number of non-edges to be observed given that there are 21k
> edges... maybe 500~ at least. Running that code above I see that
> `non_edge_found_c` is 13. Only 13 non-edges are observed with non-zero
> probability? The largest of those 13 probabilities is 0.07. And,
> edge_not_found_c is 2. How should I interpret this situation where the
> edge is in the original but not in (any of?) the marginals?
>
> Am I doing something wrong here? Do I need to adjust n, x, n_default,
> x_default?

If you actually know the number of edges removed, you can supply this to
the algorithm via fn_params which controls the prior for the missing
edge probabilities, by setting alpha and beta. If you have removed a
fraction f of the edges, you need to choose alpha/(alpha+beta) = f, and
the higher the values of both alpha and beta are, the stronger will be
the role of the prior (which is a beta distribution).

If you don't supply anything, a flat prior will be used, which means the
inference will be guided by the network structure alone.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
_______________________________________________
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool