efficient random sampling of paths between two nodes

Franco_Peschiera1 · March 23, 2020, 9:14pm

Hello Tiago,

First of all, thanks for your time.

I see what you mean by having a biased logic that would prefer shorter
paths to longer ones, I had not thought about that.

Regarding the self-reference part, I think it would not be a problem
because of the structure of my particular (directed) graph. In fact, each
node represents an assignment *at some given time period* and the outward
neighbors of a node represent assignments *in the future*. In this way, a
path can never visit a previously visited node since there are no possible
cycles. In fact I can easily calculate the shortest and longest possible
path between two nodes (shortest: using graphql's `shortest_distance`
method, longest= number of periods in between the two nodes).

So the paths I want to create (or sample) are just the different ways one
can go from a node N1 (in period P1) to node N2 (in period P2 > P1).
I think that in my graph I could just sample neighbors with a weight that
depends on how far they are (in number of periods) from the node: the
farthest neighbor will have the least probability of being chosen. This
way, I'd compensate the fact that shorter paths take less hops.

What do you think?

regards,

Franco

attachment.html (3.75 KB)

tiago · March 23, 2020, 9:37pm

Hello Tiago,

First of all, thanks for your time.

I see what you mean by having a biased logic that would prefer shorter
paths to longer ones, I had not thought about that.

Regarding the self-reference part, I think it would not be a problem
because of the structure of my particular (directed) graph. In fact,
each node represents an assignment *at some given time period* and the
outward neighbors of a node represent assignments *in the future*. In
this way, a path can never visit a previously visited node since there
are no possible cycles. In fact I can easily calculate the shortest and
longest possible path between two nodes (shortest: using graphql's
`shortest_distance` method, longest= number of periods in between the
two nodes).

Well, for DAG (directed acyclic graphs) the situation is quite
different, you should have said so in the beginning.

So the paths I want to create (or sample) are just the different ways
one can go from a node N1 (in period P1) to node N2 (in period P2 > P1).
I think that in my graph I could just sample neighbors with a weight
that depends on how far they are (in number of periods) from the node:
the farthest neighbor will have the least probability of being chosen.
This way, I'd compensate the fact that shorter paths take less hops.

What do you think?

Why do I get the impression I'm using google more than you to answer
your question?

Here is an approach using rejection sampling:

Another approach is to count the number of paths that go through each
node (this is feasible for DAGs) and use this to sample directly, see:

https://pdfs.semanticscholar.org/0d74/e82c41124f83c842d5432abcb914ed1f410f.pdf

Best,
Tiago