efficient random sampling of paths between two nodes

Franco_Peschiera1 · April 15, 2020, 8:42am

Hello again Tiago,

First of all thanks for the resources and help. As you may expect, I
somewhat lack a specialized background in graph theory and so I sometimes
miss the correct terminology.

As a reminder, this is a DAG. I haven’t, yet, implemented the random
unbiased generator version of all_paths you kindly shared in those links,
although I have it listed as a possible future step.

As a temporary workaround, I’ve been doing the following each time I want
to randomly sample paths between two nodes:

   1. I calculate the min and max length of paths between the two nodes
   (the min using topology.shortest_distance, a good aproximation of the max
   is trivial).
   2. I sample a number between those min and max`.
   3. I execute topology.all_paths with that number as cutoff argument to
   obtain the paths generator.
   4. I then execute some sampling from that generator, with an iteration
   limit.

This is, of course, just a very crude and biased way of sampling, but at
least it returns a different set of paths at each time it is executed.
Until now, I’ve been assuming each edge has a weight of 1.

I now would like to test giving edges a weight in order for that cutoff
argument to use it. I know the shortest_distance function accepts weights
of edges in order to do Dijkstra. Could the same be done with all_paths
such that the search is stopped if an accumulated weighted-distance is
reached? Is there any (alternative) way of controlling which paths I get
from the all_paths besides that cutoff argument? Or whatever specialized
logic regarding path sampling / filtering I would have to implemented
myself (just like the examples you shared)? Would this be something you
would consider adding to graph-tool?

For example, I’ve been even wondering if I could just create a temporal
view from the graph, by a randomly filtering edges or nodes before samplinh
the paths, so as to, again, reduce the graph I will be sampling at each
iteration.

as always thanks for your time and help!

Franco Peschiera

Message: 2

attachment.html (17.2 KB)

tiago · April 15, 2020, 8:59am

I now would like to test giving edges a weight in order for that
>cutoff> argument to use it. I know the |shortest_distance| function
accepts weights of edges in order to do Dijkstra. Could the same be done
with |all_paths| such that the search is stopped if an accumulated
weighted-distance is reached?

I'm afraid not.

Is there any (alternative) way of
controlling which paths I get from the |all_paths| besides that |cutoff|
argument?

There is no other argument to the function, which is deterministic.

You could randomize the order of the edges in the graph (by removing and
adding them again in a random order), which will change which paths are
shown first, but would not give you a proper unbiased sampling.

Or whatever specialized logic regarding path sampling /
filtering I would have to implemented myself (just like the examples you
shared)? Would this be something you would consider adding to graph-tool?

I think adding an algorithm to perform uniform sampling of paths in DAGs
would be useful, but I can't promise I will find the time anytime soon
to implement one.

Best,
Tiago