I use graph_tool.topology.shortest_distance for an all-pairs shortest path calculation, which is the main run-time footprint of my algorithm and way too large.
How would you speed it up / tackle this?
I tried to sub-sample with manual (source, target) pairs, but that's terribly inefficient since it does not use the internal bookkeeping.
It would be nice to have graph_tool.topology.shortest_distance(G, U, V), where U and V are lists of the same length with sources / targets, but that's not implemented.
> I use graph_tool.topology.shortest_distance for an all-pairs shortest
> path calculation, which is the main run-time footprint of my algorithm
> and way too large.
> How would you speed it up / tackle this?
If I knew of a general faster way to solve the all-pairs shortest path
problem, I would have implemented it.
> I tried to sub-sample with manual (source, target) pairs, but that's
> terribly inefficient since it does not use the internal bookkeeping.
> It would be nice to have graph_tool.topology.shortest_distance(G, U, V),
> where U and V are lists of the same length with sources / targets, but
> that's not implemented.
It is possible to pass a single source but multiple targets.
Subsampling is usually a good technique to reduce the computation time,
but it is hard to know what is applicable to you.
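
To illustrate the single-source, multiple-targets idea: a list of (source, target) pairs can be grouped by source, so that each distinct source needs only one search. The following is a minimal pure-Python sketch, not graph-tool code. It assumes an unweighted graph stored as an adjacency dict, and the names bfs_distances and pair_distances are made up for this example. With graph-tool, the per-source search would presumably be replaced by the single-source, multiple-targets call mentioned above.

```python
from collections import defaultdict, deque


def bfs_distances(adj, source):
    """Unweighted single-source shortest distances via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pair_distances(adj, sources, targets):
    """Distances for the pairs (sources[i], targets[i]).

    Hypothetical helper: runs one search per *distinct* source and reads
    off every requested target from that single result.
    """
    by_source = defaultdict(list)
    for i, (u, v) in enumerate(zip(sources, targets)):
        by_source[u].append((i, v))
    out = [None] * len(sources)
    for u, wanted in by_source.items():
        dist = bfs_distances(adj, u)
        for i, v in wanted:
            out[i] = dist.get(v)  # stays None if v is unreachable from u
    return out


# Toy path graph 0 - 1 - 2 - 3 (assumed example data)
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
result = pair_distances(adj, [0, 0, 3], [2, 3, 1])
print(result)
```

The grouping only pays off when sources repeat among the pairs. With all-distinct random sources it degenerates to one search per pair.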
On my test graph with 4201 nodes and 9683 edges, I already ran a quick test with both:
I need ~70 random (source, target) node pairs to approximate the real mean of the all-pairs shortest path with an error of about 1%.
This takes about 40% of the run time of the all-pairs shortest path.
Using the single-source, all-targets shortest path, I need to sample at least 6 random sources to approximate the real mean of the all-pairs shortest path with an error of about 1%.
This takes about 45-50% of the run time of the all-pairs shortest path.
So, although the single-source, all-targets shortest path is way more efficient in its computation than calling shortest_path manually for each pair, it is apparently not a good candidate for subsampling and effectively does worse.
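
For anyone who wants to repeat this kind of comparison, here is a minimal pure-Python sketch of the two sampling schemes. It is only an illustration under stated assumptions, not the code used for the numbers above: the graph is unweighted and stored as an adjacency dict, the toy graph is a ring rather than the test graph, and the names bfs_distances, mean_from_sources and mean_from_pairs are made up for this example.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Unweighted single-source shortest distances via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_from_sources(adj, k, rng):
    """Mean distance estimated from k random sources to all reachable vertices."""
    total = count = 0
    for s in rng.sample(sorted(adj), k):
        dist = bfs_distances(adj, s)
        total += sum(dist.values())
        count += len(dist) - 1  # exclude the source itself
    return total / count


def mean_from_pairs(adj, k, rng):
    """Mean distance estimated from k random (source, target) pairs.

    Assumes the graph has reachable pairs; unreachable ones are skipped.
    Each pair costs a full search here, without stopping at the target.
    """
    nodes = sorted(adj)
    total = count = 0
    while count < k:
        s, t = rng.sample(nodes, 2)
        d = bfs_distances(adj, s).get(t)
        if d is not None:
            total += d
            count += 1
    return total / count


# Toy ring graph with 20 vertices (assumed example data)
n = 20
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
rng = random.Random(0)
est_sources = mean_from_sources(ring, 3, rng)
est_pairs = mean_from_pairs(ring, 50, rng)
print(est_sources, est_pairs)
```

On a ring every vertex sees the same distance profile, so the source-based estimate is exact for any number of sources. On a real graph the sources differ, and the distances obtained from one source are not independent samples. That may be part of why a handful of sources does not beat a larger number of independent pairs.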