I use graph_tool.topology.shortest_distance for an all-pairs shortest path calculation, which is the main run-time footprint of my algorithm and is way too large.

How would you speed it up / tackle this?

I tried sub-sampling with manually chosen (source, target) pairs, but that is terribly inefficient since it does not reuse the internal bookkeeping.

It would be nice to have graph_tool.topology.shortest_distance(G, U, V), where U and V are equal-length lists of sources and targets, but that is not implemented.

> I use graph_tool.topology.shortest_distance for an all-pairs shortest
> path calculation, which is the main run-time footprint of my algorithm
> and is way too large.
>
> How would you speed it up / tackle this?

If I knew of a general faster way to solve the all-pairs shortest path
problem, I would have implemented it.

> I tried sub-sampling with manually chosen (source, target) pairs, but
> that is terribly inefficient since it does not reuse the internal
> bookkeeping.
>
> It would be nice to have graph_tool.topology.shortest_distance(G, U, V),
> where U and V are equal-length lists of sources and targets, but that
> is not implemented.

It is possible to pass a single source but multiple targets.
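
One way to get closer to the requested (U, V) interface with a single-source, multiple-target search is to group the pairs by source and run one search per distinct source. The following is only a sketch under assumptions: pairwise_distances and bfs_distances are hypothetical helper names, the graph is a plain adjacency dict, and the stand-in search is an unweighted BFS rather than graph-tool's own routine.

```python
from collections import defaultdict, deque


def bfs_distances(adj, source, targets):
    """Stand-in single-source search (unweighted BFS, not graph-tool).

    Returns the distance from `source` to each vertex in `targets`,
    stopping early once every target has been reached.
    """
    dist = {source: 0}
    remaining = set(targets) - {source}
    queue = deque([source])
    while queue and remaining:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                remaining.discard(v)
                queue.append(v)
    # Unreachable targets get an infinite distance.
    return [dist.get(t, float("inf")) for t in targets]


def pairwise_distances(adj, sources, targets, single_source=bfs_distances):
    """Distances for the pairs (sources[i], targets[i]).

    Pairs sharing a source are answered by a single search, so the
    number of searches equals the number of distinct sources.
    """
    if len(sources) != len(targets):
        raise ValueError("sources and targets must have the same length")
    by_source = defaultdict(list)
    for i, (u, v) in enumerate(zip(sources, targets)):
        by_source[u].append((i, v))
    result = [None] * len(sources)
    for u, items in by_source.items():
        dists = single_source(adj, u, [v for _, v in items])
        for (i, _), d in zip(items, dists):
            result[i] = d
    return result


# Small example graph: path 0-1-2-3 plus an extra edge 0-4.
example_adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2], 4: [0]}
example_result = pairwise_distances(example_adj, [0, 0, 3, 4], [3, 4, 1, 2])
```

With graph-tool, the single_source argument could presumably be a thin wrapper around shortest_distance(g, source=u, target=ts); that wrapper is not shown here and its exact return type should be checked against the documentation. The grouping only pays off when many pairs share a source; for fully random pairs it degenerates to one search per pair.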

Subsampling is usually a good technique to reduce the computation time,
but it is hard to know what is applicable to you.

On my test graph with 4201 nodes and 9683 edges, I already ran a quick test of both approaches:

With random (source, target) node pairs, I need about 70 pairs to approximate the true mean of the all-pairs shortest path lengths with an error of about 1 %.

This takes about 40 % of the run-time of the full all-pairs shortest path calculation.

With single-source, all-targets shortest paths, I need to sample at least 6 random sources to approximate the true mean with an error of about 1 %.

This takes about 45-50 % of the run-time of the full all-pairs shortest path calculation.

So, although the single-source, all-targets shortest path is far more efficient in its computation than computing shortest paths manually pair by pair, it is apparently not a good candidate for subsampling and effectively does worse.
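
For reference, the single-source sampling estimate described above can be sketched as follows. This is a minimal illustration, not the code used for the timings: it assumes an unweighted graph stored as an adjacency dict, uses a plain BFS in place of graph-tool, and the function names are hypothetical.

```python
import random
from collections import deque


def bfs_all(adj, source):
    """Unweighted BFS distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_distance_from_sources(adj, sources):
    """Mean shortest path length over all (source, reachable target) pairs."""
    total = 0
    count = 0
    for s in sources:
        for v, d in bfs_all(adj, s).items():
            if v != s:
                total += d
                count += 1
    if count == 0:
        raise ValueError("no reachable (source, target) pairs")
    return total / count


def sampled_mean_distance(adj, k, seed=0):
    """Estimate the mean all-pairs distance from k randomly chosen sources."""
    rng = random.Random(seed)
    sources = rng.sample(sorted(adj), k)
    return mean_distance_from_sources(adj, sources)


# Example: a ring of 10 vertices, where every source sees the same distances.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
exact_mean = mean_distance_from_sources(ring, list(ring))
estimated_mean = sampled_mean_distance(ring, k=3)
```

On the ring the estimate matches the exact value for any choice of sources because the graph is vertex-symmetric; on an irregular graph the estimate varies with the sampled sources, which is what determines how many sources are needed for a given error.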