Run gt.topology.shortest_path() with multiple (not all) sources in parallel

Greetings!

For my simulations, I need to process many shortest paths algorithms on the same Directec Acyclic Graph. I found graph-tool’s shortest_path routine with dag=True to be extremely fast for my use case.

I understand that shortest_distance is compiled to run in parallel when no source is specified, but I don’t need all pairs, only a fraction of them; still, I would like to perform the operation in parallel to speed up as much as possible the computation.

Do you think there is an easy way to achieve this within the library? Thanks for you help!

The function shortest_distance() accepts multiple targets for the same source — and if you reverse the graph you have the opposite.

But if you want parallelism, the best option is probably to use multiprocessing.

Thanks for your quick reply!

Problem is I have multiple (source,target) tuples so I think I really need to compute multiple distance maps either way.

I thought about using Python’s multiprocessing, but as I understand it, data cannot be shared through processes so I would need to replicate my DAG in memory once for each core for maximum parallelization and I would incur in RAM issues.

I think the best way would be to compile an extension to graph_tool that computes multiple distances maps in parallel using OpenMP.

Please do correct me if I’m wrong! As an absolute and utter beginner I really appreciate valuable feedback before committing to a particular approach. Thanks again.

With multiprocessing read-only memory sharing is completely possible, and you would not need to replicate the graph across the different processes, as long as you don’t modify it, since the memory model is “copy on write”.