I am trying to find the average shortest path length of my network. I am
currently doing this by running dist = gt.shortest_distance(g) and then
averaging the results. However, for many of the distances I get the value
"2147483647", which seems impossibly large for my network (I only have 718
vertices and 979 edges in my graph). The same value also crops up in the
example results in the documentation. Does this value have a special
significance? Does it mean there is no path between the given vertices?
Also, seeing as there is no reference to it in the documentation I presume
that this algorithm has not been parallelised?
The value 2147483647 is the largest value a signed 32-bit integer can
hold (2^31 - 1). I think it is used as infinity here (since there is no
greater value), meaning that there is no path between the two vertices.
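If that reading is right, the sentinel has to be excluded before averaging, otherwise every unreachable pair adds about two billion to the sum. Below is a minimal NumPy-only sketch of that idea. It assumes the distances are already available as a square integer array, and the helper name average_finite_distance and the 4-vertex matrix are made up for illustration; they are not part of graph-tool.

```python
import numpy as np

# Sentinel assumed to mean "no path": the largest signed 32-bit integer.
INF = np.iinfo(np.int32).max  # 2147483647


def average_finite_distance(d):
    """Mean of the off-diagonal entries of d that are not the INF sentinel."""
    d = np.asarray(d, dtype=np.int64)
    n = d.shape[0]
    # Ignore the diagonal (distance from a vertex to itself).
    off_diagonal = ~np.eye(n, dtype=bool)
    # Keep only pairs that are actually connected by a path.
    reachable = off_diagonal & (d != INF)
    if not reachable.any():
        return float("nan")
    return d[reachable].mean()


# Hypothetical 4-vertex graph: a path 0-1-2 plus an isolated vertex 3.
d = np.array([
    [0, 1, 2, INF],
    [1, 0, 1, INF],
    [2, 1, 0, INF],
    [INF, INF, INF, 0],
])
print(average_finite_distance(d))
```

Another option, which the documentation example takes, is to restrict the graph to its largest connected component first, so that no unreachable pairs remain.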
You can also see the value in the example in the documentation:
You can get a 2D array from a vector-valued property map with get_2d_array(). The following two computations are equivalent:
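import graph_tool.all as gt
from numpy import arange, mean
# Largest connected component of the undirected political blogs network
g = gt.extract_largest_component(gt.GraphView(gt.collection.data["polblogs"], directed=False), prune=True)
dist = gt.shortest_distance(g)
N = g.num_vertices()
# 1) Average the per-vertex mean distances
print("average distance:", mean([dist[v].a.sum()/(N-1) for v in g.vertices()]))
# 2) Same quantity computed from the full 2D distance array
d = dist.get_2d_array(arange(N))
print("average distance:", d.sum() / (N*(N-1)))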
But note that the first computation is actually faster than obtaining the 2D array and then summing all the elements, despite the Python loop!