I have a relatively large graph (1.6 million vertices, 2.4 million arcs). I extracted the largest strong component, but it is still quite large (~400,000 vertices). The "analysis" algorithms run smoothly on it, but it is so sparsely connected that drawing it essentially produces a huge cloud of points.
So I am trying to extract the second largest strong component instead, which is much smaller. I do the following:
comp, hist = label_components(g)
idx = hist.argsort()  # component labels sorted by size, ascending
idx_second_largest = idx[-2]
idx_vertices_second_largest = find(comp.a == idx_second_largest)
u = GraphView(g, vfilt=idx_vertices_second_largest)
To make things easier to reproduce, I'll adapt the example from the documentation.
>>> g2=random_graph(100, lambda: (1,1))
>>> comp,hist=label_components(g2)
>>> print hist
[11 28 41 17 3]
So my second largest strong component should be the one whose vertices are labelled 1 (28 vertices).
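As a sanity check, that label can be read off the histogram with NumPy alone. This is only a small sketch, assuming `idx` comes from sorting `hist` with `argsort`; the histogram values are copied from the output above, and `idx_largest` is a name introduced here for illustration.

```python
import numpy as np

# Histogram copied from the output above: hist[k] is the size of component k.
hist = np.array([11, 28, 41, 17, 3])

# Component labels ordered by size, smallest first (assumed definition of idx).
idx = hist.argsort()

idx_largest = idx[-1]          # label of the largest component
idx_second_largest = idx[-2]   # label of the second largest component

print(idx_second_largest)
print(hist[idx_second_largest])
```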
>>> print comp.a
[0 1 1 2 3 2 2 0 0 2 1 3 3 0 3 1 4 0 2 2 1 3 2 1 2 2 2 2 2 1 2 1 3 1 2 2 1
2 3 2 2 0 2 2 4 2 3 2 2 3 1 1 2 1 1 2 1 1 2 4 2 1 0 1 2 1 1 3 2 2 2 1 3 3
1 2 1 1 0 0 2 2 2 3 3 3 2 2 2 2 1 2 3 0 1 1 2 1 0 3]
Using the code above finally gives:
>>> print idx_vertices_second_largest
[ 1 2 10 15 20 23 29 31 33 36 50 51 53 54 56 57 61 63 65 66
71 74 76 77 90 94 95 97]
But when I use this as u = GraphView(g2, vfilt=idx_vertices_second_largest), I get an error that ends with
File "/usr/local/lib/python2.7/dist-packages/graph_tool/__init__.py", line 527, in __get_set_f_array
a[:] = v
ValueError: operands could not be broadcast together with shapes (100) (4)
So I seem to be picking up the indices correctly, but after that it fails. Where am I being silly? (Coming from classic C, my first reaction is "what should I typecast to?".)
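For what it's worth, the failing statement `a[:] = v` can be reproduced with plain NumPy. That suggests the filter may need one entry per vertex (a boolean mask of length 100) rather than a list of vertex indices. The sketch below only illustrates that guess: the labels are made up, the names `labels`, `mask`, `indices` and `filt` are mine, and nothing in it calls graph_tool.

```python
import numpy as np

n_vertices = 100

# Stand-in for comp.a: one component label per vertex (made-up labels).
labels = np.arange(n_vertices) % 5
target = 1

# Positions of the matching vertices, similar to what find() returns.
indices = np.nonzero(labels == target)[0]
# Boolean mask with exactly one entry per vertex.
mask = labels == target

# Stand-in for the per-vertex filter array that a[:] = v writes into.
filt = np.zeros(n_vertices, dtype=bool)

try:
    filt[:] = indices   # 20 values cannot fill 100 slots
    index_assignment_ok = True
except ValueError as err:
    index_assignment_ok = False
    print(err)

filt[:] = mask          # shapes match, so this assignment succeeds
print(filt.sum())       # number of vertices kept by the mask
```

If that reading is right, passing the mask itself, i.e. `GraphView(g2, vfilt=comp.a == idx_second_largest)`, might be what is needed, but I have not confirmed that.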
Thanks!
--
J.