Hello,

I tried graph-tool after some experience with networkx, in search for

better efficiency (attracted by [1]) ... but unfortunately, it seems

like it all depends on how data is stored!

This is true. If your code essentially depends on python loops,

graph-tool will not provide any advantage.

The (several) networks I play with are dense weighted networks of 1435

nodes. They are defined in terms of their adjacency matrices, which are

pandas dataframes stored in a HDF5 store.

In order to get the relative network with networkx, I would do

dg = nx.from_numpy_matrix(hstore['my_df'].as_matrix(),

create_using=nx.DiGraph())

which took around 18 seconds. In graph-tool, the best alternative I

found, based on [2] and [3], is:

# Since the bottleneck is given by the calls to g.add_edge(),

# numba actually slows down things a bit.

#from numba import autojit

#

#@autojit

def graph_from_adj_matrix(m):

g = Graph()

n = m.shape[0]

edges = list(g.add_vertex(n=n))

c = Counter(every=10, until=n)

for i in xrange(n):

for j in xrange(n):

if m[i,j]:

g.add_edge(edges[i], edges[j])

c.count()

return g

g = graph_from_adj_matrix(hstore['my_df'].as_matrix())

... which takes around 200 seconds (and is still not even weighted!).

Is there anything trivial (or simply new) I'm missing (apart from the

possibility, mentioned in [3], of writing my code in C++)?

The ability of inserting many edges without calling add_edge() several

times is indeed lacking...

Because of this, I have now just added a Graph.add_edge_list() to the

git version, which takes a list of edges to be added, which can be a

numpy array. If you have a full adjacency matrix instead of an edge

list, you can do simply:

g.add_edge_list(transpose(nonzero(a)))

This should be much faster than the Python loop above.

Best,

Tiago