Simple node access runtime

I am currently implementing an existing networkx implementation in graph-tool
and observed that the new implementation runs slower than the networkx one.
I then found out that a simple node access is already slower. So I have the
exact same undirected graph with 1000 nodes and avg. degree of 2 in nx and
gt, and then run:

def select(g):
    for i in range(1000):
        n = g.vertex(0)

def select_nx(g):
    for i in range(1000):
        n = g.nodes(0)

%time select(gt)
CPU times: user 75.6 ms, sys: 69 µs, total: 75.6 ms
Wall time: 72.6 ms

%time select_nx(g)
CPU times: user 3.82 ms, sys: 0 ns, total: 3.82 ms
Wall time: 3.46 ms

So I am wondering whether this is to be expected and that I can only see the
drastic runtime improvements when working with calculations like shortest
paths etc.

Graph-tool achieves its performance by off-loading central loops and
data structures to C++. If your main loops are in Python, it offers no
advantage whatsoever.

The same applies for things like ndarray in Numpy, etc. The approach to
achieve good performance in Python is to avoid as many loops as
possible.

In fact, as you see above, the fact that it depends on foreign data
structures may even cause some overhead.

(In graph-tool, when you call Graph.vertex() a new object is generated,
unlike in networkx where it just accesses a list. This can be trivially
optimized in graph-tool by either working with 'ints' most of the time,
or caching the vertex objects, i.e. vertices = list(g.vertices()); v =
vertices[i])

Best,
Tiago

Thanks Tiago, that totally makes sense.

I have another question. I am currently running some benchmarks for
comparing two graph-tool installations: 2.2.44 vs. 2.16

While in general 2.16 is much faster on most operations, it is slower on
some standard vertex access operations. For example:

def test():
     g = gt.price_network(10000, 10, directed=False)
     for i in range(10000):
         rnd = g.vertex(randint(0, g.num_vertices()))
%time test()

2.22:
CPU times: user 189 ms, sys: 7.15 ms, total: 197 ms
Wall time: 196 ms

2.16
CPU times: user 841 ms, sys: 776 µs, total: 841 ms
Wall time: 838 ms

Do you have an idea why this might be the case?

Thanks, Philipp

attachment.html (3.31 KB)

This looks like a performance regression. Please just open an issue in
the website, so I can take care of it.

Best,
Tiago