Generating a "sequence" of graphs

Hi all,

Relative graph-tool newbie here. I have a situation which I'm not sure how
to approach in an efficient manner. I was hoping maybe someone could give
some insight here.

Basically, I want to generate an undirected Price network of N nodes, but I
would like to be able to "look back" and access the subgraph when the
network was only "k" nodes large. For simplicity's, say I'm adding one edge
per vertex (i.e. m=1).

One obvious way to do this seems to be to write a generator where, after
adding the k-th vertex, you append a copy of the graph at that time onto
some list. Then at the end you would have a list of length N where each
element is a graph object.

However... this seems kind of like overkill. I was thinking another way to
do this is to generate only one graph, but give each node and edge an "age"
attribute (as in the example in the tutorial). Then you could get the
time-k subgraph by selecting only nodes and edges that have age less than k.

Is there an obvious way to implement this in graph-tool? Specifically, for
each time-k subgraph I'd want to be able to access the degree distribution
at that time, and also the usual properties of the graph such as neighbors
of a given node, etc.

Many thanks,

attachment.html (1.43 KB)

Yes, the "age" is simply the index of the node/edge, since they are
added sequentially. The first vertex added has index 0, the second has
index 1, and so on. To extract the graph at a given age you just
construct a GraphView, i.e.

    u = GraphView(g, vfilt=lambda v: int(v) < 1000)

will give you the graph at time step 1000.


Oh my goodness, that is such an easy solution. Thanks for pointing this out
Tiago. Good design at work! Sorry for such a trivial question.

Hi all, this is a follow-up question to the one posed above. So, ultimately
what I'd like to do in the graph sequence idea above is this:

Assume we are generating an undirected Price network with m=1, so generating
a tree. For each time i in the sequence, I would like to identify the node
which was newly added (which ought to simply be the one with index i, or
maybe i-1) and then extract the degree of the node in the *previous* graph
which the new node connected to. Using the graph idea of Tiago, the code
I've written looks like:

This code seems to work fine, the only problem is that it takes forever.
Fortunately the memory taken up isn't very large as per Tiago's help, but
even when I try to run this on an N=2000 node graph it takes over 30 seconds
(I'm running Ubuntu with an i7 chip and 4g of memory). I would eventually
want to run something like this for graphs of 100,000 nodes or larger.

My questions are: 1) Where is the performance bottleneck coming from? and 2)
Is there an easy way to get around it? Thanks for all your help.

Many thanks,

Your code is inefficient in many ways. In order to understand why, you
have to ask yourself for all statements, how many operations are
necessary for it to be completed. For instance, in your loop statement:

    while i < sum(1 for _ in G.vertices()):

The sum simply counts the number of vertices, but takes O(N) time. You
should simply use G.num_vertices() which is O(1).

When you construct the filter,

    Gcurr = gt.GraphView(G,vfilt=lambda v: int(v) <= i)

The function provided to vfilt is called for each vertex, which is again
O(N). You can improve it by using array expressions:

    mask = G.vertex_index.copy("int")
    mask.a = mask.a <= i
    Gcurr = gt.GraphView(G,vfilt=mask)

This is still O(N), but avoids that a python function gets called for
each vertex.

However, it is possible to remove the O(N) by not depending on the
GraphView altogether. When you do the loop over the neighbours, you can
simply ignore newer vertices as you go along:

    v = G.vertex(i)
    for w in v.out_neighbours():
        if int(w) < i:
        k = len([u for u in w.out_neighbours() if int(u) < i])

Which should do as you want, and is much faster.