Counting the number of leaves at different times

Hi all,

I have a temporal graph G carrying a vertex property map 'time' which is an
integer representing the order in which vertices joined the graph. For each
integer time from 1 until n (n = the final size of the graph) I want to
calculate the number of leaves which exist in the graph up until that time.

The problem is that at different point of times in the graph, vertices which
were previously leaves can be attached to and thus stop being leaves. So my
thought is:

For each time t (running from 1 until n):

1) create a GraphView g of the existing graph up until time t
2) count the number of leaves in g

To do 1) I would just use the property map 'time'<t. For 2), I would just
sum a list comprehension on the map returned by g.degree_property_map().

Is there a more efficient way to do this in graph-tool? I'm still relatively
new to the intricacies of graph-tool, and I can see this taking a while with
large graphs of a couple million nodes.

What do you call a leaf? A node with degree one?

Tiago Peixoto wrote

What do you call a leaf? A node with degree one?

Yes, by "leaf" I mean a node with total degree = 1.

-Jimmy

Ni! Hi gogurt,

Although you don't specify your problem thoroughly, here's a likely
useful observation.

Being a leaf is a local property, it depends only on a vertices' own
connections.

Therefore, if you change the graph, you can easily keep track of how
that change affected the number of leaves.

And so, instead of counting the number of leaves, at each step you can
calculate how your changes affect the number of leaves, and add that to
the previous number.

The difference in efficiency is: Your changes usually affect only few
vertices at a time. Counting the number of leaves runs through all your
vertices each time.

Notice you don't need a GraphView to do this. Just keep a vertex
property with the current degrees as you move through each step,
updating only what gets changed.

Notice also that this points to understanding your problem, not about
graph-tool or any other instrument.

Cheers,

ale
.~ยด

Hi Alex,

Thanks for your input. I see what you mean, but my issue is that I'm viewing
the graph from the end *after it's been created.*

I'm not performing this calculation as the graph is being created. I'm being
handed a graph with the vertex creation times, so I need to step back
through the history of the graph to recover which vertices connected to
which vertices anyway. Won't that essentially also require looking at
GraphViews?

I'm not performing this calculation as the graph is being created. I'm being
handed a graph with the vertex creation times, so I need to step back
through the history of the graph to recover which vertices connected to
which vertices anyway. Won't that essentially also require looking at
GraphViews?

Just by inspecting the time labels at each of the incident edges of a node,
it is trivial to say at which time it became a "leaf" (if ever) and when it
stopped being one, simply by iterating over them a single time. If the graph
just grows (i.e. edges are not deleted) a node is a "leaf" if its first edge
has no other edge with the same time label, and stops being one by the time
the second edge arrives. Just iterating through the nodes and determining
this should be much faster (i.e. O(E)) than the original algorithm you
proposed using GraphViews, which would be O(N^2).

To quote Alexandre:

Notice also that this points to understanding your problem, not about
graph-tool or any other instrument.

Indeed, when you ask

    "Is there a more efficient way to do this in graph-tool?"

it usually amounts basically to

    "Is there a more efficient way to do this?"

which is something you can answer by looking at the problem more closely,
without requiring deep knowledge about any details of the library.

Right. Thanks. I appreciate the solution about iterating over the edges. I
hadn't thought of that because I had forgotten that one could easily extract
the neighbors of a vertex.

I didn't mean to imply that I was just looking for some solution using the
details of the package. I see your point about thinking about the problem,
but also I think that sometimes the two approaches go hand-in-hand. If you
don't really know what the package is capable of, then it obviously limits
your thinking about the problem.

If these sorts of questions don't belong on the mailing list, then I
apologize for wasting your time and can take my questions somewhere else.

It was not meant as a reprimand, I was just making a point. Feel free to
keep posting if you think we can help (and the questions are related to
graph-tool).