How does graph-tool deal with vertices having a degree of 1 (or 0) when

calculating the local clustering coefficient

graph_tool.clustering.local_clustering? Given the definition used, this value

would be undefined for k<=1.

They are assumed to be zero.
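
To make the convention concrete, here is a minimal pure-Python sketch, assuming the usual directed definition c = (edges among out-neighbours) / (k(k-1)), with k the out-degree. The function name `local_clustering` and the dictionary `adj` are made up for illustration; this is not graph-tool's implementation.

```python
from itertools import permutations


def local_clustering(out_neighbours, v):
    """Directed local clustering of v, based on its out-neighbours.

    out_neighbours maps each vertex to the set of vertices it points to.
    Returns 0.0 when v has fewer than two out-neighbours, where the ratio
    would otherwise be 0/0 (the convention stated in the reply above).
    """
    nbrs = out_neighbours.get(v, set()) - {v}
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count directed edges between ordered pairs of neighbours.
    links = sum(1 for a, b in permutations(nbrs, 2)
                if b in out_neighbours.get(a, set()))
    return links / (k * (k - 1))


# Hypothetical toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 3 -> 0
adj = {0: {1, 2}, 1: {2}, 2: set(), 3: {0}}
c0 = local_clustering(adj, 0)  # two out-neighbours, one edge among them
c2 = local_clustering(adj, 2)  # out-degree 0
c3 = local_clustering(adj, 3)  # out-degree 1
```

With this convention, a vertex of out-degree 0 or 1 can only ever get the value 0.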

I have computed the local clustering coefficient for my network and then

created a histogram as follows:

import graph_tool.all as gt

g = gt.load_graph('graph_no_multi.gt')

# "The clustering coefficient is normalized only for _simple_ graphs, with at
# most one edge between nodes."
gt.remove_parallel_edges(g)

# create new property map
clustering = g.new_vertex_property("float")

# calculate clustering coefficient
gt.local_clustering(g, prop=clustering, undirected=False)

# make property map internal
g.vp.clust = clustering

# initialise dictionary containing list of clustering coefficients for given
# degree
clust_k_hist = {}

for v in g.vertices():
    k = v.out_degree()
    c = g.vp.clust[v]
    if k in clust_k_hist:
        clust_k_hist[k].append(c)
    else:
        clust_k_hist[k] = [c]

If I now however type `print max(clust_k_hist[0])` I get an answer of `2`

which surprises me slightly (similarly for `k=1` I get a value of `2`).

Firstly I wasn't expecting a clustering coefficient greater than `1` but

also for a degree of `0` I would expect to find only clustering coefficients

of `0`. The documentation states that the outdegree is being used in

calculating the local clustering coefficient so I am using the outdegree for

compiling the histogram.

Have I gone wrong somewhere?

Can you simply print the values given by the following?

cluster = gt.local_clustering(g)

loc_cluster = [c for c in cluster]

print(loc_cluster)

In my opinion, this is the correct way to calculate local clustering

coefficients.

Regards

Snehal

Isn't that more or less equivalent to what I am doing other than that it

calculates the undirected clustering coefficient?

Is your graph really directed? I.e. what does g.is_directed() say?

Yup, g.is_directed() returns True.

I can't reproduce the problem. Please post an example of a graph that shows

the strange behavior.

I will try my best. Are you aware of an easy way of extracting a relevant

section of the graph? (The whole thing has millions of vertices, so it is

probably too large to post here.)

The clustering coefficient only depends on the local neighborhood of a node.

So you can just extract a subgraph that contains the node in question and

its neighbors.
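
As a rough sketch of that idea in plain Python, working on an edge list rather than a graph-tool object: the helper `ego_edges` and the sample edges are hypothetical. In graph-tool itself, the same selection could presumably be expressed with a boolean vertex property passed as `vfilt` to `gt.GraphView`.

```python
def ego_edges(edges, center):
    """Return the edges induced on `center` and all of its neighbours.

    `edges` is an iterable of (source, target) pairs of a directed graph.
    Neighbours are collected in both directions (in and out).
    """
    edges = list(edges)
    keep = {center}
    for s, t in edges:
        if s == center:
            keep.add(t)
        elif t == center:
            keep.add(s)
    # Keep only edges whose two endpoints are both in the neighbourhood.
    return [(s, t) for s, t in edges if s in keep and t in keep]


# Hypothetical example: vertex 0 has in-neighbours 1, 2 and 3.
example_edges = [(1, 0), (2, 0), (3, 0), (1, 2), (2, 1), (3, 4), (4, 5)]
sub = ego_edges(example_edges, 0)
```

Here `sub` keeps the five edges among vertices 0 to 3 and drops the two edges that touch vertices 4 and 5.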

OK, will try and find how to do that and get back to you.

Attached is a graph which exhibits the behaviour. graph2.gt

<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026784/graph2.gt>

Here is a screenshot of what graph-tool returns for me:

<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026784/Capture.png>

And here is a visualisation of the graph:

<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026784/clust.png>

Here is a more helpful visualisation with the vertices labelled:

<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026785/clust.png>

So the issue lies with vertex 0 which has an outdegree of 0 and an indegree

of 3 but a clustering coefficient of 2.

Hmm. I could reproduce this. Looks like gt somehow ignores that out_degree

is 0 and is counting it to be 3 (which is actually in-degree). Then the

maximum number of links among neighbours would be at most 3C2 = 3 but

actually 6 are present. So it simply takes the ratio 6/3 = 2. Tiago would

be able to say more about this though.

Best

Snehal
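
The arithmetic in this message can be replayed on a small made-up graph. The edge set below is an assumption chosen to match the numbers quoted (in-degree 3, out-degree 0, 6 directed links among the neighbours); it is not the attached graph2.gt, and the code only mirrors the explanation above, not graph-tool's internals.

```python
# Hypothetical directed graph: vertices 1, 2, 3 all point to vertex 0
# and to each other in both directions.
edges = {(1, 0), (2, 0), (3, 0),
         (1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)}

out_nbrs = {t for s, t in edges if s == 0}  # empty set: out-degree 0
in_nbrs = {s for s, t in edges if t == 0}   # {1, 2, 3}: in-degree 3

# Directed links whose endpoints are both in-neighbours of vertex 0.
links = sum(1 for s, t in edges if s in in_nbrs and t in in_nbrs)

k = len(in_nbrs)
pairs = k * (k - 1) // 2      # 3C2 = 3 unordered pairs

suspected_value = links / pairs  # 6 / 3, the value described above

# With the out-degree (0) and the zero convention for k < 2, the
# coefficient of vertex 0 would instead be 0.
expected_value = 0.0 if len(out_nbrs) < 2 else None
```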

This is indeed a bug. It has been fixed in git.

Best,

Tiago

Great; thank you very much! Do you have any idea when it might make its way

into the apt-get repository? I am running Ubuntu 14.04, and I recall that

building the package myself was a major pain due to Boost.

Hi Tiago,

Could I just briefly follow up on this? I realise this may not be a priority

for you (which is fine) but if you have a rough estimate of when it will make its

way into the stable release (days, weeks, months) that would greatly help me

to decide whether to embark on spending a lot of time on trying to compile

the git version or whether it is just a matter of waiting a few more

days.

Best,

Philipp

It is difficult to make a prediction, so I would rather not make one. If you

need to use it ASAP, you are better off compiling it yourself.