Local Clustering Coefficient for k<=1

How does graph-tool deal with vertices having a degree of 1 (or 0) when
calculating the local clustering coefficient with
graph_tool.clustering.local_clustering? Given the definition used, this value
would be undefined for k<=1.

They are assumed to be zero.
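
For reference, the directed local clustering coefficient is usually written as below. This is a sketch of the standard form, which as far as I can tell matches the definition in the graph-tool documentation; the symbols N_i (the set of out-neighbours of vertex i) and k_i (its out-degree) are notation introduced here rather than quoted from the thread.

```latex
c_i = \frac{\left|\{\, e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E \,\}\right|}{k_i\,(k_i - 1)},
\qquad k_i = |N_i|
```

For k_i <= 1 the denominator is zero, which is why the value is undefined there and a convention (here, zero) is needed.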

I have computed the local clustering coefficient for my network and then
created a histogram as follows:

import graph_tool.all as gt

g = gt.load_graph('graph_no_multi.gt')

# "The clustering coefficient is normalized only for _simple_ graphs, with at
# most one edge between nodes."
gt.remove_parallel_edges(g)

# Create a new property map
clustering = g.new_vertex_property("float")

# Calculate the clustering coefficient
gt.local_clustering(g, prop=clustering, undirected=False)

# Make the property map internal
g.vp.clust = clustering

# Initialise a dictionary mapping each out-degree to the list of clustering
# coefficients of the vertices with that degree
clust_k_hist = {}
for v in g.vertices():
    k = v.out_degree()
    c = g.vp.clust[v]
    if k in clust_k_hist:
        clust_k_hist[k].append(c)
    else:
        clust_k_hist[k] = [c]

However, if I now type `print max(clust_k_hist[0])` I get an answer of `2`,
which surprises me slightly (similarly, for `k=1` I get a value of `2`).
Firstly, I wasn't expecting a clustering coefficient greater than `1`, and
secondly, for a degree of `0` I would expect to find only clustering
coefficients of `0`. The documentation states that the out-degree is used in
calculating the local clustering coefficient, so I am using the out-degree for
compiling the histogram.

Have I gone wrong somewhere?

Can you simply print the values given by the following?

cluster = gt.local_clustering(g)
loc_cluster = [c for c in cluster]
print(loc_cluster)

In my opinion, this is the correct way to calculate local clustering
coefficients.

Regards
Snehal

Isn't that more or less equivalent to what I am doing, other than that it
calculates the undirected clustering coefficient?

Is your graph really directed? I.e. what does g.is_directed() say?

Yup, g.is_directed() returns True.

I can't reproduce the problem. Please post an example of a graph that shows
the strange behavior.

I will try my best. Are you aware of an easy way of extracting a relevant
section of the graph? (The whole thing has millions of vertices, so it is
probably too large to post here.)

The clustering coefficient only depends on the local neighborhood of a node.
So you can just extract a subgraph that contains the node in question and
its neighbors.
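
As a rough, library-agnostic illustration of that idea, the sketch below keeps only the edges between a chosen vertex and its in- and out-neighbours. The function name `neighbourhood_edges` and the sample edge list are hypothetical and not taken from the graph in this thread; within graph-tool itself, a vertex filter on a `GraphView` would probably be the more natural route.

```python
def neighbourhood_edges(edges, v):
    """Return the edges of the subgraph induced by v and all of its
    in- and out-neighbours (edges are (source, target) pairs)."""
    keep = {v}
    for s, t in edges:
        if s == v:
            keep.add(t)   # out-neighbour of v
        elif t == v:
            keep.add(s)   # in-neighbour of v
    # Keep only the edges whose two endpoints both lie in the neighbourhood
    return [(s, t) for s, t in edges if s in keep and t in keep]


# Hypothetical example data: vertex 0 has in-neighbours 1, 2 and 3
edges = [(1, 0), (2, 0), (3, 0), (1, 2), (2, 3), (3, 4), (4, 5)]
sub = neighbourhood_edges(edges, 0)
print(sub)
```

Edges that reach beyond the immediate neighbourhood, such as (3, 4) and (4, 5) in the sample data, are dropped, which keeps the extracted piece small enough to share.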

OK, will try and find how to do that and get back to you.

Attached is a graph which exhibits the behaviour. graph2.gt
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026784/graph2.gt>

Here is a screenshot of what graph-tool returns for me:
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026784/Capture.png>

And here is a visualisation of the graph:
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026784/clust.png>

Here is a more helpful visualisation with the vertices labelled:
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/file/n4026785/clust.png>

So the issue lies with vertex 0, which has an out-degree of 0 and an in-degree
of 3 but a clustering coefficient of 2.

Hmm. I could reproduce this. It looks like gt somehow ignores that the
out-degree is 0 and instead counts it as 3 (which is actually the in-degree).
The maximum number of links among the neighbours would then be 3C2 = 3, but
6 are actually present. So it simply takes the ratio 6/3 = 2. Tiago would
be able to say more about this though.

Best
Snehal
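
The arithmetic in that explanation can be checked with a short sketch. The edge list below is an assumption: it is a small graph chosen to be consistent with the description (vertex 0 with out-degree 0 and in-degree 3, and 6 directed links among its three in-neighbours), not the contents of graph2.gt. The code only reproduces the suspected 6/3 calculation and is not a description of graph-tool's internals.

```python
from itertools import permutations

# Hypothetical edge list: 1, 2 and 3 all point to 0, and every ordered
# pair among 1, 2 and 3 is connected (6 directed links)
edges = {(1, 0), (2, 0), (3, 0)} | set(permutations([1, 2, 3], 2))

v = 0
out_nbrs = {t for s, t in edges if s == v}  # empty: out-degree is 0
in_nbrs = {s for s, t in edges if t == v}   # {1, 2, 3}: in-degree is 3


def links_among(nodes):
    """Count directed edges whose endpoints both lie in `nodes`."""
    return sum(1 for s, t in edges if s in nodes and t in nodes)


# Suspected faulty calculation: in-neighbours used, undirected normalisation
k = len(in_nbrs)
links = links_among(in_nbrs)
buggy = links / (k * (k - 1) / 2)

# Expected value: out-neighbours, directed normalisation, zero for k <= 1
k_out = len(out_nbrs)
expected = 0.0 if k_out <= 1 else links_among(out_nbrs) / (k_out * (k_out - 1))

print(buggy, expected)
```

With the out-degree convention, vertex 0 has no out-neighbours and should get a coefficient of 0, whereas dividing the 6 links by 3C2 = 3 gives the reported value of 2.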

This is indeed a bug. It has been fixed in git.

Best,
Tiago

Great; thank you very much! Do you have any idea when it might make its way
into the apt-get repository? I am running Ubuntu 14.04, and I recall that
building the package myself was a major pain due to Boost.

Hi Tiago,

Could I just briefly follow up on this? I realise this may not be a priority
for you (which is fine), but if you have a rough estimate of when it might
make its way into the stable release (days, weeks, months), that would greatly
help me decide whether to spend a lot of time trying to compile the git
version or whether it is just a matter of waiting a few more days.

Best,

Philipp

It is difficult to make a prediction, so I would rather not make one. If you
need to use it ASAP, you are better off compiling it yourself.