Property Maps

Alan_Williams · November 3, 2013, 3:07am

Hi there, sorry for the newb-ish question. I have read the docs for an hour.

I have a graph as a "dot" text file. Nodes are numbered consecutively from
1 to N (example below). I have used the minimize_blockmodel_dl method,
which returns a PropertyMap b. My question is how to save a map from the
dotfile's node ID to community number as a txt file. I have saved the
PropertyMap array b.get_array() using numpy.savetxt and it appears to be a
one-dimensional array.

b, mdl = minimize_blockmodel_dl(g,max_B=100,min_B=200,verbose=True)
array = b.get_array()
np.savetxt('1m_output_blocks_array.txt',array,'%i')

I kind of assumed that the PropertyMap would be returned in order so I
could just lay a 1-N column beside b.get_array() and that would be what I
want. However, I'm not getting results that make sense to me and I wanted
to find out if this assumption is correct or not.

I'd also like to know the proper way of doing this for the general case
where nodes could be arbitrarily labelled in the original dotfile.

Thank you very much
Alan

graph G {
1;
2;
...
1000000;
1 -- 2;
3 -- 5;
...
}


Alan_Williams · November 3, 2013, 3:23am

I've just realized that the input nodes are read as strings from a dotfile,
in which case, if they are sorted, the order may be lexicographic, i.e.
1,10,11,12,13,14,15,16,17,18,19,2,20,21 etc.
Does graph-tool sort the node IDs internally, i.e. can I post-process the
community output with this ordering? I'm quite keen to do that, as the
community assignment took about 3 days to run.

thank you very much
Alan


tiago · November 3, 2013, 10:00am

> Hi there, sorry for the newb-ish question. I have read the docs for an hour.
>
> I have a graph as a "dot" text file. Nodes are numbered consecutively
> from 1 to N (example below). I have used the minimize_blockmodel_dl
> method, which returns a PropertyMap b. My question is how to save a
> map from the dotfile's node ID to community number as a txt file. I
> have saved the PropertyMap array b.get_array() using numpy.savetxt and
> it appears to be a one-dimensional array.
>
> b, mdl = minimize_blockmodel_dl(g,max_B=100,min_B=200,verbose=True)

(Side note: This is not right, you must have max_B >= min_B. I assume it
is a typo.)

> array = b.get_array()
> np.savetxt('1m_output_blocks_array.txt',array,'%i')
>
> I kind of assumed that the PropertyMap would be returned in order so I
> could just lay a 1-N column beside b.get_array() and that would be
> what I want. However, I'm not getting results that make sense to me
> and I wanted to find out if this assumption is correct or not.
>
> I'd also like to know the proper way of doing this for the general
> case where nodes could be arbitrarily labelled in the original
> dotfile.

In the dot format the labels are arbitrary strings, so the ordering
assumed in graph-tool is (as you guessed in your second message)
lexicographic, not according to the line order in the file or the
numeric value of the labels. The names of the nodes are stored as
strings in an internal property map called "vertex_name":

    label = g.vp["vertex_name"]
    for v in g.vertices():
        print(label[v])

This should print something like:

    0
    1
    10
    100
    ...
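
To see what lexicographic ordering does to numeric labels, here is a small
sketch in plain Python (no graph-tool needed). The helper name
lexicographic_labels is made up for illustration, and it assumes the labels
are the decimal strings "1" to "N". The authoritative order is still
whatever "vertex_name" reports for your graph, so treat this only as a way
to see why a plain 1-N column does not line up.

```python
def lexicographic_labels(n):
    """Return the labels "1".."n" in plain string-sort order.

    Hypothetical helper: it only illustrates lexicographic ordering and
    does not query graph-tool.
    """
    return sorted(str(i) for i in range(1, n + 1))


labels = lexicographic_labels(12)
print(labels)

# Sorting the same strings by numeric value restores 1, 2, 3, ...
numeric = sorted(labels, key=int)
print(numeric)
```

With 12 labels, "10", "11" and "12" sort directly after "1" and before "2".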

The property map arrays are in the same order, so using "vertex_name" you
should be able to store the mapping to a file.
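
As a rough sketch of storing that mapping, something like the following
could work. The helper write_block_map and the stand-in data are
hypothetical. With graph-tool, the labels would come from "vertex_name" and
the block values from b.get_array(), as noted in the comments. The sketch
assumes both sequences are in vertex order.

```python
import io


def write_block_map(labels, blocks, out):
    """Write one "label<TAB>block" line per vertex to a text file object.

    Hypothetical helper; labels and blocks must be in the same
    (vertex) order.
    """
    for label, block in zip(labels, blocks):
        out.write(f"{label}\t{int(block)}\n")


# With graph-tool the inputs would be built from the graph itself, e.g.:
#   label = g.vp["vertex_name"]
#   labels = [label[v] for v in g.vertices()]
#   blocks = b.get_array()
# Stand-in data so the sketch runs without graph-tool:
labels = ["1", "10", "100", "2"]
blocks = [0, 3, 3, 1]

buffer = io.StringIO()
write_block_map(labels, blocks, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, pass an open text file, e.g.
`with open("blocks_by_label.txt", "w") as f: write_block_map(labels, blocks, f)`.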

Note that you could also store the block assignments to an internal
property map and save that to a file together with the whole network:

    g.vp["b"] = b
    g.save("foo.dot")

Be aware, however, that the dot format is not the best choice for storing
metadata, since it carries no type information for the stored properties.
They are therefore always read back as strings, and you need to convert
them afterwards:

    g = load_graph("foo.dot")
    b = g.vp["b"]  # this is always a string property map
    b = b.copy(value_type="int")  # converts to int

As I mention in the documentation, a much better choice is to use the
graphml format, which stores the graph and all its internal property maps
perfectly:

    g.save("foo.xml")
    g = load_graph("foo.xml")
    b = g.vp["b"]  # already has int type

I recommend using the dot or gml formats _only_ if you need to exchange
data with another program or library that uses these formats; otherwise
you should stick to graphml.

Cheers,
Tiago