On Sun, Feb 10, 2013 at 1:22 AM, Tiago de Paula Peixoto <tiago@skewed.de> wrote:
On 02/09/2013 08:37 PM, tcb wrote:
>
> OK, I think with the strictly correct approach to graphml it might be
> possible to read the attributes (if you also write a schema)- but you are
> right that it is a lot of added complexity for the fairly simple extensions
> you are using (especially perhaps for reading). On the other hand, other
> tools then have to write custom readers, which kind of defeats one of the
> major benefits of graphml- but others seem to have taken this approach too.

Do you know of any current graphml readers which understand schemas as
well? It seems that reader modification would be unavoidable.


You're right- it's definitely much more complexity- I must see how gephi deals with it, but I haven't yet...

I suppose I would be much happier with the complexity of graphml if it could guarantee seamless transfer of graphs between different tools. I would like to be able to work in graph-tool, networkx and occasionally some others (gephi perhaps) without having to maintain a bunch of conversion scripts.
 
> The modifications to read the vector_* types might be easy enough to make on
> the networkx side- we'll see how it goes.

Given the very adequate fromhex() function you found, a trivial reader for
the (double, float, long double) vector properties would be something like:

    >>> prop = '0x1.0000000000000p+1, 0x1.8000000000000p+1'
    >>> vec = [float.fromhex(x) for x in prop.split(",")]
    >>> print(vec)
    [2.0, 3.0]
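
A slightly more defensive version of the same idea could be wrapped in a small helper. This is only a sketch: the name `parse_hex_vector` is hypothetical (it is not part of graph-tool or networkx), and treating an empty value as an empty vector is an assumption.

```python
def parse_hex_vector(text):
    """Parse a comma-separated list of hex floats into a list of floats."""
    # Assumption: an empty or whitespace-only value means an empty vector.
    if not text.strip():
        return []
    # float.fromhex() accepts leading and trailing whitespace,
    # so the items do not need to be stripped individually.
    return [float.fromhex(item) for item in text.split(",")]


vec = parse_hex_vector("0x1.0000000000000p+1, 0x1.8000000000000p+1")
print(vec)
```

A reader on the networkx side would presumably call something like this on each value whose key declares one of the floating-point vector_* types.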

> Alternatively, I suppose I could get by without modifying anything by making
> two 'float' property maps (pos.x and pos.y)- then combine them again on the
> networkx side- but this is a bit cumbersome.

This would be a way to ensure a strictly valid graphml file, and is
feasible for users who require it. Another alternative would be to
encode it as a string, and decode at the other end.

It is important to note however that in graph-tool regular float/double
properties (not vectors) are also stored in hex format. I'm not sure if
this is a violation of the standard, since I don't know if they specify
exactly how a float is represented, but this may cause problems with
interoperability as well (possibly networkx would also choke). I made
this choice because for my own uses, it is very important that no
information is lost during encoding.
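
To illustrate the concern, here is a small sketch comparing a decimal text form with the hex form. The six significant digits used for the decimal case are an assumption about what a typical writer might emit, not a statement about any particular tool.

```python
x = 0.1 + 0.2  # a value that needs many decimal digits to be written exactly

# Decimal text with six significant digits (an assumed writer setting).
decimal_text = "%.6g" % x
# Hexadecimal text, as produced by float.hex().
hex_text = x.hex()

# Read both representations back and compare with the original value.
decimal_roundtrip = float(decimal_text)
hex_roundtrip = float.fromhex(hex_text)

print(decimal_text, decimal_roundtrip == x)
print(hex_text, hex_roundtrip == x)
```

The truncated decimal form comes back as a different float, while the hex form is recovered bit for bit.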


I don't think this will be an issue- in fact it's a very good idea to preserve the exact float representation.
 
> The other thing is that I have no idea how to write a graphml file from
> networkx that graph-tool could understand.

To simply write a vector type, it would suffice to do the following for
a float type:

    >>> prop = ", ".join(float.hex(x) for x in vec)
    >>> print(prop)
    0x1.0000000000000p+1, 0x1.8000000000000p+1

This should be completely readable by graph-tool.
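
As a rough sketch, the writing side can be wrapped in a helper and checked by reading the text back. The name `format_hex_vector` is hypothetical, and converting each item with `float()` first (so that integers are accepted) is an assumption about what a caller might pass in.

```python
def format_hex_vector(values):
    """Format a sequence of numbers as comma-separated hex floats."""
    # float() lets plain integers through; float.hex() gives the exact value.
    return ", ".join(float(v).hex() for v in values)


vec = [2.0, 3.0, 0.1]
prop = format_hex_vector(vec)

# Read the text back the same way a reader would, to check the round trip.
restored = [float.fromhex(x) for x in prop.split(",")]

print(prop)
print(restored == vec)
```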

More work would be required to support the python::object properties, if
so desired, but not much. It is just a base64 encoding of a pickled
object.
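
A minimal sketch of that idea, using only the standard library, might look like the following. The helper names are hypothetical, and the pickle protocol and any extra framing graph-tool uses are not stated here, so the output should not be assumed to match graph-tool's byte for byte.

```python
import base64
import pickle


def encode_object(obj):
    """Pickle an object and return it as base64 text suitable for XML."""
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")


def decode_object(text):
    """Reverse encode_object(): base64-decode, then unpickle."""
    # Unpickling can execute arbitrary code, so only do this for trusted files.
    return pickle.loads(base64.b64decode(text))


value = {"label": "a", "weights": [1.5, 2.5]}
text = encode_object(value)
print(decode_object(text) == value)
```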


I'm not sure how much effort is warranted to get complete support- I haven't needed to store pickled python objects yet, so I'll stick with the vector_* and see how it goes with that first.
 
> It would be nice if there was some format which could be easily used to work
> between graph-tool and networkx (and possibly other tools as well). Do you
> have any suggestions on what would be a better fit?

I think graphml is still better than anything else out there. GML, for
instance, has an even cruder type system (basically only string and
float), and in dot everything is a string.

In the end, I'm obviously biased towards the way it is implemented in
graph-tool (graphml + custom types), since it is easy enough to
implement, and one can guarantee perfect representation of the graph and
its properties.


And for that it's quite a sensible approach.

For working with multiple tools I am leaning towards a json approach. The networkx node_link format is quite useful:

  http://networkx.github.com/documentation/latest/reference/readwrite.json_graph.html

and extremely easy to work with (at least for my purposes). You can easily encode pretty much every type of data. For graph-tool you would just need to include a small bit of information about the property maps, which networkx et al can easily ignore. 
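
To make this concrete, here is a rough sketch using only the standard `json` module. The nodes/links layout follows the node_link format as I understand it; the `property_maps` key and the type names in it are purely hypothetical, just one way the extra information could be carried.

```python
import json

# A two-node graph in a node-link style layout.
graph = {
    "directed": False,
    "multigraph": False,
    "graph": {},
    "nodes": [
        {"id": 0, "pos": [0.1, 2.0]},
        {"id": 1, "pos": [3.0, 0.30000000000000004]},
    ],
    "links": [{"source": 0, "target": 1}],
    # Hypothetical extra block: tells graph-tool which property map type
    # to create for each attribute. Other tools can simply ignore it.
    "property_maps": {"vertex": {"pos": "vector_double"}},
}

text = json.dumps(graph)
restored = json.loads(text)

# Python writes floats with enough digits to read them back exactly.
print(restored == graph)
```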

thanks

-
 
Cheers,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>


_______________________________________________
graph-tool mailing list
graph-tool@skewed.de
http://lists.skewed.de/mailman/listinfo/graph-tool