Optimizing distance computation in a periodic space.

Guillaume · December 17, 2012, 10:11am

Hi list.

I am using graph-tool to model an /apical junctions/ network, based on
the model by Farhadifar et al.
<http://www.sciencedirect.com/science/article/pii/S0960982207023342>\. In
short, it corresponds to the outer surface of the cells in a particular
region of the embryo, where each cell is represented by a polygon.
Topology of the network changes due to cell division and cell death.

The physical model involves the local minimization of an energy
depending on the local geometry.

I got a working model, but the energy optimization is time consuming. I
think that this is due to the way I compute the components of the vector
formed by each edge of the graph, by iterating over the edges., eg:

for edgein graph.edges():

         v0, v1 = edge.source(), edge.target()
         deltax[edge] = x[v1] - x[v0]
         deltay[edge] = y[v1] - y[v0]|

This is further complicated (to a little extent) by periodic conditions
to the coordinate systems, and some other details, but I think this is
the core issue.

I have the feeling I can do better, perhaps by using the adjacency
matrix? This would be particularly interesting as the graph topology
doesn't change for a given optimization.

Any advice is welcome, thanks!

Guillaume

attachment.html (4.88 KB)

tiago · December 17, 2012, 10:14am

Hi Guillaume,

Take a look at the edge_difference() function, since I think it does
what you want, and should be significantly faster.

Cheers,
Tiago

Guillaume · December 17, 2012, 10:28am

Hi Tiago,

Thanks for the blazing fast answer!

It will sure do the job. I can't find it in the documentation though
(it's accessible online via ipython autocompletion and magic '?', but
doesn't appear on your web site).

Cheers,

Guillaume

attachment.html (2.75 KB)

tiago · December 17, 2012, 10:29am

Oops... Indeed. I'll add it in the next release.

Cheers,
Tiago

Guillaume · December 17, 2012, 10:59am

Ok cool..

One more point:

Is there a way to do the following differently:

edge_x = g.new_edge_property()
edge_x.a = np.array([x[e.source()] for e in g.edges()])

with x being a vector property map.
I plain English, is there a (more efficient) way to copy the vertex
property map of every edge source to an edge property map?

Guillaume

attachment.html (1.69 KB)

Ronnie_Ghose · December 17, 2012, 12:14pm

also i assume you're using a numpy / C structure for all of this right?
Also depending on the number of cores you have you could try parallelizing
all of your loops with multiprocessing / if applicable ~ GPU processing

attachment.html (2.43 KB)

Guillaume · December 17, 2012, 1:38pm

also i assume you're using a numpy / C structure for all of this
right? Also depending on the number of cores you have you could try
parallelizing all of your loops with multiprocessing / if applicable
~ GPU processing

Well I rely on the `propertymap.a` and `propertymap.fa` attributes,
which are subclasses of numpy ndarray. As for multiprocessing, of
course, but I only have 4 cores, so... As for GPU processing, I don't
know whether there is a simple way to do this in python.
I guess the real efficient way would be to dive into the underlying C++
API, but I am not very C++ litterate and look for something simpler.

But avoiding the inelegant loop over all edges is already interesting...

Thanks for the suggestion anyway!

Guillaume

attachment.html (4.74 KB)

Ronnie_Ghose · December 17, 2012, 1:42pm

Ehh these are sort of trivial things now but why not use a multidimensional
array for the coords [n,2] and then use the numpy operators to find the
difference? Just trying to reduce the things you do and offload more to the
numpy fortran library. Uhm hmm you can use 4 cores then ~ just split it
into 4 chunks to subtract? I find personally numpy does for loop operations
tons faster ~ so a vectorized function rather than a for loop...

attachment.html (5.38 KB)

Guillaume · December 17, 2012, 1:50pm

Ehh these are sort of trivial things now but why not use a
multidimensional array for the coords [n,2] and then use the numpy
operators to find the difference?

Well I guess that would reduce some overloading, but the reason I keep
my coordinates separated is for clarity. The coordinate system is a bit
more complicated than just (x,y), it's more cynlindrical like, with two
curvilinear coordinates for the surface, plus a radius corresponding to
the distance of the surface w/r to the central axis... I think that
explicitly writing down every coordinate separately avoids mistakes,
though the code looks a bit redondant then.

G.

attachment.html (8.93 KB)

tiago · December 17, 2012, 2:14pm

Something like this is not yet available, you have to use a python loop
for now.

However I plan to add something like this at some point, since it occurs
often enough. If you wish, you may open a ticket, and I'll address when
time permits.

Cheers,
Tiago

Guillaume · December 17, 2012, 2:15pm

Ok I'll do that, thanks!

G.

attachment.html (1.54 KB)

Ronnie_Ghose · December 17, 2012, 3:50pm

ehh in contradiction to what I said earlier ~ vectorized operations provide
a giant speed up. ~

http://i.imgur.com/Oo92T.png

-> i'm using numpy arrays in both cases. How about having the declarations
be more simple and then packing / unpacking only for the math portion? Also
you could split this into chunks + do it in parallel to get some additional
speed boosts.

attachment.html (2.49 KB)