graph-tool and Macports

Hi there!

To answer my previous question myself: the trouble with dl and the like ceases once one creates a DLFCN.py file via the h2py.py script inside the python 2.7 source distribution. This file must be kept somewhere python can find it, for example in the graph-tool python directory. Moreover, I fixed the problem with the wrong installation paths in the Portfile. As a result I obtained a fully working graph-tool inside macports. Note that I compile numpy and scipy with the apple gcc, which is not the default in Macports.

When loading graph-tool, though, I get this error:

Traceback (most recent call last):
  File "../../python/grid2gml.py", line 3, in <module>
    from graph_tool.all import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/all.py", line 31, in <module>
    from graph_tool.draw import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/draw/__init__.py", line 69, in <module>
    libc.open_memstream.restype = ctypes.POINTER(ctypes.c_char)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ctypes/__init__.py", line 366, in __getattr__
    func = self.__getitem__(name)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ctypes/__init__.py", line 371, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: dlsym(RTLD_DEFAULT, open_memstream): symbol not found

This seems to be related to recent changes in git. Attached is the Portfile I used, which now has openmp disabled by default, since it crashes the apple gcc anyway...

Cheers,

Portfile.bin (1.38 KB)

Hi Sebastian,

Hi there!

To answer my previous question myself: The trouble with dl and the
like ceases once one creates a DLFCN.py file via the h2py.py script
inside the python 2.7 source distribution. This file must be kept
somewhere reachable to python - for example in the graph-tool python
directory.

Strange, I don't see this problem at all. Are you using the standard
python from macports? The DLFCN module should be there, since it is a
standard module which should be defined for MacOS.

Moreover, I fixed the problem with the wrong installation
paths in the Portfile. As a result I obtained a fully working
graph-tool inside macports. Note that I compile numpy and scipy with
the apple gcc, which is not the default in Macports.

Except for the compilation segfault with +openmp, I was also able to
install graph-tool with macports. I did not have to fiddle with DLFCN.

If DLFCN is somehow buggy on macports, it could easily be replaced by
ctypes. Could you try replacing it in the dl_import.py file (just
replace DLFCN by ctypes at the beginning)?

When loading graph-tool, though, I get this error:

Traceback (most recent call last):
  File "../../python/grid2gml.py", line 3, in <module>
    from graph_tool.all import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/all.py", line 31, in <module>
    from graph_tool.draw import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/draw/__init__.py", line 69, in <module>
    libc.open_memstream.restype = ctypes.POINTER(ctypes.c_char)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ctypes/__init__.py", line 366, in __getattr__
    func = self.__getitem__(name)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ctypes/__init__.py", line 371, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: dlsym(RTLD_DEFAULT, open_memstream): symbol not found

This seems to relate to new changes in git.

Hm, yes, I hadn't tested this on macos yet. You can just comment out
the offending line 69 in graph_tool/draw/__init__.py, and this should
just work for you. I'll work on a proper fix.
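
Instead of commenting the line out, one could guard the lookup so that the import survives on platforms whose C library lacks open_memstream. The following is only a sketch, not the fix that went into graph-tool: the helper name has_symbol and the HAVE_MEMSTREAM flag are made up for illustration, and loading the library with ctypes.CDLL(None) is an assumption suggested by the dlsym(RTLD_DEFAULT, ...) message in the traceback.

```python
import ctypes

# Assumption: symbols are looked up in the already-loaded process image,
# as the "dlsym(RTLD_DEFAULT, ...)" message in the traceback suggests.
# CDLL(None) is only valid on Unix-like systems.
libc = ctypes.CDLL(None)


def has_symbol(lib, name):
    """Return True if `lib` exports `name` (hypothetical helper)."""
    # ctypes raises AttributeError when dlsym() cannot find the symbol,
    # which is exactly what hasattr() catches.
    return hasattr(lib, name)


# Hypothetical flag that later code could consult before using the function.
HAVE_MEMSTREAM = has_symbol(libc, "open_memstream")

if HAVE_MEMSTREAM:
    # Same assignment as the failing line 69, but only when the symbol exists.
    libc.open_memstream.restype = ctypes.POINTER(ctypes.c_char)
```

Code that relies on open_memstream would then have to check HAVE_MEMSTREAM and take another route when it is False.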

Included is the Portfile I used which now has openmp disabled by
default, since this will crash the apple gcc anyway...

Thanks. I will put this version on the website.

Cheers,
Tiago

Hi!

Strange, I don't see this problem at all. Are you using the standard
python from macports? The DLFCN module should be there, since it is a
standard module which should be defined for MacOS.

Nope, it is not. At least on my machine macports does not install any DLFCN.py.

Moreover, I fixed the problem with the wrong installation
paths in the Portfile. As a result I obtained a fully working
graph-tool inside macports. Note that I compile numpy and scipy with
the apple gcc, which is not the default in Macports.

Except for the compilation segfault with +openmp, I was also able to
install graph-tool with macports. I did not have to fiddle with DLFCN.

If DLFCN is somehow buggy on macports, it could easily be replaced by
ctypes. Could you try replacing it in the dl_import.py file (just
replace DLFCN by ctypes at the beginning)?

I changed the respective line to

    from ctypes import RTLD_LAZY, RTLD_GLOBAL

which does not work:

Traceback (most recent call last):
  File "../../python/grid2gml.py", line 3, in <module>
    from graph_tool.all import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/__init__.py", line 90, in <module>
    from dl_import import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/dl_import.py", line 28, in <module>
    from dl import RTLD_LAZY, RTLD_GLOBAL
ImportError: No module named dl

Apparently it falls back to the dl module, which is not installed in the macports python.
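
A quick way to see why the ctypes import cannot work is to list which RTLD_* names ctypes actually exports. This is just a diagnostic sketch; the variable name `available` is arbitrary. ctypes provides RTLD_GLOBAL and RTLD_LOCAL but not RTLD_LAZY, so the import line above raises an ImportError, and dl_import.py presumably catches that and tries the dl module next, which would explain the traceback.

```python
import ctypes

# Names we would like to import; only some of them exist in ctypes.
names = ("RTLD_GLOBAL", "RTLD_LOCAL", "RTLD_LAZY", "RTLD_NOW")
available = {name: hasattr(ctypes, name) for name in names}

for name in names:
    print("%-12s %s" % (name, "yes" if available[name] else "missing"))
```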

When loading graph-tool, though, I get this error:

Traceback (most recent call last):
  File "../../python/grid2gml.py", line 3, in <module>
    from graph_tool.all import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/all.py", line 31, in <module>
    from graph_tool.draw import *
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graph_tool/draw/__init__.py", line 69, in <module>
    libc.open_memstream.restype = ctypes.POINTER(ctypes.c_char)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ctypes/__init__.py", line 366, in __getattr__
    func = self.__getitem__(name)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ctypes/__init__.py", line 371, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: dlsym(RTLD_DEFAULT, open_memstream): symbol not found

This seems to relate to new changes in git.

Hm, yes, I hadn't tested this on macos yet. You can just comment out
the offending line 69 in graph_tool/draw/__init__.py, and this should
just work for you. I'll work on a proper fix.

That's what I did. After commenting this out, things work.

BTW, I saw this

http://www.boost.org/doc/libs/1_46_1/libs/python/doc/tutorial/doc/html/python/techniques.html#python.reducing_compiling_time

I guess it is quite tedious to split every function declaration into a separate cpp file (and you already do so to some extent in graph-tool), but using this technique a little more would help cut down the compilation effort. I am using a machine with 8GB of RAM for the compilation, and when I run 2 compilation processes in parallel my machine swaps a lot! Apart from that, it takes really long to compile things. Ccache somehow did not help here too much for some reason, as I always have to reject all compilations and start from scratch. Maybe the ccache macports integration will help here though, I will try.

Cheers,

Sebastian

Strange, I don't see this problem at all. Are you using the standard
python from macports? The DLFCN module should be there, since it is a
standard module which should be defined for MacOS.

Nope, it is not. At least on my machine macports does not install any
DLFCN.py.

Well, this is obviously a problem with the python installation of
macports, right? This is a standard python module, which should be
installed...

Here is the issue in detail: We need to make sure that the C++ RTTI
(run-time type information) works properly when the compiled modules are
imported from python. The only way to do this, is by ensuring the
RTLD_GLOBAL flag is passed to the dlopen() function, when python loads
the module. This is not a bug, either in graph-tool or python, it is
just how life is (see GCC Frequently Asked Questions - GNU Project). Python provides
an interface for doing this, namely the sys.setdlopenflags()
function. This is all nice and well, but we need to know the numeric
values of the flags we want to pass, and we need to pass always two
flags: RTLD_LAZY | RTLD_GLOBAL (or RTLD_NOW | RTLD_GLOBAL, for immediate
symbol resolution). Thus, these two flags must be defined somewhere,
since their numeric values will vary across OS's. These two values used
to be inside the "dl" module, which is now deprecated and has even been
removed in python 3. To close this hole, they have created the DLFCN
module, whose _only purpose_ AFAIK is to contain these important
definitions. Thus, without this module we are completely stuck. The
ctypes module, for whatever reason, does not include either RTLD_LAZY or
RTLD_NOW (as you have already noticed).

So, we really need the DLFCN module. I've checked, and in my macports
installation the module is also not installed... However it does have
the dl module, which I then used as a fall-back.
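
The lookup described above can be sketched roughly as follows. This is not the code from dl_import.py; the function names find_rtld_flags and global_symbols are invented for illustration. The sketch also tries the os module first, since Python 3.3 and later define os.RTLD_LAZY and os.RTLD_GLOBAL, which goes beyond the python 2.7 setup discussed here.

```python
import contextlib
import sys


def find_rtld_flags():
    """Return RTLD_LAZY | RTLD_GLOBAL, or None if no module defines them."""
    # os carries these constants on Python >= 3.3; DLFCN and dl are the
    # Python 2 locations discussed in this thread.
    for module_name in ("os", "DLFCN", "dl"):
        try:
            module = __import__(module_name)
            return module.RTLD_LAZY | module.RTLD_GLOBAL
        except (ImportError, AttributeError):
            continue
    return None


@contextlib.contextmanager
def global_symbols(flags):
    """Temporarily make python dlopen() extension modules with `flags`."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield
    finally:
        # Restore the previous flags so unrelated imports are unaffected.
        sys.setdlopenflags(old_flags)
```

The imports of the compiled modules would go inside a `with global_symbols(flags):` block, after checking that find_rtld_flags() did not return None.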

I guess we have to submit a bug report to the macports people.

BTW, I saw this

General Techniques - 1.46.1

I guess it is quite tedious to split every function declaration in a
separate cpp file (and you do already to some extent in graph-tool),
but using this technique a little more would help in cutting down
compilation efforts. I am using here a machine with 8GB of RAM for the
compilation and when I use 2 compilation processes in parallel my
machine is swapping a lot! Apart from the fact that it takes really
long to compile things.

These techniques are already extensively used... But the compile
time/memory behavior in graph-tool has little to do with boost::python,
and it has more to do with the fact we are generating a whole bunch of
template instantiations for the algorithms themselves (for the different
types of property maps, filtering, etc). I already split things in
independent compilation units, to avoid even larger memory spikes, but
the whole thing is quite suboptimal at this point. The true culprit is
of course GCC, since I don't think it makes sense to use around 2GB RAM
to compile a 5MB object file.

Ccache somehow did not help here too much for some reason, as I always
have to reject all compilations and start from scratch. Maybe the
ccache macports integration will help here though, I will try.

ccache works very reliably for me (even on macos). I only need to
recompile things if the source code really changes in some meaningful
way. Simple modifications, such as comments, indentations, etc, should
never trigger a recompile. And yes, you should of course enable it in
the macports config file...

Cheers,
Tiago

Hi!

Nope, it is not. At least on my machine macports does not install any
DLFCN.py.

Well, this is obviously a problem with the python installation of
macports, right? This is a standard python module, which should be
installed...

... well, yes, I guess, and on top of that it's easy to generate with the h2py.py script.

Here is the issue in detail: We need to make sure that the C++ RTTI
(run-time type information) works properly when the compiled modules are
imported from python. The only way to do this, is by ensuring the
RTLD_GLOBAL flag is passed to the dlopen() function, when python loads
the module. This is not a bug, either in graph-tool or python, it is
just how life is (see GCC Frequently Asked Questions - GNU Project). Python provides
an interface for doing this, namely the sys.setdlopenflags()
function. This is all nice and well, but we need to know the numeric
values of the flags we want to pass, and we need to pass always two
flags: RTLD_LAZY | RTLD_GLOBAL (or RTLD_NOW | RTLD_GLOBAL, for immediate
symbol resolution). Thus, these two flags must be defined somewhere,
since their numeric values will vary across OS's. These two values used
to be inside the "dl" module, which is now deprecated and has even been
removed in python 3. To close this hole, they have created the DLFCN
module, whose _only purpose_ AFAIK is to contain these important
definitions. Thus, without this module we are completely stuck. The
ctypes module, for whatever reason, does not include either RTLD_LAZY or
RTLD_NOW (as you have already noticed).

I know, I read a lot about it recently. However, the idea of shared libraries is not that appreciated on the Mac: just about every stupid piece of software comes with its own libraries. The only common ground one has, as I see it, is the apple gcc...

So, we really need the DLFCN module. I've checked, and in my macports
installation the module is also not installed... However it does have
the dl module, which I then used as a fall-back.

Hmm, I don't have a dl module in the macports python, only in the apple python. However, in the past I deleted some stuff from the Developer directory structure, so it may be that I should have a dl module.

I guess we have to submit a bug report to the macports people.

Let me reinstall my Xcode, and then we file a bug report if things stay as they are.

ccache works very reliably for me (even on macos). I only need to
recompile things if the source code really changes in some meaningful
way. Simple modifications, such as comments, indentations, etc, should
never trigger a recompile. And yes, you should of course enable it in
the macports config file...

Next time I compile graph-tool, I will enable ccache and share that new Portfile.

CU,

Sebastian