how to install lxml python module on mac os 10.5 (leopard)

lxml is an xml library for python that doesn’t suck. It needs a recent libxml2 and libxslt. Mac OS X does not come with recent versions, and 10.5 breaks completely if you force it to try and use a recent version.

I used to use MacPorts for everything (which comes with py25-lxml) but ran into some issues with 10.5. So back to manual installs it is.

  • assuming your python is the mac os x default…
  • make sure there are no traces of other pythons in your $PATH
  • download and install libxml2:
       ./configure --prefix=/usr/local/libxml2-2.7.0
       make
       sudo make install
       cd /Library/Python/2.5/site-packages
       sudo ln -s /usr/local/libxml2-2.7.0/lib/python2.5/site-packages/* .
    
  • download and install libxslt:
       ./configure --prefix=/usr/local/libxslt-1.1.24 --with-libxml-prefix=/usr/local/libxml2-2.7.0 
       make
       sudo make install
       cd /Library/Python/2.5/site-packages
       sudo ln -s /usr/local/libxslt-1.1.24/lib/python2.5/site-packages/* .
    
  • download and install lxml:
       sudo python setup.py install \
         --with-xml2-config=/usr/local/libxml2-2.7.0/bin/xml2-config \
         --with-xslt-config=/usr/local/libxslt-1.1.24/bin/xslt-config
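
The first two bullets above are easy to get wrong if MacPorts or fink left a python behind. Here is a minimal sketch of a check, assuming a POSIX shell; `list_pythons` is a name made up for illustration, not something that ships with Mac OS X. It prints every executable named `python` found on a PATH-style string, so you can see whether anything other than the system one would be picked up.

```shell
# Hypothetical helper: print each executable named "python" found on a
# colon-separated PATH string (defaults to the current $PATH).
# The body runs in a subshell so the IFS change does not leak out.
list_pythons() (
    IFS=:
    for dir in ${1:-$PATH}; do
        if [ -x "$dir/python" ]; then
            printf '%s\n' "$dir/python"
        fi
    done
)
```

Running `list_pythons` with no argument inspects your current $PATH. On a stock 10.5 install the only line should be /usr/bin/python; anything else is worth removing from $PATH before you build.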
    

Update March 1, 2009: in order to get libxml2 and libxslt python bindings working on 10.5.6, with a python.org 2.6 version of python, I had to do quite a bit more fiddling. If the above doesn’t work for you, try this:

cd libxml2-2.7.0
autoreconf
./configure --prefix=/usr/local/libxml2-2.7.0
make
sudo make install
cd ../libxslt-1.1.24
autoreconf
./configure --prefix=/usr/local/libxslt-1.1.24 \
    --with-python \
    --with-libxml-prefix=/usr/local/libxml2-2.7.0
make
sudo make install
cd ../libxml2-2.7.0/python

# libxml2 build doesn't support -arch ppc
export ARCHFLAGS='-arch i386'

# setup.py supports libxslt install, sort-of
cp ../../libxslt-1.1.24/python/libxsl* .
cp ../../libxslt-1.1.24/python/generator.py xsltgenerator.py
cp ../../libxslt-1.1.24/doc/libxslt-api.xml .

# hack the setup.py file to learn the dir structure
mv setup.py setup.py.bak
curl 'http://pastebin.com/pastebin.php?dl=f158d0a96' > setup.py
python setup.py build
sudo python setup.py install

And then install lxml normally.
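
Once lxml is installed, a quick sanity check is to ask it which libxml2 and libxslt it actually ended up with. A rough sketch, assuming the install succeeded: `lxml_versions` is a hypothetical wrapper name, while `LXML_VERSION`, `LIBXML_VERSION`, `LIBXML_COMPILED_VERSION` and `LIBXSLT_VERSION` are constants from lxml.etree. Pass the interpreter you built against (for example python2.6) as the first argument.

```shell
# Hypothetical helper: ask the given python (default: "python") which
# library versions lxml was compiled against and is running with.
lxml_versions() {
    "${1:-python}" -c '
from lxml import etree
print("lxml %s" % (etree.LXML_VERSION,))
print("libxml2 %s (compiled against %s)" % (etree.LIBXML_VERSION, etree.LIBXML_COMPILED_VERSION))
print("libxslt %s" % (etree.LIBXSLT_VERSION,))
'
}
```

If the libxml2 line shows something older than the 2.7.x you just built, lxml has probably picked up the system copy instead of the one under /usr/local.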

(the use of autoreconf is explained by the DarwinPorts folks, the use of ARCHFLAGS is explained by Apple, and the hand-editing/hand-merging of the libxslt and libxml2 python bindings for the build is based on experimental fiddling.)
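
To make the ARCHFLAGS part a little less hand-wavy, here is a sketch that picks the flag from the processor type. It only covers the two cases that come up on this page (i386 in the update above, ppc in the comments below). `archflags_for` is a made-up name, and the assumption that `uname -p` reports `i386` or `powerpc` on Leopard is mine, so check it on your own machine first.

```shell
# Hypothetical helper: map a processor name to the ARCHFLAGS value
# used for a single-architecture build. Unknown names are an error.
archflags_for() {
    case "$1" in
        i386)    printf '%s\n' "-arch i386" ;;
        powerpc) printf '%s\n' "-arch ppc" ;;
        *)       printf '%s\n' "unsupported processor: $1" >&2; return 1 ;;
    esac
}

# Usage (assumes uname -p prints i386 or powerpc):
#   export ARCHFLAGS="$(archflags_for "$(uname -p)")"
```

Failing loudly on anything else is deliberate: guessing an architecture is how you end up with "file is not of required architecture" warnings later on.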

21 thoughts on “how to install lxml python module on mac os 10.5 (leopard)”

  1. Yep, very useful. It’s easy to install on Linux, but on Mac installs can be a pain… I think I would have given up lxml without your explanation, and possibly Python altogether. Thanks a lot.

  2. I struggled with installing libxml2 and the related python packages for Python 2.4. I tried building, compiling, scratching and clawing but had trouble with my specific needs. I’m no expert but wrote up what worked for me here:

    http://taocode.blogspot.com/

    I needed this package for IMS Transport and Marshal XML all for Plone. Thank you for providing this information, hopefully my experience can help others get up and running too.

  3. Thanks a lot. I did not exactly follow your steps, but I made it with fink and the sources. I had multiple python installation, but gave the correct path in compilation. It worked.

  4. Thanks! I followed your steps exactly, except that I used libxml2-2.7.3 instead of libxml2-2.7.0. I had tried to get lxml installed a few times since January and failed each time. (I was missing the --prefix and --with-libxml-prefix options, as well as the site-packages symlinks.) Can’t understand why they don’t just make the documentation on the lxml/libxml2/libxslt sites more comprehensive. Thanks again!

  5. Apparently I’m missing something. I’ve been through these instructions a couple of times now and I keep coming up against the same error every time I try install lxml. Everyone else commenting on this post seems to have gotten it to work for them, anyone have the secret?

    I’m running 10.5.6 with the default python 2.5 installation. the only thing I’ve done that differed from the post is use libxml2-2.7.3.

    The error:

    Building against libxml2/libxslt in one of the following directories:
    /usr/local/libxml2-2.7.3/lib
    /usr/local/libxslt-1.1.24/lib
    ld warning: in /usr/local/libxslt-1.1.24/lib/libxslt.dylib, file is not of required architecture
    ld warning: in /usr/local/libxslt-1.1.24/lib/libexslt.dylib, file is not of required architecture
    ld warning: in /usr/local/libxml2-2.7.3/lib/libxml2.dylib, file is not of required architecture
    ld warning: in /usr/local/libxslt-1.1.24/lib/libxslt.dylib, file is not of required architecture
    ld warning: in /usr/local/libxslt-1.1.24/lib/libexslt.dylib, file is not of required architecture
    ld warning: in /usr/local/libxml2-2.7.3/lib/libxml2.dylib, file is not of required architecture
    No eggs found in /tmp/easy_install-EUw2w0/lxml-2.1.5/egg-dist-tmp-0HIXao (setup script problem?)
    error: Could not find required distribution lxml==2.1.5

    Thanks in advance for any help!

  6. Hi TJ Ward,

    I ran into the same problems and same error message as yours. That might be obvious to the other readers but it wasn’t for me. You need the setuptools package installed (providing the easy_install utility). You can download and find the installation instructions here:

    http://pypi.python.org/pypi/setuptools

    It solved everything for me. Hope that’s your problem and that fixes it.

    Thanks to the blog author for this easy walkthrough!

    Regards,
    Jimmy

  7. I have just followed this tutorial, using the instructions from “Update March 1, 2009”. But instead of hacking setup.py I’ve used the original file and executed “sudo python2.6 setup.py install --with-xml2-config=/usr/local/lib/libxml2-2.7.3/bin/xml2-config --with-xslt-config=/usr/local/lib/libxslt-1.1.24/bin/xslt-config”. Note that the libs are grouped under the /usr/local/lib directory.

  8. I tried using pip to install (pip install lxml) and got a huge gcc error chain:


    src/lxml/lxml.etree.c: At top level:

    src/lxml/lxml.etree.c:130036: error: invalid application of ‘sizeof’ to incomplete type ‘struct __pyx_obj_4lxml_5etree__ParserSchemaValidationContext’

    lipo: can’t figure out the architecture type of: /var/folders/2H/2HiejGSYHRWzNXbn0c7wzk+++TI/-Tmp-//cc1J4hwT.out

    error: command ‘gcc’ failed with exit status 1

    Using easy_install alone worked fine with no need to handpack anything:

    easy_install lxml

    I really advise against using macports for python libraries. Use pip where possible, otherwise easy_install.

  9. Many thanks! Especially to Leo Simons but also Jimmy Royer. I am just starting on Python, and this discussion made an impossible job possible. My iMac is panther 10.3.9 (PPC of course). That made it even harder. To use lxml, the /usr/bin/python was too downrev. Fink’s python is also too old, and Macport can’t build python because it can’t build tk. Luckily I was able to get python 2.6.2 and easy_install from python.org.

    Unfortunately, the “make” bombs for both libxml2 and libxslt. In both cases, configure fails to set $echo in the generated libtool script, for some reason. The fix was to “vi libtool” and add a line just after the comment which says “# Check that we have a working $echo”
    echo="echo" # configure fails to define $echo when generating libtool, so set $echo here

    Just to clarify:
    * My autoreconf comes from Macports – dunno if that makes a difference but it worked
    * The autoreconf generates some warnings that I ignored (safely, I think)
    * Do: export ARCHFLAGS='-arch ppc'
    * Just after the curl command, edit/fix the paths by doing: “vi setup.py”

    Thank you, again!

  10. Thanks, I’d been sweating over this all afternoon.
    Never known anything as hard to install as it is to install lxml on OS X!

    I can report the second method (“Update March 1, 2009…”) works on OS X 10.5.8 with python 2.6.2, lxml 2.2.2, libxml2-2.7.3, and libxslt-1.1.24.

    I had a lot of problems at first with the lxml installation complaining about architecture types but fresh downloads of everything seemed to fix it.

    Hope the lxml team take this into account in their installation instructions.

    Thanks again.

  11. I’m seeing the following, any suggestions?

    make all-recursive
    Making all in include
    Making all in libxml
    make[3]: Nothing to be done for `all'.
    make[3]: Nothing to be done for `all-am'.
    Making all in .
    /bin/sh ./libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I./include -I./include -D_REENTRANT -g -O2 -pedantic -W -Wformat -Wunused -Wimplicit -Wreturn-type -Wswitch -Wcomment -Wtrigraphs -Wformat -Wchar-subscripts -Wuninitialized -Wparentheses -Wshadow -Wpointer-arith -Wcast-align -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wnested-externs -Winline -Wredundant-decls -MT SAX.lo -MD -MP -MF .deps/SAX.Tpo -c -o SAX.lo SAX.c
    ./libtool: line 460: CDPATH: command not found
    /usr/local/src/libxml2-2.7.4/libtool: line 460: CDPATH: command not found
    /usr/local/src/libxml2-2.7.4/libtool: line 1138: func_opt_split: command not found
    libtool: Version mismatch error. This is libtool 2.2.6, but the
    libtool: definition of this LT_INIT comes from an older release.
    libtool: You should recreate aclocal.m4 with macros from libtool 2.2.6
    libtool: and run autoconf again.
    make[2]: *** [SAX.lo] Error 63
    make[1]: *** [all-recursive] Error 1
    make: *** [all] Error 2

    1. Hey Michael, that looks like a rather weird error. I think you have to do what it says, and re-run autoconf. That is what the “autoreconf” command is for in the updated instructions. Did you try that?

      If so, try deleting the ‘./libtool’ file and the ‘./aclocal.m4’ file, and then re-run autoreconf. If that doesn’t fix it, please provide full details of your environments and the commands you’re running.

