Monday, February 1, 2010

Playing with SciPy (part 1)

This is a bit of a shaggy dog story. I want to install SciPy and play with it. I am particularly interested in the statistics.

Now, if you're funded, I would recommend that you just pay Enthought $$ for their distribution. It's relatively cheap, well tested, and contains everything but the kitchen sink. But, for just fooling around I resisted (i.e. I'm cheap), figuring that sooner or later they will want more $$. And, this should be a learning experience.

Looking at the official SciPy install instructions for OS X, I tried simply downloading SciPy, following the binary-downloads link on their main page to SourceForge:

scipy-0.7.1_20091217-py2.6-macosx-10.6-universal.egg

I built and installed in the usual way using system Python, and ... I got a seg fault, which is a pretty spectacular failure to have.

So I decided to back up and try step by step. On OS X 10.6 Snow Leopard, Python comes installed, as does NumPy:


$ python
Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.3.0'


Now, my MacBook is not a "virgin" install. I got it more than a year ago, and upgraded from Leopard, so the site-packages directory has more stuff than I can list conveniently (caps sorted to first):


$ ls /Library/Python/2.6/site-packages
Jinja2-2.2.1-py2.6.egg-info
Pygments-1.1.1-py2.6.egg-info
PyxMPI-1.0-py2.6.egg-info
README
SQLAlchemy-0.5.6-py2.6.egg-info
Sphinx-0.6.3-py2.6.egg
Sphinx-0.6.3-py2.6.egg-info


There are about 40 directories in all.


I found notes on the web for installing SciPy on OS X:
• at SciPy
• on a blog
• Something called the SciPy Superpack linked here
• Using MacPorts discussed on SO

The SciPy Superpack guy's shell script failed (multiple failures), but I neglected to write down why. I decided to follow hyperjeff's instructions (#2). I haven't used MacPorts (should look into it) but Fink, a similar project, screwed me over badly once and I'm a bit gun-shy.

Most everything I've read suggests that I should install a separate Python for SciPy.


So the remainder of this post (which is again in the theme of 'so google can organize my head') is about Python and about "architecture" and 32-bit v. 64-bit on OS X. There's a long article about 64-bit under Snow Leopard at Ars Technica here.

Suffice it to say that the coming (present?) era is 64-bit. That means, among other things, that address space is not limited to 4GB, the processor is modernized as well as handling 64 bits, and this should be all in all "a good thing." The difficulty is that "at runtime, a process is either 32-bit or 64-bit, and can only load other code—libraries, plug-ins, etc.—of the same kind."

So, we now have to worry about individual components of our SciPy install being 32-bit or 64-bit, as well as the versioning issues (hinted at by the name of the SciPy "egg" shown above)---matching NumPy and SciPy and Python (as well as gfortran and gcc and ...)


You can check an executable from the command line. Apple's Python is in

/System/Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6

but

/usr/bin/python

also launches it. (Don't ask me how, it's not a sym link.) So, for example, we get:


$ file /usr/bin/python
/usr/bin/python: Mach-O universal binary with 3 architectures
/usr/bin/python (for architecture x86_64): Mach-O 64-bit executable x86_64
/usr/bin/python (for architecture i386): Mach-O executable i386
/usr/bin/python (for architecture ppc7400): Mach-O executable ppc


Apple's Python is both 32-bit and 64-bit (x86_64), and is a "universal" build with code to run on both PowerPC chips and i386 (Intel) chips.
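For the curious, here is a rough sketch of where the `file` output comes from: the architectures are recorded in the first few bytes of the binary. It assumes the classic 32-bit "fat" header layout (a big-endian magic number, a count, then 20-byte entries) and the CPU type constants from Apple's `mach/machine.h`. The function name `architectures` is my own invention, and unlike `file` it ignores the CPU subtype, so it would report "ppc" rather than "ppc7400".

```python
import struct
import sys

# CPU type constants as defined in Apple's <mach/machine.h>.
CPU_NAMES = {
    7: "i386",
    0x01000007: "x86_64",
    18: "ppc",
    0x01000012: "ppc64",
}

FAT_MAGIC = 0xCAFEBABE  # note: Java .class files start with the same bytes

# Thin Mach-O magics as they appear when the first 4 bytes are read
# big-endian, mapped to the byte order of the rest of the header.
THIN_BYTE_ORDER = {
    0xFEEDFACE: ">",  # 32-bit, big-endian file
    0xFEEDFACF: ">",  # 64-bit, big-endian file
    0xCEFAEDFE: "<",  # 32-bit, little-endian file
    0xCFFAEDFE: "<",  # 64-bit, little-endian file
}


def architectures(data):
    """Return architecture names found in the leading bytes of a Mach-O file.

    Returns an empty list when the bytes do not look like Mach-O at all.
    """
    (magic,) = struct.unpack(">I", data[:4])
    if magic == FAT_MAGIC:
        (count,) = struct.unpack(">I", data[4:8])
        names = []
        for i in range(count):
            start = 8 + 20 * i  # each fat_arch entry is five 32-bit fields
            (cputype,) = struct.unpack(">I", data[start:start + 4])
            names.append(CPU_NAMES.get(cputype, hex(cputype)))
        return names
    order = THIN_BYTE_ORDER.get(magic)
    if order is None:
        return []
    (cputype,) = struct.unpack(order + "I", data[4:8])
    return [CPU_NAMES.get(cputype, hex(cputype))]


if __name__ == "__main__":
    # Inspect whichever interpreter is running this script.
    with open(sys.executable, "rb") as fh:
        print(architectures(fh.read(4096)))
```

On anything other than OS X the last line will most likely print an empty list, since the interpreter is not a Mach-O file there.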

Another way to tell that Python is running as 64-bit is:


>>> import sys
>>> 2**63-1 == sys.maxint
True
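The same check can be written so that it also runs on later Pythons, where `sys.maxint` no longer exists. This is only a sketch: the helper names `pointer_bits` and `is_64bit` are made up for illustration, and it relies on `sys.maxsize` (available from Python 2.6 on) and on the size of a C pointer as reported by the `struct` module.

```python
import platform
import struct
import sys


def pointer_bits():
    """Width of a C pointer in the running interpreter, in bits."""
    return struct.calcsize("P") * 8


def is_64bit():
    """True if the running interpreter is a 64-bit process."""
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit ones.
    return sys.maxsize > 2**32


if __name__ == "__main__":
    print(pointer_bits(), is_64bit(), platform.machine())
```

Note that this describes the process that is actually running. A universal binary can contain both kinds of code, so the answer depends on which slice the system chose to launch.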


You can't just use the disk image of the binary Python from here. It installs Python into

/Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6

with a sym link in


$ ls -al /usr/local/bin/python2.6
lrwxr-xr-x 1 root wheel 71 Jan 31 13:01 /usr/local/bin/python2.6 -> ../../../Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6


Unfortunately, it's 32-bit.


$ file /usr/local/bin/python2.6
/usr/local/bin/python2.6: Mach-O universal binary with 2 architectures
/usr/local/bin/python2.6 (for architecture ppc): Mach-O executable ppc
/usr/local/bin/python2.6 (for architecture i386): Mach-O executable i386


And in the process of backing out from this blind alley, I discovered that something I had installed had helpfully modified my .bash_profile so that python pointed to the new install rather than Apple's Python. (I thought it was the Python installer originally, but it might have been the scipy-0.7.1-py2.6-python.org.dmg binary installer.) Luckily, it saved a copy of the old file.
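A quick way to catch this sort of surprise is to ask which file a command name really resolves to after the PATH lookup and any sym links. The sketch below assumes a Python new enough to have `shutil.which` (3.3 or later, so not the 2.6 discussed here); the helper name `resolve` is hypothetical.

```python
import os
import shutil
import sys


def resolve(command):
    """Return the real path a command name resolves to, or None if not found."""
    found = shutil.which(command)  # first match on the current PATH
    if found is None:
        return None
    return os.path.realpath(found)  # follow sym links to the actual file


if __name__ == "__main__":
    for name in ("python", "python2.6"):
        print(name, "->", resolve(name))
    print("running interpreter:", os.path.realpath(sys.executable))
```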

I yanked it all out (I think).


Building Python from source on OS X 10.6.2

I have Apple's Developer Tools installed:


$ gcc --version
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646) (dot 1)


I downloaded and built Python 2.7 without thinking---there's likely to be a versioning issue. I left it there to explore later.

So, back up again and get 2.6:

17dcac33e4f3adb69a57c2607b6de246 13322131 Python-2.6.4.tgz


$ md5 ~/Desktop/Python-2.6.4.tgz 
MD5 (/Users/telliott_admin/Desktop/Python-2.6.4.tgz) = 17dcac33e4f3adb69a57c2607b6de246
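The same comparison can be done from Python with the standard `hashlib` module, which is handy on systems without the `md5` command. A minimal sketch: the function name `md5_of_file` is my own, and the path and expected checksum are simply the ones shown above.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    expected = "17dcac33e4f3adb69a57c2607b6de246"  # checksum quoted above
    tarball = os.path.expanduser("~/Desktop/Python-2.6.4.tgz")
    if os.path.exists(tarball):
        print(md5_of_file(tarball) == expected)
```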


Following the instructions


cd
./configure --enable-universalsdk --with-universal-archs=64-bit
make


... fails. So... I should really have taken the time to figure this out.

[UPDATE: See here for solution]

But I decided that since I'm just playing, and won't necessarily want matplotlib's GUI yet, I can go ahead with a standard non-framework build.


make clean
./configure
make
sudo make install

ln -s /usr/local/bin/python2.6 ~/bin/python26

$ file /usr/local/bin/python2.6
/usr/local/bin/python2.6: Mach-O 64-bit executable x86_64


This is not a "framework" build. But it's 64-bit! It's on to NumPy!