2014-05-26

Installing Scrapy on an Amazon CentOS AMI

We're experimenting with Scrapy, and I thought I'd share what I found while installing the Scrapy package, as it has multiple dependencies many Python installations don't normally include, and those are not listed in the documentation.

* First off, you want bzip2:

$ cd /tmp
$ wget http://bzip.org/1.0.6/bzip2-1.0.6.tar.gz
$ tar -xzf bzip2-1.0.6.tar.gz
$ cd bzip2-1.0.6
$ sudo make -f Makefile-libbz2_so
$ sudo make
$ sudo make install PREFIX=/usr/local
$ sudo cp libbz2.so.1.0.6 /usr/local/lib



* Then you want the libffi headers

$ sudo yum install libffi-devel

* Then you want to download the Python source and extract it:

$ wget https://www.python.org/ftp/python/2.7.6/Python-2.7.6.tgz
$ tar -xzf Python-2.7.6.tgz
$ cd Python-2.7.6

* I usually uncomment any lines referencing ssl and zlib in Modules/Setup.dist

$ vi Modules/Setup.dist
# find and uncomment the lines

* Then build Python:

$ ./configure --prefix=/usr/local
$ sudo make
$ sudo make altinstall

* Install virtualenv if you don't have it (you should)

* Activate your virtualenv with your fresh Python

$ cd
$ virtualenv -p /usr/local/bin/python2.7 myenv
$ source myenv/bin/activate

* Install Scrapy

$ easy_install Scrapy

This was tested on an Amazon AWS image named amzn-ami-pv-2013.09.2.x86_64-ebs (ami-a43909e1), known as Amazon Linux AMI x86_64 PV EBS with the following version strings:

$ cat /proc/version
Linux version 3.4.73-64.112.amzn1.x86_64 (mockbuild@gobi-build-31003) (gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC) ) #1 SMP Tue Dec 10 01:50:05 UTC 2013

$ cat /etc/*-release
Amazon Linux AMI release 2014.03