Category: Python/Ruby

2013-07-04 13:38:19

I. Introduction to Scrapy

Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Official homepage:

II. Installing Python 2.7

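Official homepage:

Download:

1) Install Python

Installation directory: D:\Python27

2) Add the environment variable

System Properties -> Advanced -> Environment Variables -> System Variables -> Path -> Edit, then append D:\Python27 and D:\Python27\Scripts to the value.

3) Verify the environment variable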

T:\>set Path
Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;D:\Rational\common;D:\Rational\ClearCase\bin;D:\Python27;D:\Python27\Scripts
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
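The same check can be done from Python once the interpreter starts. The snippet below is a minimal sketch, assuming the install directory D:\Python27 used in this post; missing_path_entries and REQUIRED are hypothetical names chosen for illustration.

```python
import os

# Directories this post adds to Path (assumption: Python lives in D:\Python27).
REQUIRED = [r"D:\Python27", r"D:\Python27\Scripts"]


def missing_path_entries(path_value, required, sep=";"):
    """Return the required directories that are absent from a PATH string."""
    # Windows paths are case-insensitive, so compare lower-cased entries
    # with any trailing backslash removed.
    present = set(
        entry.strip().rstrip("\\").lower()
        for entry in path_value.split(sep)
        if entry.strip()
    )
    return [d for d in required if d.rstrip("\\").lower() not in present]


if __name__ == "__main__":
    missing = missing_path_entries(os.environ.get("PATH", ""), REQUIRED, os.pathsep)
    if missing:
        print("Missing from PATH: %s" % ", ".join(missing))
    else:
        print("PATH contains the Python directories")
```

An empty result means both directories are on the path, so python and easy_install can be run from any prompt.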

 

4) Verify Python

T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
T:\>

III. Installing Twisted

1) Install setuptools

Download, build, install, upgrade, and uninstall Python packages -- easily!

Official homepage:

Download:

Installation: omitted

2) Install zope.interface

Official homepage:

Download:  or

Installation:

T:\>d:
D:\>cd D:\Python27\Scripts
D:\Python27\Scripts>easy_install.exe zope.interface-4.0.1-py2.7-win32.egg
Processing zope.interface-4.0.1-py2.7-win32.egg
creating d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
Extracting zope.interface-4.0.1-py2.7-win32.egg to d:\python27\lib\site-packages
Adding zope.interface 4.0.1 to easy-install.pth file

Installed d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
Processing dependencies for zope.interface==4.0.1
Finished processing dependencies for zope.interface==4.0.1
D:\Python27\Scripts>

Verify the installation:

D:\Python27\Scripts>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>
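The egg installed above carries the tags py2.7 and win32 in its file name, and a binary egg generally needs to match the interpreter it is installed into. The following is a rough sketch of that comparison, assuming the usual "pyX.Y-platform.egg" ending for binary eggs; egg_suffix and egg_matches are hypothetical helper names, not part of setuptools.

```python
import sys
import sysconfig


def egg_suffix(version=None, platform=None):
    """Build the 'pyX.Y-platform.egg' ending expected for an interpreter."""
    if version is None:
        version = sys.version_info[:2]          # e.g. (2, 7)
    if platform is None:
        platform = sysconfig.get_platform()     # e.g. 'win32' on 32-bit Windows
    return "py%d.%d-%s.egg" % (version[0], version[1], platform)


def egg_matches(egg_filename, version=None, platform=None):
    """Return True if the egg file name ends with the expected tags."""
    return egg_filename.endswith(egg_suffix(version, platform))


if __name__ == "__main__":
    name = "zope.interface-4.0.1-py2.7-win32.egg"
    print("%s matches this interpreter: %s" % (name, egg_matches(name)))
```

With no arguments the helpers describe the running interpreter; passing an explicit version and platform lets a download be checked before copying it to the target machine.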

 

3) Install Twisted

Official homepage:

Download:

Installation: omitted

IV. Installing w3lib

Official homepage:

Download:

Extraction: omitted

Installation:

T:\w3lib-1.2>python setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\w3lib
copying w3lib\encoding.py -> build\lib\w3lib
copying w3lib\form.py -> build\lib\w3lib
copying w3lib\html.py -> build\lib\w3lib
copying w3lib\http.py -> build\lib\w3lib
copying w3lib\url.py -> build\lib\w3lib
copying w3lib\util.py -> build\lib\w3lib
copying w3lib\__init__.py -> build\lib\w3lib
running install_lib
creating D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\encoding.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\form.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\html.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\http.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\url.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\util.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\__init__.py -> D:\Python27\Lib\site-packages\w3lib
byte-compiling D:\Python27\Lib\site-packages\w3lib\encoding.py to encoding.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\form.py to form.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\html.py to html.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\http.py to http.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\url.py to url.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\util.py to util.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\__init__.py to __init__.pyc
running install_egg_info
Writing D:\Python27\Lib\site-packages\w3lib-1.2-py2.7.egg-info

T:\w3lib-1.2>

Verify the installation:

T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import w3lib
>>>

V. Installing libxml2

Official homepage:

Download:

Installation: omitted

Verify the installation:

T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import libxml2
>>>

VI. Installing pyOpenSSL

Official homepage:

Download:

Installation: omitted

Verify the installation:

T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import OpenSSL
>>>

VII. Installing lxml

Download:
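The earlier sections each verify the install with an import at the interactive prompt. A comparable check for lxml, and for the other packages in one pass, might look like the sketch below; can_import and MODULES are hypothetical names, and the module list is simply the set of imports used in this post plus lxml and scrapy.

```python
import importlib

# Import names taken from the verification steps in this post,
# plus lxml and scrapy (assumed to import under those names).
MODULES = ["zope.interface", "twisted", "w3lib", "libxml2", "OpenSSL", "lxml", "scrapy"]


def can_import(module_name):
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    for name in MODULES:
        status = "ok" if can_import(name) else "MISSING"
        print("%-16s %s" % (name, status))
```

Any line reported as MISSING points at the section whose installer should be rerun.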

 

VIII. Installing Scrapy

Official homepage:

Download:

Extraction: omitted

Installation:

T:\Scrapy-0.14.4>python setup.py install

……
Installing easy_install-2.7-script.py script to D:\Python27\Scripts
Installing easy_install-2.7.exe script to D:\Python27\Scripts
Installing easy_install-2.7.exe.manifest script to D:\Python27\Scripts

Using d:\python27\lib\site-packages
Finished processing dependencies for Scrapy==0.14.4
T:\Scrapy-0.14.4>

Verify the installation:

T:\>scrapy
Scrapy 0.14.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command

T:\>
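If the check has to run from a script rather than by eye, the first line of the output above can be parsed for the version. This is a minimal sketch, assuming the banner keeps the "Scrapy X.Y.Z" form shown in this post; parse_scrapy_banner is a hypothetical helper, not part of Scrapy.

```python
import re


def parse_scrapy_banner(line):
    """Extract the version from a banner line as a tuple of ints, or None."""
    match = re.match(r"Scrapy (\d+(?:\.\d+)*)", line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    banner = "Scrapy 0.14.4 - no active project"
    print(parse_scrapy_banner(banner))
```

Returning a tuple makes comparisons such as "at least 0.14" straightforward in an install script.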