Category: Python/Ruby

2013-06-20 13:23:46

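What is Scrapy?

Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It has a wide range of uses, from data mining to monitoring and automated testing.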

Features

Simple
Scrapy was designed with simplicity in mind, providing the features you need without getting in your way.

Effective
Just write the rules to extract the data from web pages and let Scrapy crawl the entire website for you.

Fast
Scrapy is used in production crawlers to completely scrape more than 500 retailer sites daily, all on one server.

Extensible
Scrapy was designed with extensibility in mind, so it provides several mechanisms to plug in new code without having to touch the framework core.
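
One such mechanism is the item pipeline: a plain Python class with a `process_item(item, spider)` method that Scrapy calls for every scraped item. The sketch below is only an illustration; the class name, the `price` field and the cleanup logic are hypothetical.

```python
class PriceCleanupPipeline:
    """Hypothetical item pipeline: converts a scraped 'price' string to a float."""

    def process_item(self, item, spider):
        raw = item.get("price")
        if raw is not None:
            # Strip a currency sign and thousands separators before converting.
            cleaned = str(raw).replace("$", "").replace(",", "").strip()
            item["price"] = float(cleaned)
        # Returning the item hands it on to the next pipeline stage.
        return item
```

A pipeline like this is switched on through the project's `ITEM_PIPELINES` setting, for example `{"myproject.pipelines.PriceCleanupPipeline": 300}` (the module path here is made up), so the framework core stays untouched.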

Portable, open source, 100% Python
Scrapy is written entirely in Python and runs on Linux, Windows, Mac and BSD.

Batteries included
Scrapy comes with lots of functionality built in. Check the documentation for a list.

Well documented & well tested
Scrapy is extensively documented and has a comprehensive test suite.

1,500 watchers, 350 forks on GitHub
700 followers on Twitter
850 questions on Stack Overflow
200 messages per month on the mailing list
40-50 users always connected to the IRC channel
Several companies provide Scrapy consulting and support

Still not sure whether Scrapy is what you are looking for? Take a look at .

Companies using Scrapy

Scrapy is used in many production environments to crawl thousands of sites every day. Here is a list.

Where to start?

Start by reading , then and follow .
