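What is Scrapy?
Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Features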
Simple
Scrapy was designed with simplicity in mind, providing the features
you need without getting in your way.
Productive
Just write the rules to extract the data from web pages and let Scrapy
crawl the entire web site for you.
Fast
Scrapy is used in production crawlers to completely scrape more than
500 retailer sites daily, all on one server.
Extensible
Scrapy was designed with extensibility in mind, so it provides
several mechanisms to plug in new code without having to touch the framework
core.
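One of those extension mechanisms is the item pipeline, a plain Python class that post-processes each scraped item. The sketch below is only an illustration: the class name, the `price` field and the module path in the comment are hypothetical, and the `process_item(item, spider)` signature follows the long-standing pipeline interface, so it is worth checking against the documentation for your Scrapy version.

```python
class PriceToFloatPipeline:
    """Hypothetical item pipeline: turns a textual price into a float."""

    def process_item(self, item, spider):
        raw = item.get("price")
        if raw is not None:
            # Strip a dollar sign and thousands separators before parsing.
            cleaned = str(raw).replace("$", "").replace(",", "").strip()
            item["price"] = float(cleaned)
        # Returning the item passes it on to the next pipeline stage.
        return item


# Enabled in the project's settings.py (the module path is a placeholder):
# ITEM_PIPELINES = {"myproject.pipelines.PriceToFloatPipeline": 300}
```

Because the pipeline is an ordinary class, it can be added or removed through settings alone, without modifying the spider or the framework.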
Portable, open-source, 100% Python
Scrapy is written entirely in Python and runs on Linux, Windows, Mac and BSD.
Batteries included
Scrapy comes with lots of functionality built in. Check the documentation for a list of features.
Well-documented & well-tested
Scrapy is extensively documented and has a comprehensive test suite.
1,500 watchers, 350 forks on GitHub
700 followers on Twitter
850 questions on Stack Overflow
200 messages per month on the mailing list
40-50 users always connected to the IRC channel
Several companies offer Scrapy consulting and support
Still not sure whether Scrapy is what you are looking for? Check out .
Companies using Scrapy
Scrapy is used in many production environments to crawl thousands of sites every day. Here is a list.
Where to start?
Start by reading , then and follow .