Category: Python/Ruby
2012-10-10 18:52:03
Stateful programmatic web browsing in Python, after Andy Lester’s Perl module.
mechanize.Browser and mechanize.UserAgentBase implement the interface of urllib2.OpenerDirector, so:
any URL can be opened, not just http:
mechanize.UserAgentBase offers easy dynamic configuration of user-agent features like protocol, cookie, redirection and robots.txt handling, without having to make a new OpenerDirector each time, e.g. by calling build_opener().
Easy HTML form filling.
Convenient link parsing and following.
Browser history (.back() and .reload() methods).
The Referer HTTP header is added properly (optional).
Automatic observance of robots.txt.
Automatic handling of HTTP-Equiv and Refresh.
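The OpenerDirector interface mentioned in the list above can be tried without mechanize or a network connection. The following is a minimal sketch that assumes Python 3, where urllib2.OpenerDirector lives on as urllib.request.OpenerDirector; the data: URL is an illustrative choice and not part of the original text. A mechanize.Browser is expected to accept the same open() call, but that is not shown here.

```python
import urllib.request

# build_opener() returns an OpenerDirector with the default handlers installed.
opener = urllib.request.build_opener()

# A data: URL is resolved locally, which shows that open() is not limited to http:.
response = opener.open("data:text/plain,hello")
body = response.read()
print(body)
```

The same open() call works for any scheme that has a handler installed in the opener, which is the sense in which "any URL can be opened".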
The examples below are written for a website that does not exist (example.com), so cannot be run. There are also some working examples that you can run.
import re
You may control the browser’s policy by using the methods of mechanize.Browser’s base class, mechanize.UserAgent. For example:
br = mechanize.Browser()
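The policy example is cut off at this point in the post. As a rough sketch of the kind of configuration it refers to, the block below switches several handlers off on a single Browser instance. It assumes the third-party mechanize package is installed; the helper name make_browser and the User-agent string are hypothetical and chosen only for illustration. No page is opened, so nothing here touches the network.

```python
try:
    import mechanize  # third-party package, assumed to be installed
except ImportError:
    mechanize = None  # lets the snippet load even when mechanize is missing


def make_browser():
    """Return a Browser with a few policy switches changed (hypothetical helper)."""
    br = mechanize.Browser()
    # Don't handle HTTP-EQUIV headers embedded in HTML.
    br.set_handle_equiv(False)
    # Ignore robots.txt. Do not do this without thought and consideration.
    br.set_handle_robots(False)
    # Don't add the Referer header.
    br.set_handle_referer(False)
    # Don't follow Refresh redirections.
    br.set_handle_refresh(False)
    # Replace the default request headers; the value is a made-up example.
    br.addheaders = [("User-agent", "example-bot/0.1")]
    return br


br = make_browser() if mechanize is not None else None
```

Each setter changes the existing object in place, which is the point made earlier about not having to build a new OpenerDirector for every configuration change.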