Special ``mechanize`` browser using the Zope Publisher HTTP handler.
There are no implemented interfaces.
default_features (type: list)
['_redirect', '_cookies', '_referer', '_refresh', '_equiv', '_basicauth', '_digestauth']
default_others (type: list)
['_http_error', '_http_request_upgrade', '_http_default_error']
default_schemes (type: list)
['http']
handler_classes (type: dict)
{'http': <class mechanize._urllib2_support.HTTPHandler at 0x416ab1ac>,
 '_seek': <class mechanize._urllib2_support.SeekableProcessor at 0x416a4f5c>,
 '_proxy': <class mechanize._auth.ProxyHandler at 0x4169c47c>,
 '_equiv': <class mechanize._urllib2_support.HTTPEquivProcessor at 0x416a4f2c>,
 '_gzip': <class mechanize._gzip.HTTPGzipProcessor at 0x41692c8c>,
 '_debug_redirect': <class mechanize._urllib2_support.HTTPRedirectDebugProcessor at 0x416ab0ec>,
 'gopher': <class urllib2.GopherHandler at 0x406f02fc>,
 '_basicauth': <class mechanize._auth.HTTPBasicAuthHandler at 0x4169c5cc>,
 '_http_request_upgrade': <class mechanize._urllib2_support.HTTPRequestUpgradeProcessor at 0x416a477c>,
 'file': <class urllib2.FileHandler at 0x406f026c>,
 '_cookies': <class mechanize._urllib2_support.HTTPCookieProcessor at 0x416a4f8c>,
 '_digestauth': <class mechanize._auth.HTTPDigestAuthHandler at 0x4169c65c>,
 'ftp': <class urllib2.FTPHandler at 0x406f029c>,
 '_referer': <class mechanize._useragent.HTTPRefererProcessor at 0x41692cbc>,
 '_robots': <class mechanize._urllib2_support.HTTPRobotRulesProcessor at 0x416ab05c>,
 '_refresh': <class mechanize._urllib2_support.HTTPRefreshProcessor at 0x416ab11c>,
 '_redirect': <class mechanize._urllib2_support.HTTPRedirectHandler at 0x416a414c>,
 '_http_default_error': <class urllib2.HTTPDefaultErrorHandler at 0x406e2ecc>,
 '_unknown': <class urllib2.UnknownHandler at 0x406f023c>,
 '_debug_response_body': <class mechanize._urllib2_support.HTTPResponseDebugProcessor at 0x416ab0bc>,
 'https': <class mechanize._urllib2_support.HTTPSHandler at 0x416ab1dc>,
 '_proxy_basicauth': <class mechanize._auth.ProxyBasicAuthHandler at 0x4169c5fc>,
 '_http_error': <class mechanize._urllib2_support.HTTPErrorProcessor at 0x416ab14c>,
 '_response_upgrade': <class mechanize._mechanize.ResponseUpgradeProcessor at 0x41692efc>,
 '_proxy_digestauth': <class mechanize._auth.ProxyDigestAuthHandler at 0x4169c68c>}
add_handler(handler)
add_password(url, user, password, realm=None)
add_proxy_password(user, password, hostport=None, realm=None)
back(n=1)
Go back n steps in history, and return response object.
n: go back this number of steps (default 1 step)
clear_history()
click(*args, **kwds)
See ClientForm.HTMLForm.click for documentation.
click_link(link=None, **kwds)
Find a link and return a Request object for it.
Arguments are as for .find_link(), except that a link may be supplied as the first argument.
close()
encoding()
error(proto, *args)
find_link(**kwds)
Find a link in current page.
Links are returned as mechanize.Link objects.
# Return third link that .search()-matches the regexp "python"
# (by ".search()-matches", I mean that the regular expression method
# .search() is used, rather than .match()).
find_link(text_regex=re.compile("python"), nr=2)

# Return first http link in the current page that points to somewhere
# on python.org whose link text (after tags have been removed) is
# exactly "monty python".
find_link(text="monty python", url_regex=re.compile("http.*python.org"))

# Return first link with exactly three HTML attributes.
find_link(predicate=lambda link: len(link.attrs) == 3)
Links include anchors (<a>), image maps (<area>), and frames (<frame>, <iframe>).
All arguments must be passed by keyword, not position. Zero or more arguments may be supplied. In order to find a link, all arguments supplied must match.
If a matching link is not found, mechanize.LinkNotFoundError is raised.
text: link text between link tags: eg. <a href="blah">this bit</a> (as returned by pullparser.get_compressed_text(), ie. without tags but with opening tags "textified" as per the pullparser docs) must compare equal to this argument, if supplied
text_regex: link text between tags (as defined above) must match the regular expression object or regular expression string passed as this argument, if supplied
name, name_regex: as for text and text_regex, but matched against the name HTML attribute of the link tag
url, url_regex: as for text and text_regex, but matched against the URL of the link tag (note this matches against Link.url, which is a relative or absolute URL according to how it was written in the HTML)
tag: element name of opening tag, eg. "a"
predicate: a function taking a Link object as its single argument, returning a boolean result, indicating whether the link matches
nr: matches the nth link that matches all other criteria (default 0)
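The matching rules for find_link() can be modelled in a few lines of plain Python. This is a minimal sketch, not mechanize's implementation: the Link tuple, the LinkNotFound exception and find_link_sketch() are hypothetical stand-ins, and only a subset of the keyword arguments (text, text_regex, url, url_regex, tag, predicate, nr) is covered. It shows that every supplied criterion must hold, that regular expressions are applied with .search(), and that nr counts only links that already match.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for mechanize.Link; only the fields used below.
Link = namedtuple("Link", "url text tag attrs")


class LinkNotFound(Exception):
    """Hypothetical stand-in for mechanize.LinkNotFoundError."""


def find_link_sketch(links, text=None, text_regex=None, url=None,
                     url_regex=None, tag=None, predicate=None, nr=0):
    """Return the nr-th link (0-based) that satisfies every supplied criterion."""

    def matches(link):
        # Each criterion is checked only if the caller supplied it.
        if text is not None and link.text != text:
            return False
        # re.search accepts either a compiled pattern or a pattern string.
        if text_regex is not None and not re.search(text_regex, link.text or ""):
            return False
        if url is not None and link.url != url:
            return False
        if url_regex is not None and not re.search(url_regex, link.url):
            return False
        if tag is not None and link.tag != tag:
            return False
        if predicate is not None and not predicate(link):
            return False
        return True

    remaining = nr
    for link in links:
        if matches(link):
            if remaining == 0:
                return link
            remaining -= 1  # skip this match and keep counting
    raise LinkNotFound()
```

With a list of three links whose texts are "monty python", "about pythons" and "Example", find_link_sketch(links, text_regex="python", nr=1) returns the second link: .search() finds "python" inside "about pythons", where .match() would not.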
follow_link(link=None, **kwds)
Find a link and .open() it.
Arguments are as for .click_link().
Return value is same as for Browser.open().
forms()
Return iterable over forms.
The returned form objects implement the ClientForm.HTMLForm interface.
geturl()
Get URL of current document.
links(**kwds)
Return iterable over links (mechanize.Link objects).
open(url, data=None)
reload()
Reload current document, and return response object.
response()
Return a copy of the current response.
The returned object has the same interface as the object returned by .open() (or urllib2.urlopen()).
retrieve(fullurl, filename=None, reporthook=None, data=None)
Returns (filename, headers).
For remote objects, the default filename will refer to a temporary file.
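The parameter list of retrieve() is the same as that of the standard library's urlretrieve function, so the return value and the progress callback can be tried without mechanize installed. The sketch below uses Python 3's urllib.request.urlretrieve as a stand-in; the helper name retrieve_with_progress is hypothetical, and that mechanize calls reporthook with the same three arguments is an assumption not confirmed by the text above.

```python
from urllib.request import urlretrieve


def retrieve_with_progress(url, filename=None):
    """Fetch url; return (filename, headers, progress_calls).

    Uses the standard library's urlretrieve as a stand-in for
    Browser.retrieve(); the reporthook convention is assumed to match.
    """
    progress = []

    def reporthook(block_count, block_size, total_size):
        # Called once before the first block and once after each block read.
        progress.append((block_count, block_size, total_size))

    saved_as, headers = urlretrieve(url, filename, reporthook)
    return saved_as, headers, progress
```

Passing an explicit filename makes the first element of the result that same path; leaving it as None for a remote URL yields the name of a temporary file instead.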
select_form(name=None, predicate=None, nr=None)
Select an HTML form for input.
This is a bit like giving a form the "input focus" in a browser.
If a form is selected, the Browser object supports the HTMLForm interface, so you can call methods like .set_value(), .set(), and .click().
At least one of the name, predicate and nr arguments must be supplied. If no matching form is found, mechanize.FormNotFoundError is raised.
If name is specified, then the form must have the indicated name.
If predicate is specified, then the form must match that function. The predicate function is passed the HTMLForm as its single argument, and should return a boolean value indicating whether the form matched.
nr, if supplied, is the sequence number of the form (where 0 is the first). Note that form 0 is the first form matching all the other arguments (if supplied); it is not necessarily the first form in the document.
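A typical use of select_form() is to pick a form, fill in some controls and submit it. The sketch below is written against any object that offers select_form(), set_value() and submit() as described here; the helper fill_and_submit() and the RecordingBrowser stub are hypothetical and exist only so the example runs without a server. The call set_value(value, name=...) follows the ClientForm.HTMLForm interface.

```python
def fill_and_submit(browser, form_name, values):
    """Select the form called form_name, set each named control, submit."""
    browser.select_form(name=form_name)
    for control_name, value in values.items():
        # Once a form is selected, HTMLForm methods are available on the browser.
        browser.set_value(value, name=control_name)
    return browser.submit()


class RecordingBrowser(object):
    """Hypothetical stub that records calls instead of doing HTTP."""

    def __init__(self):
        self.calls = []

    def select_form(self, name=None, predicate=None, nr=None):
        self.calls.append(("select_form", name))

    def set_value(self, value, name=None):
        self.calls.append(("set_value", name, value))

    def submit(self):
        self.calls.append(("submit",))
        return "response"
```

With a real browser object the same helper would raise mechanize.FormNotFoundError from select_form() if no form has the given name.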
set_cookiejar(cookiejar)
Set a mechanize.CookieJar, or None.
set_debug_http(handle)
Print HTTP headers to sys.stdout.
set_debug_redirects(handle)
Log information about HTTP redirects (including refreshes).
Logging is performed using module logging. The logger name is "mechanize.http_redirects". To actually print some debug output, eg:
import sys, logging
logger = logging.getLogger("mechanize.http_redirects")
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.INFO)
Other logger names relevant to this module:
"mechanize.http_responses"
"mechanize.cookies" (or "cookielib" if running Python 2.4)
To turn on everything:
import sys, logging
logger = logging.getLogger("mechanize")
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.INFO)
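The logging setup above can be wrapped in a pair of helpers so that debug output is easy to switch on and off again. This is a small sketch using only the standard logging module; the helper names enable_mechanize_logging() and disable_mechanize_logging() are hypothetical. The logger names are the ones listed above.

```python
import logging
import sys


def enable_mechanize_logging(name="mechanize", stream=None, level=logging.INFO):
    """Attach a stream handler to the named logger and return the handler."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler


def disable_mechanize_logging(handler, name="mechanize"):
    """Detach a handler previously returned by enable_mechanize_logging()."""
    logging.getLogger(name).removeHandler(handler)
```

Because logger names are hierarchical, enabling "mechanize" also captures records sent to "mechanize.http_redirects" and "mechanize.http_responses"; pass one of those names to narrow the output.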
set_debug_responses(handle)
Log HTTP response bodies.
See docstring for .set_debug_redirects() for details of logging.
set_handle_equiv(handle, head_parser_class=None)
Set whether to treat HTML http-equiv headers like HTTP headers.
Response objects will be .seek()able if this is set.
set_handle_gzip(handle)
Handle gzip transfer encoding.
set_handle_redirect(handle)
Set whether to handle HTTP 30x redirections.
set_handle_referer(handle)
Set whether to add Referer header to each request.
This base class does not implement this feature (so don't turn this on if you're using this base class directly), but the subclass mechanize.Browser does.
set_handle_refresh(handle, max_time=None, honor_time=True)
Set whether to handle HTTP Refresh headers.
set_handle_robots(handle)
Set whether to observe rules from robots.txt.
set_handled_schemes(schemes)
Set sequence of URL scheme (protocol) strings.
For example: ua.set_handled_schemes(["http", "ftp"])
If this fails (with ValueError) because you've passed an unknown scheme, the set of handled schemes will not be changed.
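The guarantee that a failed call leaves the handled schemes untouched amounts to validating the whole argument before changing any state. The following is a hypothetical model of that behaviour, not mechanize's code; the class name SchemeSet and its attributes are invented for illustration.

```python
class SchemeSet(object):
    """Hypothetical model of a user agent's handled-scheme bookkeeping."""

    def __init__(self, known, handled):
        self._known = set(known)      # schemes that have a handler class
        self.handled = set(handled)   # schemes currently switched on

    def set_handled_schemes(self, schemes):
        wanted = set(schemes)
        unknown = wanted - self._known
        if unknown:
            # Fail before touching self.handled so the old set survives.
            raise ValueError("unknown scheme(s): %s" % ", ".join(sorted(unknown)))
        self.handled = wanted
```

Calling set_handled_schemes(["http", "spam"]) on such an object raises ValueError and leaves handled exactly as it was before the call.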
set_password_manager(password_manager)
Set a mechanize.HTTPPasswordMgrWithDefaultRealm, or None.
set_proxies(proxies)
Set a dictionary mapping URL scheme to proxy specification, or None.
set_proxy_password_manager(password_manager)
Set a mechanize.HTTPProxyPasswordMgr, or None.
set_response(response)
Replace current response with (a copy of) response.
submit(*args, **kwds)
Submit current form.
Arguments are as for ClientForm.HTMLForm.click().
Return value is same as for Browser.open().
title()
Return title, or None if there is no title element in the document.
Tags are stripped or textified as described in docs for PullParser.get_text() method of pullparser module.
viewing_html()
Return whether the current response contains HTML data.