Module
urllib2

An extensible library for opening URLs using a variety of protocols

The simplest way to use this module is to call the urlopen function, which accepts a string containing a URL or a Request object (described below). It opens the URL and returns the result as a file-like object; the returned object has some extra methods described below.
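As a minimal sketch of this call, the snippet below opens a local file: URL so that it needs no network access. The temporary file and the read_url helper are illustrative names, not part of the module. The sketch assumes a POSIX-style absolute path, and the import fallback is only there so that it also runs on Python 3, where urllib2 was merged into urllib.request.

```python
import os
import tempfile

try:
    import urllib2
except ImportError:  # assumption: on Python 3 the same API is in urllib.request
    import urllib.request as urllib2


def read_url(url):
    """Open url with urlopen and return (final_url, body). Hypothetical helper."""
    f = urllib2.urlopen(url)
    try:
        # the returned object is file-like, with extras such as geturl()
        return f.geturl(), f.read()
    finally:
        f.close()


# write a small temporary file and open it through a file: URL
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from urlopen")
os.close(fd)
try:
    final_url, body = read_url("file://" + path)
finally:
    os.remove(path)
```

The same read_url call should work for an http: URL, given network access.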

The OpenerDirector manages a collection of Handler objects that do all the actual work. Each Handler implements a particular protocol or option. The OpenerDirector is a composite object that invokes the Handlers needed to open the requested URL. For example, the HTTPHandler performs HTTP GET and POST requests and deals with non-error returns. The HTTPRedirectHandler automatically deals with HTTP 301, 302, 303 and 307 redirect errors, and the HTTPDigestAuthHandler deals with digest authentication.
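The division of labour described above can be sketched with a toy handler. EchoHandler and the "echo:" scheme are made up for illustration; the sketch relies only on the convention that an OpenerDirector dispatches a request to a handler method named after the URL scheme plus "_open". The import fallback is an assumption so that the code also runs on Python 3.

```python
import io

try:
    import urllib2
except ImportError:  # assumption: on Python 3 the same API is in urllib.request
    import urllib.request as urllib2


class EchoHandler(urllib2.BaseHandler):
    """Hypothetical handler for a made-up "echo:" scheme."""

    def echo_open(self, req):
        # the OpenerDirector finds this method by its name: <scheme>_open
        text = req.get_full_url()[len("echo:"):]
        # any file-like object will do for this sketch; real handlers
        # usually return one that also offers info() and geturl()
        return io.BytesIO(text.encode("ascii"))


# a bare OpenerDirector with only the toy handler registered
opener = urllib2.OpenerDirector()
opener.add_handler(EchoHandler())
reply = opener.open("echo:hello").read()
```

Here the opener has no default handlers at all, so any scheme other than "echo:" would go unanswered.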

urlopen(url, data=None)
Basic usage is the same as the original urllib: pass the URL and, optionally, data to POST to an HTTP URL, and get a file-like object back. One difference is that you can also pass a Request instance instead of a URL. Raises a URLError (a subclass of IOError); for HTTP errors, raises an HTTPError, which can also be treated as a valid response.
build_opener
Function that creates a new OpenerDirector instance and installs the default handlers. Accepts one or more Handlers as arguments, either instances or Handler classes that it will instantiate. If one of the arguments is a subclass of a default handler, the argument is installed instead of the default.
install_opener
Installs a new opener as the default opener, the one used by urlopen.
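The replacement rule of build_opener can be sketched as follows. LoggingRedirectHandler is a hypothetical subclass written for this example, and opener.handlers is an implementation detail that is read here only to inspect the result. The import fallback is an assumption so that the code also runs on Python 3.

```python
try:
    import urllib2
except ImportError:  # assumption: on Python 3 the same API is in urllib.request
    import urllib.request as urllib2


class LoggingRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical subclass that records each redirect target."""

    def __init__(self):
        self.seen = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.seen.append(newurl)
        return urllib2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


# a class is accepted as well as an instance; build_opener instantiates it
opener = urllib2.build_opener(LoggingRedirectHandler)

# collect the classes of the installed handlers for inspection
handler_types = [h.__class__ for h in opener.handlers]
```

With this opener the stock HTTPRedirectHandler should be absent from handler_types, while the other defaults remain. Passing the opener to install_opener would make urlopen use it as well.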

objects of interest: OpenerDirector --

Request
An object that encapsulates the state of a request. The state can be as simple as the URL. It can also include extra HTTP headers, e.g. a User-Agent.
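A short sketch of both cases, built without sending anything over the network. The POST data and the User-Agent value are illustrative, and the import fallback is an assumption so that the code also runs on Python 3.

```python
try:
    import urllib2
except ImportError:  # assumption: on Python 3 the same API is in urllib.request
    import urllib.request as urllib2

# URL only: the simplest possible request state
plain = urllib2.Request("http://www.python.org/")

# URL plus POST data and an extra header (values are placeholders)
req = urllib2.Request(
    "http://www.python.org/",
    data=b"q=urllib2",
    headers={"User-Agent": "example-client/0.1"},
)

method_plain = plain.get_method()    # no data, so a GET
method_post = req.get_method()       # data present, so a POST
# header names are stored capitalized, hence "User-agent"
agent = req.get_header("User-agent")
```

Either object can be passed to urlopen in place of a URL string.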

BaseHandler --

exceptions: URLError -- a subclass of IOError; individual protocols have their own specific subclasses

HTTPError -- also a valid HTTP response, so you can treat an HTTP error either as an exceptional event or as a valid response
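One way to use this dual nature is sketched below. The fetch helper and its open_func parameter are hypothetical, introduced only so that the example can be exercised without a network connection. The import fallback is an assumption so that the code also runs on Python 3.

```python
try:
    import urllib2
except ImportError:  # assumption: on Python 3 the same API is in urllib.request
    import urllib.request as urllib2


def fetch(url, data=None, open_func=urllib2.urlopen):
    """Return (status, body), treating HTTP error responses as results.

    Hypothetical helper; open_func exists only to allow substitution
    of the opening function, e.g. in tests.
    """
    try:
        response = open_func(url, data)
    except urllib2.HTTPError as err:
        # HTTPError is also a response: it carries a status and a body
        return err.code, err.read()
    # other URLError cases (no response at all) propagate to the caller
    return response.getcode(), response.read()
```

A caller that prefers to treat HTTP errors as exceptional events would simply not catch HTTPError and let it propagate like any other URLError.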

internals: BaseHandler and parent _call_chain conventions

Example usage:

import urllib2

# set up authentication info
# (realm, host, username and password are placeholders for your own values)
authinfo = urllib2.HTTPBasicAuthHandler()
authinfo.add_password(realm, host, username, password)

proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib2.build_opener(proxy_support, authinfo, urllib2.CacheFTPHandler)

# install it
urllib2.install_opener(opener)

f = urllib2.urlopen("http://www.python.org/")