Using APIs with Python Requests Module
One of the most liked features of the newly launched HackerEarth profile is account connections, through which you can boast about your coding activity on various platforms.
Github and StackOverflow provide APIs to pull out various kinds of data. The API documentation of Github and StackOverflow can be found here.
But what do we use to communicate with these APIs?
Talking to these APIs over raw HTTP is a painful task. Python includes a module called urllib2, but working with it can become cumbersome.
Here is a gist by @kenneth himself distinguishing urllib2 and requests.
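That gist isn't reproduced verbatim here, but a minimal sketch of the same comparison, assuming Python 2 and placeholder basic-auth credentials against the GitHub API, looks roughly like this:

# urllib2: build a password manager, an auth handler and an opener
# just to make one authenticated GET request.
import urllib2

gh_url = 'https://api.github.com'

req = urllib2.Request(gh_url)

password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_manager.add_password(None, gh_url, 'user', 'pass')

auth_manager = urllib2.HTTPBasicAuthHandler(password_manager)
opener = urllib2.build_opener(auth_manager)
urllib2.install_opener(opener)

handler = urllib2.urlopen(req)
print(handler.getcode())
print(handler.headers.getheader('content-type'))

# requests: the same authenticated GET in one call.
import requests

r = requests.get('https://api.github.com', auth=('user', 'pass'))
print(r.status_code)
print(r.headers['content-type'])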
So, the code above clearly shows why we went for the requests module.
Installing Requests via pip is fairly simple; just run this in your terminal:
$ pip install requests
Making your first Request
First of all, you need to import requests
>>> import requests
Now let's make a GET request to fetch Github's public timeline:
>>> r = requests.get('https://github.com/timeline.json')
Now we have a Response object called r, from which we can get all the information we need.
Requests’ simple API means that all forms of HTTP request are as obvious. For example, this is how you make an HTTP POST request:
>>> r = requests.post("http://httpbin.org/post")
What about the other HTTP request types: PUT, DELETE, HEAD and OPTIONS?
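These are just as simple; for instance, against the same httpbin.org test endpoints:

>>> r = requests.put("http://httpbin.org/put")
>>> r = requests.delete("http://httpbin.org/delete")
>>> r = requests.head("http://httpbin.org/get")
>>> r = requests.options("http://httpbin.org/get")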
Now, let's consider the GitHub timeline again:
Requests will automatically decode content from the server. The text encoding guessed by Requests is used when you access r.text. You can find out what encoding Requests is using, and change it, using the r.encoding property:
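A quick illustration (the exact encoding you see depends on what the server sends back):

>>> r = requests.get('https://github.com/timeline.json')
>>> r.text             # the decoded body, as a unicode string
>>> r.encoding         # the encoding Requests guessed
'utf-8'
>>> r.encoding = 'ISO-8859-1'   # override it; r.text will be re-decoded with this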
There's a built-in JSON decoder that we rely on heavily:
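r.json() parses the response body and returns the corresponding Python object (a list or a dict, depending on the payload), and it complains if the body isn't valid JSON:

>>> data = r.json()    # e.g. a list of timeline events for the request above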
Passing parameters in URLs
You might often need to pass parameters. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val
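Requests lets you provide them as a dictionary, using the params keyword argument. For example, to pass key1=value1 and key2=value2 to httpbin.org/get:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get("http://httpbin.org/get", params=payload)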
You can see that the URL has been correctly encoded by printing the URL:
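For the payload above, the encoded URL comes out something like this (key order may differ, since the dict is unordered):

>>> print(r.url)
http://httpbin.org/get?key2=value2&key1=value1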
Let's take a similar use case from HackerEarth where we get information related to a user's repositories. GitHub's pagination docs say that, by default, a request that returns multiple items will be paginated to 30 items. We can set a custom page size with the ?per_page parameter, and if you want to request a specific page you need to pass the ?page parameter. The page numbering is 1-based, and omitting the ?page parameter will return the first page. So, the URL for requesting 100 items from the second page might look like https://api.github.com/users/500628/repos?page=2&per_page=100.
Let's perform this via requests.
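A sketch of that request with the params argument (the user id is the one used later in this post; the parameter order in the encoded URL may differ):

>>> payload = {'page': 2, 'per_page': 100}
>>> r = requests.get('https://api.github.com/users/500628/repos', params=payload)
>>> print(r.url)
https://api.github.com/users/500628/repos?page=2&per_page=100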
If you’d like to add HTTP headers to a request, simply pass in a dict to the headers parameter.
For example, we didn’t specify our content-type in the previous example:
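For instance, to set it explicitly on the repository request (the header value here is just illustrative):

>>> headers = {'content-type': 'application/json'}
>>> r = requests.get('https://api.github.com/users/500628/repos', headers=headers)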
Let's take the earlier example of fetching data from repositories. Most APIs require an access token for requesting data. The access token needs to be added to the HTTP headers.
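For GitHub, the token goes in the Authorization header; the value below is a placeholder for your own token:

>>> headers = {'Authorization': 'token <YOUR-ACCESS-TOKEN>'}
>>> r = requests.get('https://api.github.com/users/500628/repos', headers=headers)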
Response Status Codes
We can check the status code of the response using:
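For the successful repository request above:

>>> r.status_code
200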
If we made a bad request (a 4XX client error or 5XX server error response), we can raise it with Response.raise_for_status():
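For example, hitting an httpbin endpoint that deliberately returns a 404 (traceback abbreviated):

>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404
>>> bad_r.raise_for_status()
Traceback (most recent call last):
  ...
requests.exceptions.HTTPError: 404 Client Error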
We can view the server’s response headers using a Python dictionary:
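The exact headers depend on the server; for a GitHub API response they look something like this:

>>> r.headers
{'content-type': 'application/json; charset=utf-8', 'server': 'GitHub.com', ...}
>>> r.headers['content-type']
'application/json; charset=utf-8'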
Let's take the earlier repository example again. Github uses pagination in their API.
So, we parsed the next URL out of the Link header:
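GitHub puts these pagination links in the Link response header; for the repository request it looks roughly like this (here with ten items per page, matching the output below):

>>> r.headers['link']
'<https://api.github.com/users/500628/repos?page=2&per_page=10>; rel="next", <https://api.github.com/users/500628/repos?page=6&per_page=10>; rel="last"'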
But requests has an intuitive way to do it: r.links parses the Link header into a dictionary keyed by rel.
>>> r.links['next']['url']
'https://api.github.com/users/500628/repos?page=2&per_page=10'
>>> r.links['last']['url']
'https://api.github.com/users/500628/repos?page=6&per_page=10'
You can tell requests to stop waiting for a response after a given number of seconds with the timeout parameter:
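For example, to give up if GitHub hasn't answered within five seconds:

>>> r = requests.get('https://api.github.com/users/500628/repos', timeout=5)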
Errors and Exceptions
In case of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a ConnectionError exception.
In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception.
If a request times out, a Timeout exception is raised.
If a request exceeds the configured number of maximum redirections, a
TooManyRedirects exception is raised.
All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.
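Putting it together, a defensive version of the repository request might look like this (a minimal sketch; the URL and timeout are just the ones used earlier):

import requests
from requests.exceptions import ConnectionError, HTTPError, Timeout, RequestException

try:
    r = requests.get('https://api.github.com/users/500628/repos', timeout=5)
    r.raise_for_status()            # turn 4XX/5XX responses into HTTPError
except Timeout:
    print('The request timed out')
except ConnectionError:
    print('Network problem (DNS failure, refused connection, ...)')
except HTTPError as e:
    print('Bad HTTP status: %s' % e)
except RequestException as e:
    print('Request failed: %s' % e)
else:
    repos = r.json()
    print('Fetched %d repositories' % len(repos))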
References - http://docs.python-requests.org/en/latest/