Python Requests Library Guide

The Requests library is used to make standard HTTP requests in Python. It abstracts the complexity behind the request into a nice, simple API so you can focus on interacting with the service and using the data in your application.

In this article, you’ll see some of the useful features that Requests provides, and how to customize and optimize them for different situations you might encounter. You’ll also learn how to use Requests effectively, and how to prevent requests for external services from slowing down your application.

In this tutorial, you will learn how to:

  • Make requests using the most common HTTP methods
  • Customize your requests’ headers and data, using the query string and message body
  • Inspect data from your requests and responses
  • Make authenticated requests
  • Configure your requests to help prevent your application from backing up or slowing down

While I try to include as much information as you need to understand the features and examples in this article, I do assume a basic general knowledge of HTTP.

Now let’s take a closer look at how to use requests in your application!

Getting started with requests

Let’s start by installing the Requests library. To do this, run the following command:

pip install requests

If you prefer to use Pipenv to manage Python packages, you can run the following command:

pipenv install requests

Once you install Requests, you can use it in your application. Import requests like this:

import requests

Now that you’re all set up, it’s time to begin your journey through Requests. Your first goal is to learn how to make a GET request.


A GET request

HTTP methods, such as GET and POST, determine what action is attempted when an HTTP request is made. In addition to GET and POST, there are other common methods that you will use later in this tutorial.

One of the most common HTTP methods is GET. The GET method indicates that you’re trying to get or retrieve data from a specified resource. To make a GET request, invoke requests.get().

To test this out, you can make a GET request to GitHub’s Root REST API by calling get() with the following URL:

>>> requests.get('https://api.github.com')
<Response [200]>

Congratulations! You’ve made your first request. Let’s dive a little deeper into the response to that request.


The response

A Response is a powerful object for inspecting the results of the request. Let’s make that same request again, but this time store the return value in a variable so that you can take a closer look at its attributes and methods:

>>> response = requests.get('https://api.github.com')

In this example, you’ve captured the return value of get(), which is an instance of Response, and stored it in a variable called response. You can now use response to see a lot of information about the results of your GET request.

Status code

The first bit of information that you can gather from Response is the status code. A status code informs you of the status of the request.

For example, a 200 OK status means that your request was successful, whereas a 404 NOT FOUND status means that the resource you were looking for was not found. There are many other possible status codes as well that give you specific insights into what happened with your request.

By accessing .status_code, you can see the status code that the server returned:

>>> response.status_code
200

A .status_code of 200 means that your request was successful and the server responded with the data you were requesting.

Sometimes, you might want to use this information to make decisions in your code:

if response.status_code == 200:
    print('Success!')
elif response.status_code == 404:
    print('Not Found.')

Following this logic, if the server returns a 200 status code, your program will print Success! If the result is 404, your program will print Not Found.

Requests goes one step further in simplifying this process for you. If you use a Response instance in a conditional expression, it evaluates to True if the status code is between 200 and 400, and False otherwise.

So you can simplify the previous example by rewriting the if statement:

if response:
    print('Success!')
else:
    print('An error has occurred.')

Technical detail: This truth value test is possible because __bool__() is an overloaded method on Response.

This means that the default behavior of Response has been redefined to take the status code into account when determining the truth value of the object.

Keep in mind that this method does not verify that the status code is equal to 200. The reason is that other status codes within the 200 to 400 range, such as 204 NO CONTENT and 304 NOT MODIFIED, are also considered successful in the sense that they provide some workable response.

For example, 204 tells you that the response was successful, but there is no content to return in the message body.

So, use this convenient shorthand only if you want to know whether the request was generally successful, and then, if necessary, handle the response appropriately based on the status code.
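To see the truth test in action without touching the network, here is a minimal sketch that builds Response objects by hand. The helper name fake_response is made up for this illustration; real responses come back from get() and friends.

```python
import requests


def fake_response(status_code):
    """Hypothetical helper: build a Response by hand so no network is needed."""
    response = requests.Response()
    response.status_code = status_code
    return response


# Map each status code to the truth value of a Response carrying it
truthiness = {
    code: bool(fake_response(code))
    for code in (200, 204, 304, 404, 500)
}

for code, is_truthy in truthiness.items():
    print(code, is_truthy)
```

Only the 4xx and 5xx codes come out falsy here, which is why the shorthand cannot tell a 200 apart from a 204 or a 304.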

Let’s say you don’t want to check the response’s status code in an if statement. Instead, you want to raise an exception if the request was unsuccessful. You can do this using .raise_for_status():

import requests
from requests.exceptions import HTTPError

for url in ['https://api.github.com', 'https://api.github.com/invalid']:
    try:
        response = requests.get(url)

        # If the response was successful, no exception will be raised
        response.raise_for_status()
    except HTTPError as http_err:
        print(f'HTTP error occurred: {http_err}')  # Python 3.6+
    except Exception as err:
        print(f'Other error occurred: {err}')  # Python 3.6+
    else:
        print('Success!')

If you invoke .raise_for_status(), an HTTPError will be raised for certain status codes, namely those in the 4xx and 5xx ranges. If the status code indicates a successful request, the program proceeds without raising that exception.
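The raised HTTPError keeps a reference to the response that caused it, which is handy when you want to branch on the exact status code inside the except block. The sketch below uses hand-built Response objects so it runs offline; the check() helper is a name invented for this example.

```python
import requests
from requests.exceptions import HTTPError


def check(status_code):
    """Hypothetical helper: report how raise_for_status() treats a status code."""
    response = requests.Response()
    response.status_code = status_code
    try:
        response.raise_for_status()
    except HTTPError as http_err:
        # The exception carries the offending response on .response
        return f'error {http_err.response.status_code}'
    return 'ok'


results = {code: check(code) for code in (200, 304, 404, 503)}
print(results)
```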

Further reading: If you’re not familiar with the f-strings introduced in Python 3.6, I encourage you to take advantage of them, as they are a great way to simplify string formatting.

You now know a lot about how to handle the status code of the response returned from the server. However, when you make a GET request, you rarely care only about the status code of the response. Often, you want to see more. Next, you’ll see how to view the actual data returned by the server in the response body.

Response content

The response to a GET request usually has some valuable information in the message body, called the payload. Using the properties and methods of Response, you can view the payload in various formats.

To see the response’s content in bytes, you use .content:

>>> response = requests.get('https://api.github.com')
>>> response.content
b'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'

While .content gives you access to the raw bytes of the response payload, you will often want to convert them into a string using a character encoding such as UTF-8. Requests will do that for you when you access .text:

>>> response.text
'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}",...}'

Because decoding bytes to a str requires an encoding scheme, Requests will try to guess the encoding based on the response’s headers if you do not specify one. You can provide an explicit encoding by setting .encoding before accessing .text:

>>> response.encoding = 'utf-8' # Optional: requests infers this internally
>>> response.text
'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}",...}'
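Why the encoding matters is easiest to see with plain bytes. The following sketch uses only built-in str and bytes methods, not Requests itself, and the sample word is an arbitrary choice: the same payload decodes to different text depending on the encoding you pick, which is the kind of mismatch that setting .encoding lets you correct.

```python
# Raw bytes as a server might send them (UTF-8 encoded text)
payload = 'café'.encode('utf-8')

# Decoding with the right and the wrong encoding gives different strings
as_utf8 = payload.decode('utf-8')
as_latin1 = payload.decode('latin-1')

# ascii() escapes non-ASCII characters so the difference is easy to see
print(ascii(as_utf8))
print(ascii(as_latin1))
```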

If you take a look at the response, you’ll see that it is actually serialized JSON content. To get a dictionary, you could take the str you retrieved from .text and deserialize it using json.loads(). However, a simpler way to accomplish this task is to use .json():

>>> response.json()
{'current_user_url': 'https://api.github.com/user', 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}', ...}

The return value of .json() is a dictionary, so you can access values in the object by key.
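As a rough offline check that the two routes agree, the sketch below fills in a Response by hand. Note the loud assumption: _content is a private attribute, set here only to avoid a network call, and you would not normally touch it in application code.

```python
import json

import requests

response = requests.Response()
response.status_code = 200
response.encoding = 'utf-8'
# Assumption: `_content` is private; it is set directly only for this offline demo
response._content = b'{"current_user_url": "https://api.github.com/user"}'

manual = json.loads(response.text)  # the long way
shortcut = response.json()          # the shortcut

print(shortcut['current_user_url'])
```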

You can do a lot with status codes and message bodies. But if you need more information, such as metadata about the response itself, you’ll need to look at the response’s headers.

Response headers

The response headers can give you useful information, such as the content type of the response payload and a time limit on how long to cache the response. To view these headers, access .headers:

>>> response.headers
{'Server': 'GitHub.com', 'Date': 'Mon, 10 Dec 2018 17:49:54 GMT', 'Content-Type': 'application/json; charset=utf-8', ...}

.headers returns a dictionary-like object, allowing you to access header values by key. For example, to see the content type of the response payload, you can access Content-Type:

>>> response.headers['Content-Type']
'application/json; charset=utf-8'

However, there is something special about this dictionary-like header object. The HTTP specification defines headers as case insensitive, which means we can access these headers without worrying about their case:

>>> response.headers['content-type']
'application/json; charset=utf-8'

Whether you use the key 'content-type' or 'Content-Type', you’ll get the same value.
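The dictionary-like object behind .headers is CaseInsensitiveDict from requests.structures. You can try it on its own, without making a request; the header value below is simply copied from the earlier output.

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict(
    {'Content-Type': 'application/json; charset=utf-8'}
)

# Every spelling of the key finds the same value
lookups = [
    headers['Content-Type'],
    headers['content-type'],
    headers['CONTENT-TYPE'],
]
print(lookups)

# Iterating still shows the key with its original casing
print(list(headers))
```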

You have now learned the basics of Response. You’ve already seen its most useful properties and methods. Let’s step back and see how your response changes when you customize GET requests.


Query string parameters

One common way to customize a GET request is to pass values through query string parameters in the URL. To do this using get(), you pass data to params. For example, you can use GitHub’s Search API to look for the requests library:

import requests

# Search GitHub's repositories for requests
response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
)

# Inspect some attributes of the `requests` repository
json_response = response.json()
repository = json_response['items'][0]
print(f'Repository name: {repository["name"]}')  # Python 3.6+
print(f'Repository description: {repository["description"]}')  # Python 3.6+

By passing the dictionary {'q': 'requests+language:python'} to the params parameter of .get(), you can modify the results that come back from the Search API.

You can pass params to get() in the form of a dictionary, as you have just done, or as a list of tuples:

>>> requests.get(
...     'https://api.github.com/search/repositories',
...     params=[('q', 'requests+language:python')],
... )
<Response [200]>

You can even pass bytes as a value:

>>> requests.get(
...     'https://api.github.com/search/repositories',
...     params=b'q=requests+language:python',
... )
<Response [200]>
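If you are curious what URL Requests actually builds from params, you can prepare a request without sending it. This is a small sketch using requests.Request and its prepare() method; it compares the dictionary and list-of-tuples forms and decodes the resulting query string with the standard library.

```python
from urllib.parse import parse_qs, urlsplit

import requests

base_url = 'https://api.github.com/search/repositories'

# Build (but do not send) the same request from a dict and from a list of tuples
from_dict = requests.Request(
    'GET', base_url, params={'q': 'requests+language:python'}
).prepare()
from_tuples = requests.Request(
    'GET', base_url, params=[('q', 'requests+language:python')]
).prepare()

print(from_dict.url)

# Decode the query string back into Python values
query = parse_qs(urlsplit(from_dict.url).query)
print(query)
```

Both forms should yield the same percent-encoded URL, and decoding it gives back the original value.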

Query strings are useful for parameterizing GET requests. You can also customize your requests by adding or modifying the headers you send.


Request header

To customize headers, you pass a dictionary of HTTP headers to get() using the headers parameter. For example, you can change your previous search request to highlight matching search terms in the results by specifying the text-match media type in the Accept header:

import requests

response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
    headers={'Accept': 'application/vnd.github.v3.text-match+json'},
)

# View the new `text-matches` array which provides information
# about your search term within the results
json_response = response.json()
repository = json_response['items'][0]
print(f'Text matches: {repository["text_matches"]}')

The Accept header tells the server what content types your application can handle. Since you’re expecting the matching search terms to be highlighted, you’re using the header value application/vnd.github.v3.text-match+json, which is a proprietary GitHub Accept header where the content is a special JSON format.

Before you learn more about customizing requests, let’s broaden our horizons by exploring other HTTP methods.


Other HTTP methods

Aside from GET, other popular HTTP methods include POST, PUT, DELETE, HEAD, PATCH, and OPTIONS. Requests provides a function, with a similar signature to get(), for each of these HTTP methods:

>>> requests.post('https://httpbin.org/post', data={'key':'value'})
>>> requests.put('https://httpbin.org/put', data={'key':'value'})
>>> requests.delete('https://httpbin.org/delete')
>>> requests.head('https://httpbin.org/get')
>>> requests.patch('https://httpbin.org/patch', data={'key':'value'})
>>> requests.options('https://httpbin.org/get')

Each function call makes a request to the httpbin service using the corresponding HTTP method. For each method, you can inspect the response in the same way you did before:

>>> response = requests.head('https://httpbin.org/get')
>>> response.headers['Content-Type']
'application/json'

>>> response = requests.delete('https://httpbin.org/delete')
>>> json_response = response.json()
>>> json_response['args']
{}

Headers, response bodies, status codes, and more are returned in the Response for each method. Next, you’ll take a closer look at the POST, PUT, and PATCH methods and learn how they differ from the other request types.


The message body

According to the HTTP specification, POST, PUT, and the less common PATCH requests pass their data through the message body rather than through parameters in the query string. Using Requests, you pass the payload to the corresponding function’s data parameter.

data takes a dictionary, a list of tuples, bytes, or a file-like object. You’ll want to adapt the data you send in the body of your request to the specific needs of the service you’re interacting with.

For example, if your request’s content type is application/x-www-form-urlencoded, you can send the form data as a dictionary:

>>> requests.post('https://httpbin.org/post', data={'key':'value'})
<Response [200]>

You can also send the same data as a list of tuples:

>>> requests.post('https://httpbin.org/post', data=[('key', 'value')])
<Response [200]>

However, if you need to send JSON data, you can use the json parameter. When you pass JSON data via json, Requests will serialize your data and add the correct Content-Type header for you.

httpbin.org is a great resource created by the author of Requests, Kenneth Reitz. It’s a service that accepts test requests and responds with data about the requests. For instance, you can use it to inspect a basic POST request:

>>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
>>> json_response = response.json()
>>> json_response['data']
'{"key": "value"}'
>>> json_response['headers']['Content-Type']
'application/json'

You can see from the response that the server received the request data and headers when you sent the request. Requests also provides this information to you in a PreparedRequest.


Check your request

When you make a request, the Requests library prepares the request before it is actually sent to the target server. Request preparation includes things like validating header information and serializing JSON content.

You can inspect the PreparedRequest by accessing .request:

>>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
>>> response.request.headers['Content-Type']
'application/json'
>>> response.request.url
'https://httpbin.org/post'
>>> response.request.body
b'{"key": "value"}'

Inspecting the PreparedRequest gives you access to all kinds of information about the request being made, such as the payload, URL, headers, authentication, and more.
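You do not have to send anything to get a PreparedRequest. As a sketch, you can build one yourself with requests.Request(...).prepare(), which makes it easy to compare what the data and json parameters put into the body and the Content-Type header:

```python
import json

import requests

url = 'https://httpbin.org/post'

# Prepare (but do not send) one form-encoded and one JSON request
form_request = requests.Request('POST', url, data={'key': 'value'}).prepare()
json_request = requests.Request('POST', url, json={'key': 'value'}).prepare()

print(form_request.headers['Content-Type'], form_request.body)
print(json_request.headers['Content-Type'], json_request.body)

# The JSON body round-trips back to the original dictionary
decoded = json.loads(json_request.body)
```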

So far, you’ve sent many different types of requests, but they all have one thing in common: they’re unauthenticated requests to a public API. Many of the services you encounter will probably want you to authenticate in some way.


Authentication

Authentication helps a service understand who you are. Typically, you provide your credentials to a server by passing data through the Authorization header or a custom header defined by the service. All the request functions you’ve seen to this point provide a parameter called auth, which allows you to pass your credentials.

One example of an API that requires authentication is GitHub’s Authenticated User API. This endpoint provides information about the authenticated user’s profile. To make a request to the Authenticated User API, you can pass your GitHub username and password in a tuple to get():

>>> from getpass import getpass
>>> requests.get('https://api.github.com/user', auth=('username', getpass()))
<Response [200]>

The request succeeds if the credentials you passed in the tuple to auth are valid. If you try to make this request with no credentials, you’ll see that the status code is 401 Unauthorized:

>>> requests.get('https://api.github.com/user')
<Response [401]>

When you pass your username and password in a tuple to the auth parameter, Requests applies the credentials using HTTP’s Basic access authentication scheme under the hood.

Therefore, you can make the same request by passing explicit basic authentication credentials using HTTPBasicAuth:

>>> from requests.auth import HTTPBasicAuth
>>> from getpass import getpass
>>> requests.get(
...     'https://api.github.com/user',
...     auth=HTTPBasicAuth('username', getpass())
... )
<Response [200]>
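To see what Basic authentication puts on the wire, you can prepare a request with an auth tuple and inspect the resulting header. The credentials below are placeholders for illustration only. Note that the value is merely Base64-encoded, not encrypted, so Basic auth should only travel over HTTPS.

```python
import base64

import requests

# Placeholder credentials; nothing is sent over the network here
prepared = requests.Request(
    'GET',
    'https://api.github.com/user',
    auth=('username', 'not-a-real-password'),
).prepare()

# Basic auth is "username:password", Base64-encoded, behind the word "Basic"
expected = 'Basic ' + base64.b64encode(
    b'username:not-a-real-password'
).decode('ascii')

print(prepared.headers['Authorization'])
```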

While you don’t need to explicitly do basic authentication, you may want to use other methods for authentication. Requests provides additional authentication methods out of the box, such as HTTPDigestAuth and HTTPProxyAuth.

You can even supply your own authentication mechanism. To do so, you must first create a subclass of AuthBase. Then, you implement __call__():

import requests
from requests.auth import AuthBase

class TokenAuth(AuthBase):
    """Implements a custom authentication scheme."""

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        """Attach an API token to a custom auth header."""
        r.headers['X-TokenAuth'] = f'{self.token}'  # Python 3.6+
        return r


requests.get('https://httpbin.org/get', auth=TokenAuth('12345abcde-token'))

Here, your custom TokenAuth mechanism receives a token, then includes that token in the X-TokenAuth header of your request.

Bad authentication mechanisms can lead to security vulnerabilities, so unless a service requires a custom authentication mechanism for some reason, you’ll always want to use a tried-and-true auth scheme like Basic or OAuth.
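Many OAuth 2.0 services expect the token in a standard Authorization: Bearer header rather than a custom one. Requests does not ship a class for this, so the BearerAuth class below is a hypothetical helper written for this sketch, following the same AuthBase pattern. Preparing the request lets you check the header without sending anything.

```python
import requests
from requests.auth import AuthBase


class BearerAuth(AuthBase):
    """Hypothetical helper: send a token as 'Authorization: Bearer <token>'."""

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        # Attach the token using the standard Bearer scheme
        r.headers['Authorization'] = f'Bearer {self.token}'
        return r


# Prepare (but do not send) a request to see the header that would go out
prepared = requests.Request(
    'GET', 'https://httpbin.org/get', auth=BearerAuth('12345abcde-token')
).prepare()
print(prepared.headers['Authorization'])
```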

While you’re thinking about security, let’s consider dealing with SSL certificates using Requests.


SSL Certificate Verification

Any time the data you’re trying to send or receive is sensitive, security is important. The way you communicate securely with a site over HTTP is by establishing an encrypted connection using SSL, which means that verifying the target server’s SSL certificate is critical.

The good news is that Requests does this for you by default. However, in some cases, you may want to change this behavior.

If you want to disable SSL certificate validation, pass False to the request function’s verify argument:

>>> requests.get('https://api.github.com', verify=False)
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
<Response [200]>

Requests even warns you when you’re making an insecure request, to help you keep your data safe.
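Rather than turning verification off, verify can also be given the path to a CA bundle file, for example when a service uses a certificate signed by an internal authority. The sketch below is only an illustration: as a stand-in for your own bundle, it reuses the path reported by requests.certs.where(), the bundle Requests uses by default.

```python
import requests
from requests import certs

session = requests.Session()

# Verification is on by default
default_verify = session.verify

# Assumption: the default bundle path stands in for your own CA bundle file
ca_bundle_path = certs.where()
session.verify = ca_bundle_path

print(default_verify)
print(session.verify)
```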


Performance

When using Requests, especially in a production application environment, it’s important to consider performance implications. Features like timeout control, sessions, and retry limits can help you keep your application running smoothly.

Timeout control

When you make an inline request to an external service, your system needs to wait for the response before moving on. If your application waits too long for that response, requests to your service could back up, your user experience could suffer, or your background jobs could hang.

By default, Requests will wait indefinitely for a response, so you should almost always specify a timeout to prevent these things from happening. To set the request’s timeout, use the timeout parameter. timeout can be an integer or a float representing the number of seconds to wait for a response before timing out:

>>> requests.get('https://api.github.com', timeout=1)
<Response [200]>
>>> requests.get('https://api.github.com', timeout=3.05)
<Response [200]>

In the first request, the request will time out after 1 second. In the second request, the request will time out after 3.05 seconds.

You can also pass a tuple to timeout, with the first element being the connect timeout (the time the client is allowed to take to establish a connection to the server) and the second being the read timeout (the time it will wait for a response once your client has established a connection):

>>> requests.get('https://api.github.com', timeout=(2, 5))
<Response [200]>

If the request establishes a connection within 2 seconds and receives data within 5 seconds of the connection being established, then the response is returned as before. If the request times out, then the function raises a Timeout exception:

import requests
from requests.exceptions import Timeout

try:
    response = requests.get('https://api.github.com', timeout=1)
except Timeout:
    print('The request timed out')
else:
    print('The request did not time out')

Your program can catch Timeout exceptions and respond accordingly.
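Two details are worth spelling out. A single number applies to both the connect and the read phase, and the Timeout exception is the parent of the more specific ConnectTimeout and ReadTimeout, so catching Timeout covers both. The split_timeout() function below is a hypothetical helper written for this sketch to mirror that rule; it is not part of Requests.

```python
from requests.exceptions import (
    ConnectTimeout,
    ReadTimeout,
    RequestException,
    Timeout,
)


def split_timeout(timeout):
    """Hypothetical helper: return the (connect, read) pair a timeout implies."""
    if isinstance(timeout, tuple):
        connect, read = timeout
        return connect, read
    # A single number is used for both phases
    return timeout, timeout


print(split_timeout(1))
print(split_timeout((2, 5)))

# Catching Timeout handles both of the more specific exceptions
covers_both = issubclass(ConnectTimeout, Timeout) and issubclass(ReadTimeout, Timeout)
print(covers_both)
```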

The Session object

Until now, you’ve been dealing with high-level requests APIs such as get() and post(). These functions are abstractions of what’s going on when you make your requests. They hide implementation details, such as how connections are managed, so that you don’t have to worry about them.

Underneath these abstractions is a class called Session. If you need to fine-tune your control of requests or improve their performance, you might want to use Session instances directly.

Sessions are used to persist parameters across requests. For example, if you want to use the same authentication across multiple requests, you could use a session:

import requests
from getpass import getpass

# By using a context manager, you can ensure the resources used by
# the session will be released after use
with requests.Session() as session:
    session.auth = ('username', getpass())

    # Instead of requests.get(), you'll use session.get()
    response = session.get('https://api.github.com/user')

# You can inspect the response just like you did before
print(response.headers)
print(response.json())

Once session has been initialized with authentication credentials, those credentials are persisted for every request you make with it.

The primary performance optimization for sessions comes in the form of persistent connections. When your application establishes a connection to a server using a Session, it maintains that connection in the connection pool. When your application wants to connect to the same server again, it will reuse connections in the pool instead of making new connections.
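Credentials are not the only thing a session can carry; default headers work the same way. As a sketch, Session.prepare_request() shows how session-level settings are merged into an individual request without sending it. The credentials are placeholders, and the Accept value is just an example media type.

```python
import requests

with requests.Session() as session:
    # Placeholder credentials and an example default header
    session.auth = ('username', 'not-a-real-password')
    session.headers.update({'Accept': 'application/vnd.github.v3+json'})

    # Merge the session's settings into a request without sending it
    prepared = session.prepare_request(
        requests.Request('GET', 'https://api.github.com/user')
    )

print(prepared.headers['Accept'])
print('Authorization' in prepared.headers)
```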

Max retries

When a request fails, you might want the application to retry the same request. However, by default, Requests will not do this for you. To apply this feature, you need to implement a custom Transport Adapter.

With Transport Adapters, you can define a set of configurations for each service that you interact with. For example, suppose you want all requests to https://api.github.com to be retried three times before finally throwing a ConnectionError. You will build a Transport Adapter, set its max_retries parameter, and load it into an existing Session:

import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError

github_adapter = HTTPAdapter(max_retries=3)

session = requests.Session()

# Use `github_adapter` for all requests to endpoints that start with this URL
session.mount('https://api.github.com', github_adapter)

try:
    session.get('https://api.github.com')
except ConnectionError as ce:
    print(ce)

When you mount the HTTPAdapter, github_adapter, to session, session will adhere to its configuration for each request to https://api.github.com.
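A plain integer for max_retries only covers failed connection attempts. If you also want a delay between attempts, or retries on certain server error codes, max_retries accepts a Retry object from urllib3, the library Requests is built on. The values below are illustrative choices rather than recommendations.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Illustrative policy: up to 3 retries, growing delays, retry on these 5xx codes
retry_policy = Retry(
    total=3,
    backoff_factor=0.5,
    status_forcelist=[500, 502, 503, 504],
)
github_adapter = HTTPAdapter(max_retries=retry_policy)

session = requests.Session()
session.mount('https://api.github.com', github_adapter)

# The session picks an adapter by matching the URL prefix
chosen = session.get_adapter('https://api.github.com/user')
other = session.get_adapter('https://httpbin.org/get')
print(chosen is github_adapter, other is github_adapter)
```

Only URLs that start with the mounted prefix use the custom adapter; everything else falls back to the session’s default adapters.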

Timeouts, Transport Adapters, and Sessions are used to keep code efficient and applications robust.


Conclusion

You’ve come a long way in learning Python’s powerful Requests library.

You can now:

  • Make requests using a variety of HTTP methods, such as GET, POST, and PUT
  • Customize your requests by modifying headers, authentication, query strings, and message bodies
  • Inspect the data you send to the server and the data the server sends back to you
  • Work with SSL certificate verification
  • Use requests effectively with max_retries, timeout, Sessions, and Transport Adapters

Because you’ve learned how to use Requests, you’re equipped to explore the wide world of web services and build great applications using the fascinating data they provide.
