Review

Before using Python for network programming, we usually go through the following steps:

  1. Learn the basics of computer networking, including the OSI layers and the TCP/IP protocol layers
  2. Learn about the network libraries that Python supports, such as the urllib module

In this installment, we'll learn about the urllib3 module, which is more powerful than Python's built-in urllib module.

1. Overview of the urllib3 module

What is the urllib3 module?

urllib3 is a third-party Python HTTP client library that provides more network operations than Python's native library.

The PyPI page for urllib3 lists eight key features:

  • Thread safety
  • Connection pooling
  • Client-side SSL/TLS verification
  • File uploads with multipart encoding
  • Helpers for retrying requests and dealing with HTTP redirects
  • Support for gzip, deflate, and brotli encoding
  • Proxy support for HTTP and SOCKS
  • 100% test coverage

See the urllib3 user manual for details.
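
To make the thread safety and connection pooling features a little more concrete, here is a minimal sketch in which several threads share one PoolManager. The local test server, the HelloHandler class, the fetch helper, and the /item/ paths are all assumptions made up for this illustration so that it runs without Internet access.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server that answers every GET with 200."""

    def do_GET(self):
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


# Port 0 asks the OS for any free port
server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# One PoolManager shared by all threads; maxsize is the number of
# connections the pool may keep per host.
http = urllib3.PoolManager(maxsize=4)


def fetch(i):
    """Send one GET request and return its status code."""
    return http.request("GET", f"{base}/item/{i}", timeout=5.0).status


with ThreadPoolExecutor(max_workers=4) as pool:
    statuses = list(pool.map(fetch, range(8)))

print(statuses)

server.shutdown()
server.server_close()
```

The point of the sketch is only that a single PoolManager can be created once and reused from several threads, instead of creating a new one per request.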

Installing urllib3

Method 1: Install with pip from the command line

pip install urllib3

Method 2: Clone the repository with Git and install from source

git clone https://github.com/urllib3/urllib3.git
cd urllib3
pip install .

We also need to import urllib3 in our code

import urllib3

2. urllib3 methods

The urllib3 user manual shows the following common method:

Method                                                          Role
urllib3.PoolManager.request(method, url, [fields, headers], ...)   Called on a PoolManager instance to send an HTTP request; returns an HTTPResponse object

🚩 Parameters of urllib3.PoolManager.request()

Parameter   Role
method      Request method, such as POST or GET
headers     Request headers
fields      Query parameters for GET, HEAD, and DELETE requests, or form data for PUT and POST requests
body        Raw request body, for example data in JSON format
timeout     Timeout, either a number of seconds or a urllib3.Timeout(connect, read) instance
retries     Retry policy, which can be given as an integer. Setting it to False disables both redirects and retries. If it is not specified, up to three retries and three redirects are performed by default
redirect    Setting it to False disables redirects but keeps the retry logic
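
As a rough sketch of how these parameters fit together, the example below sends a GET request with fields, headers, timeout, and retries. To stay runnable offline it talks to a small local test server; the EchoHandler class, the /search path, the "q" field, and the "demo-client" User-Agent are assumptions invented for the illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server: echoes the path and User-Agent as JSON."""

    def do_GET(self):
        body = json.dumps({
            "path": self.path,
            "agent": self.headers.get("User-Agent"),
        }).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/search"

http = urllib3.PoolManager()
resp = http.request(
    "GET",
    url,
    fields={"q": "urllib3"},                # encoded into the query string for GET
    headers={"User-Agent": "demo-client"},  # custom request headers
    timeout=urllib3.Timeout(connect=2.0, read=5.0),
    retries=3,                              # allow up to 3 retries
)
echoed = json.loads(resp.data.decode("utf-8"))
print(resp.status, echoed)

server.shutdown()
server.server_close()
```

Because the request is a GET, the fields dictionary ends up in the URL query string, which is why the server sees it as part of the path.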

🚩 Attributes of the HTTPResponse object

Attribute    Role
status       HTTP status code of the response, for example 200 (OK), 403 (Forbidden), or 500 (Internal Server Error)
data         Response body as bytes (for example JSON text or binary content)
headers      Response headers
auto_close   Controls whether the response is closed automatically once it has been fully read; usually set to False when wrapping the response in io.TextIOWrapper
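
The following sketch reads status, headers, and data from a response after posting a JSON body. It again uses a local test server so that it runs offline; the JsonHandler class, the /api path, and the payload contents are assumptions made up for this example.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3


class JsonHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server: wraps the posted JSON and sends it back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"received": payload}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), JsonHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/api"

http = urllib3.PoolManager()
resp = http.request(
    "POST",
    url,
    body=json.dumps({"name": "urllib3"}).encode("utf-8"),  # JSON sent as raw bytes
    headers={"Content-Type": "application/json"},
)

status = resp.status                          # integer status code
content_type = resp.headers["Content-Type"]   # header lookup is case-insensitive
result = json.loads(resp.data.decode("utf-8"))  # data is bytes, so decode first
print(status, content_type, result)

server.shutdown()
server.server_close()
```

Note that data holds raw bytes, so text or JSON has to be decoded explicitly before use.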

urllib3 reports errors through the exception classes in urllib3.exceptions, for example urllib3.exceptions.NewConnectionError, which is raised when a new connection cannot be established.
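
Here is a minimal sketch of catching a connection failure. It assumes that a local port which was just released stays unused for the duration of the example, so the connection attempt is refused; the variable names closed_port and caught are illustrative only.

```python
import socket

import urllib3
from urllib3.exceptions import HTTPError, NewConnectionError

# Ask the OS for a free port, then release it so that nothing is listening
# there (assumption: no other process grabs it in the meantime).
with socket.socket() as probe:
    probe.bind(("127.0.0.1", 0))
    closed_port = probe.getsockname()[1]

http = urllib3.PoolManager()
caught = None
try:
    # retries=False makes urllib3 raise the underlying error immediately
    http.request(
        "GET",
        f"http://127.0.0.1:{closed_port}/",
        retries=False,
        timeout=2.0,
    )
except HTTPError as exc:  # base class of urllib3's own exceptions
    caught = exc
    print("request failed:", type(exc).__name__)
```

Catching the HTTPError base class covers the other errors defined in urllib3.exceptions as well, while isinstance checks can still single out a specific one such as NewConnectionError.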

3. urllib3 vs. urllib

  1. Request method

    • urllib3: requests are sent with urllib3.PoolManager.request(), and the request method is specified explicitly
    • urllib: requests are sent directly with urllib.request.urlopen(); the default is GET, and the request becomes a POST when data is passed
  2. Cookie management

    • urllib3: cookies cannot be added directly; they can only be set through the headers
    • urllib: the opener can be rebuilt with urllib.request.build_opener() and a cookie handler such as urllib.request.HTTPCookieProcessor
  3. Setting a proxy

    • urllib3: simply pass the proxy server address to urllib3.ProxyManager()
    • urllib: create a proxy handler with urllib.request.ProxyHandler() and rebuild the opener with urllib.request.build_opener()
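
The first two differences can be sketched side by side as follows. The local test server, the EchoHandler class, the "x=1" form data, and the "session=abc" cookie are assumptions invented for the illustration. The urllib side goes through an opener built with an empty ProxyHandler only so that proxy settings from the environment are ignored; open() picks GET or POST the same way urlopen() does.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server: echoes the method and the Cookie header."""

    def _reply(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # drain any request body
        text = f"{self.command} cookie={self.headers.get('Cookie')}"
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = do_POST = do_PUT = _reply

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# urllib: the method is implied (GET without data, POST once data is passed)
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(url, timeout=5) as r:
    urllib_get = r.read().decode("utf-8")
with opener.open(url, data=b"x=1", timeout=5) as r:
    urllib_post = r.read().decode("utf-8")

# urllib3: the method is always explicit, and a cookie is just another header
http = urllib3.PoolManager()
resp = http.request("PUT", url, headers={"Cookie": "session=abc"})
urllib3_put = resp.data.decode("utf-8")

print(urllib_get)
print(urllib_post)
print(urllib3_put)

server.shutdown()
server.server_close()
```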

4. A quick test

Now that we have covered the basics of urllib3, let's try it out.

  • Basic operation procedure

    1. Import the urllib3 library
    2. Create a connection pool object with urllib3.PoolManager()
    3. Call the connection pool object's request() method, passing in the request method and the URL
    4. Print the returned information, such as the status code and the response body

import urllib3

# Create the connection pool object
http = urllib3.PoolManager()

# Send a GET request; the request method and the URL are separate arguments
resp = http.request("GET", "https://juejin.cn/user/211521683863847/posts")

# Print the status code and the decoded response body
print("Request status code:", resp.status)
print("response:", resp.data.decode("utf-8"))

Conclusion

In this installment, we looked at the Python network programming library urllib3 and compared it with Python's built-in urllib.

That's all for this installment. Likes and comments are welcome. See you next time!