
Review

The urllib package contains four modules: request, parse, error and robotparser. The request module is mainly used to open URLs and send HTTP requests.
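As a quick orientation, the following minimal sketch touches each of the four modules without sending anything over the network. The robots.txt rules and the example.com URLs are made-up illustrations, not rules from a real site.

```python
import urllib.error
import urllib.parse
import urllib.request
import urllib.robotparser

# urllib.parse: split a URL into its components
parts = urllib.parse.urlparse("https://juejin.cn/post/7024105564426731551")
print(parts.scheme, parts.netloc, parts.path)

# urllib.request: describe a request (nothing is sent until urlopen() is called)
req = urllib.request.Request("https://juejin.cn/post/7024105564426731551")
print(req.get_method(), req.host)

# urllib.error: HTTPError is a subclass of URLError
print(issubclass(urllib.error.HTTPError, urllib.error.URLError))

# urllib.robotparser: evaluate robots.txt rules (hypothetical rules given as a list of lines)
robots = urllib.robotparser.RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])
print(robots.can_fetch("*", "https://example.com/private/page"))
print(robots.can_fetch("*", "https://example.com/public/page"))
```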

As we know, HTTP defines eight main request methods:

  • GET request: Requests the specified resource (for example, a page)
  • POST request: Submits data to the server for processing and returns the result
  • HEAD request: Retrieves only the response headers, without the body
  • PUT request: Uploads a new resource to the server
  • TRACE request: Echoes back the request received by the server; mainly used for testing and diagnosis
  • CONNECT request: Reserved in HTTP/1.1 for proxy servers that can switch the connection to a tunnel
  • OPTIONS request: Lets the client query which request methods and capabilities the server supports
  • DELETE request: Asks the server to delete the specified resource

Let's get started with urllib.request~

1. Basic network request

We can directly use the urlopen(url, data=None) function provided by the urllib.request module

The urlopen function opens the URL resource

  • Given an HTTP or HTTPS URL, it returns an http.client.HTTPResponse object

There are two ways to read the resource from the returned http.client.HTTPResponse object

  • Call the read() method directly
  • Read it with the help of a with context manager

PS: Since the returned data is bytes, it must be decoded (for example as UTF-8) into text after reading

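import urllib.request

url = "https://juejin.cn/post/7024105564426731551"

# Method 1: open the URL resource and call read() directly
resp = urllib.request.urlopen(url)
# Read the first 300 bytes of the resource
data = resp.read(300)
# Decode the bytes and print them; errors="ignore" skips a multi-byte
# character that may be cut off at the 300-byte boundary
print(data.decode("utf-8", errors="ignore"))
resp.close()

# Method 2: read the resource inside a with context manager,
# which closes the response automatically
with urllib.request.urlopen(url) as f:
    print(f.read(300).decode("utf-8", errors="ignore"))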
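To try the read-then-decode step without touching the network, here is a minimal sketch that opens a data: URL, which urlopen() also accepts. The sample text is an arbitrary assumption. With a data: URL the returned object is not an http.client.HTTPResponse, but it offers the same read() method, a headers attribute and with support.

```python
import urllib.parse
import urllib.request

# Arbitrary sample text, percent-encoded as UTF-8 and embedded in a data: URL
sample = "urllib 你好"
url = "data:text/plain;charset=utf-8," + urllib.parse.quote(sample)

with urllib.request.urlopen(url) as resp:
    raw = resp.read()  # always bytes
    # Use the charset declared in the Content-Type header, falling back to UTF-8
    charset = resp.headers.get_content_charset() or "utf-8"

text = raw.decode(charset)
print(type(raw).__name__, charset)
print(text)
```

Reading the charset from the headers instead of hard-coding "utf-8" is one possible design choice; it helps when a server answers in a different encoding.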


2. GET request

The GET request is one of the most common request methods in the HTTP protocol. Its distinguishing feature is that all the request data is carried in the URL

In the urllib.request module, a GET request can be simulated as follows

Use the urlopen() function from the urllib.request module

  • urlopen() sends a GET request by default
  • To send GET parameters with urlopen(), do not use the data parameter; simply append the request parameters to the URL

import urllib.request
import urllib.parse

# Request parameters, encoded as a query string
param = urllib.parse.urlencode({"bid": "juejin_web"})
url = "https://i.snssdk.com/slardar/sdk_setting?%s" % param
# Open the URL resource
resp = urllib.request.urlopen(url)

# Read the first 300 bytes of the resource
data = resp.read(300)
# Decode the bytes and print them
print(data.decode("utf-8"))
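The query-string step can be checked on its own, without sending a request. This small sketch shows how urlencode() escapes spaces and reserved characters and how parse_qs() reverses it; the keyword and tag parameters are hypothetical additions for illustration only.

```python
import urllib.parse

# "bid" comes from the example above; "keyword" and "tag" are hypothetical
params = {"bid": "juejin_web", "keyword": "urllib request", "tag": "a&b"}

# urlencode() turns spaces into "+" and percent-encodes reserved characters
query = urllib.parse.urlencode(params)
url = "https://i.snssdk.com/slardar/sdk_setting?" + query
print(query)

# parse_qs() decodes the query string back into a dict of lists
decoded = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
print(decoded)
```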

3. POST request

POST requests typically place the request data in the body, so the URL does not show the submitted data

  • Method 1: Use the urlopen() function for POST requests

    • urlopen() needs the data parameter, which must be bytes
    • The first argument to urlopen() can be either a URL string or a urllib.request.Request object

import urllib.request
import urllib.parse
import json


# Request data, URL-encoded and then converted to bytes
param = urllib.parse.urlencode({
    "id_type": 2, "client_type": 2608, "sort_type": 2, "cursor": "0", "limit": 2
})
param = param.encode("utf-8")
# Open the URL resource; passing data makes this a POST request
resp = urllib.request.urlopen("https://api.juejin.cn/recommend_api/v1/article/recommend_all_feed?aid=2608&uuid=6977760184626628096", data=param)

# Read the entire response body
data = resp.read()
# Decode the bytes, parse them as JSON and print the result
print(json.loads(data.decode("utf-8")))

  • Method 2: Build a urllib.request.Request object and pass it to urlopen()

import urllib.request
import urllib.parse
import json


# Request data, URL-encoded and then converted to bytes
param = urllib.parse.urlencode({
    "id_type": 2, "client_type": 2608, "sort_type": 2, "cursor": "0", "limit": 2
})
param = param.encode("utf-8")
url = "https://api.juejin.cn/recommend_api/v1/article/recommend_all_feed?aid=2608&uuid=6977760184626628096"
# Build the Request object, then open it
req = urllib.request.Request(url=url, data=param)
resp = urllib.request.urlopen(req)

# Read the entire response body
data = resp.read()
# Print the status code and reason phrase
print(resp.status)
print(resp.reason)
# Decode the bytes, parse them as JSON and print the result
print(json.loads(data.decode("utf-8")))
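How the presence of data decides between GET and POST can be inspected without sending anything. The sketch below only builds Request objects and reads them back; the form fields are a shortened subset of the ones used above.

```python
import urllib.parse
import urllib.request

url = "https://api.juejin.cn/recommend_api/v1/article/recommend_all_feed?aid=2608&uuid=6977760184626628096"

# The body must be bytes: URL-encode the form first, then encode the string
form = {"id_type": 2, "sort_type": 2, "cursor": "0", "limit": 2}
body = urllib.parse.urlencode(form).encode("utf-8")

get_req = urllib.request.Request(url)              # no data, so the method is GET
post_req = urllib.request.Request(url, data=body)  # data given, so the method is POST

print(get_req.get_method())
print(post_req.get_method())
print(post_req.data)
```

Either object could then be passed to urlopen() as its first argument, just like a plain URL string.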

4. Other requests

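For other request methods, pass the method parameter to urllib.request.Request() to specify the request method directly

import urllib.request

# Data to send with the update; the body must be bytes
param = "Data requested for update".encode("utf-8")
# Placeholder request address; replace it with a real one
url = "https://example.com/resource"
# method takes a single method name such as "PUT", "DELETE" or "CONNECT"
req = urllib.request.Request(url=url, data=param, method="PUT")

  • PUT request: Generally used to update site content
  • DELETE request: Usually used to delete a specified resource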
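To see the method parameter take effect end to end, here is a minimal sketch that starts a throwaway local server and sends it a PUT and a DELETE request. EchoHandler, the /resource path and the reply format are all hypothetical and exist only for this demo; a real server decides for itself which methods it accepts.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: replies with b"<METHOD>:<request body>"."""

    def _echo(self):
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else b""
        payload = self.command.encode("ascii") + b":" + body
        self.send_response(200)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_PUT = _echo
    do_DELETE = _echo

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/resource" % server.server_port

# Ignore any proxy configured in the environment for this local request
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def send(method, data=None):
    """Send one request with the given method and return (status, body text)."""
    req = urllib.request.Request(url=url, data=data, method=method)
    with opener.open(req, timeout=5) as resp:
        return resp.status, resp.read().decode("utf-8")


put_result = send("PUT", "Data requested for update".encode("utf-8"))
delete_result = send("DELETE")
print(put_result)
print(delete_result)

server.shutdown()
server.server_close()
```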

Conclusion

In this installment, we learned and practiced using urllib.request to send various HTTP requests.

That's all for this installment. Please leave a like and a comment. See you next time~