Install the network request module: Requests
pip install requests
Does that ring a bell? It's just like installing a package in NodeJS.
Simple test:
Import the Requests module:
import requests
GET request:
response = requests.get("https://www.baidu.com")
print(response)
Results:
The request succeeded. We can inspect what the response object contains in the editor:
Printing response.text:
This is the Baidu home page content, but it is garbled. Don't worry, just add this step:
response = requests.get("https://www.baidu.com")
response.encoding = response.apparent_encoding
print(response.text)
OK! Simple, isn't it?
Of course, Requests supports not only GET but also POST, PUT, DELETE, and so on:
headers, params, and other options are supported for each of these request methods:
requests.get("https://www.baidu.com", headers=headers, params=params)
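To see how headers and params actually attach to a request without hitting the network, we can build a request and prepare it instead of sending it. This is a minimal sketch; the search path and the User-Agent string are made up for illustration:

```python
import requests

# Build a request without sending it, to inspect how headers
# and params are attached.
req = requests.Request(
    "GET",
    "https://www.baidu.com/s",
    headers={"User-Agent": "my-spider/1.0"},  # custom request header
    params={"wd": "python"},                  # query-string parameter
)
prepared = req.prepare()

print(prepared.url)                    # params are encoded into the URL
print(prepared.headers["User-Agent"])  # headers travel with the request
```

When you actually want to send it, requests.get(url, headers=..., params=...) does all of this in one call.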
This is essential to the network request framework!
Let's take the 360 image API as an example, with a paged request: wallpaper.apc.360.cn/index.php?c…
Of course, we can request the link directly via GET, or we can request it via POST and pass in parameters:
params = {
    'c': 'WallPaperAndroid',
    'a': 'getAppsByCategory',
    'cid': 9,
    'start': 0,
    'count': 10
}
response = requests.post("http://wallpaper.apc.360.cn/index.php", params=params)
print(response.text)
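Note the difference between params and data on a POST: params still go into the query string, while data becomes the form body. A sketch using the prepare() trick again, so nothing is actually sent (the parameter values below just echo the snippet above):

```python
import requests

# With POST, `params` end up in the URL's query string, while
# `data` is form-encoded into the request body.
req = requests.Request(
    "POST",
    "http://wallpaper.apc.360.cn/index.php",
    params={"c": "WallPaperAndroid", "a": "getAppsByCategory"},
    data={"cid": 9},
)
prepared = req.prepare()

print(prepared.url)   # query string carries the params
print(prepared.body)  # form-encoded body carries the data
```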
Request result (JSON format):
Parsing json:
import json

json_data = json.loads(response.text)
print('errno=%s,errmsg=%s' % (json_data['errno'], json_data['errmsg']))
data_list = json_data['data']  # avoid shadowing the built-in `list`
print("count=" + str(len(data_list)))
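Since the real call needs a network request, here is an offline sketch of the same parsing step, using a made-up payload shaped like the response above (the field names match the snippet; the values are illustrative only). Requests also offers a response.json() shortcut that does the json.loads step for you:

```python
import json

# Made-up payload shaped like the 360 API response above.
raw = '{"errno": "0", "errmsg": "ok", "data": [{"id": 1}, {"id": 2}]}'

json_data = json.loads(raw)
print('errno=%s,errmsg=%s' % (json_data['errno'], json_data['errmsg']))
wallpapers = json_data['data']          # the list of wallpaper entries
print("count=" + str(len(wallpapers)))
```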
Results:
Print log (string + concatenation)
OK, that's JSON parsing done. What if I want to parse a web page? Back when I parsed web pages in Java, I used a tool called Jsoup. I'm sure many of you have used it: it parses the document like XML, giving you nodes, elements, and so on…
Python also has a similarly powerful web page parsing tool: BeautifulSoup. (Note: Python ships with built-in XML SAX and DOM parsers, but you need to know how to use them!)
BeautifulSoup usage documentation
Advantages and disadvantages of BeautifulSoup:
Among the parsers it supports, BeautifulSoup(markup, "lxml") (the second one listed) is what we usually use when parsing web data.
Install BeautifulSoup: pip install bs4
Simple test, take Baidu home page as an example:
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.baidu.com")
response.encoding = response.apparent_encoding
print(response.text)
soup = BeautifulSoup(response.text, "lxml")
title = soup.find(name='title').text  # same as: soup.find('title').text
print(title)
Execution error:
Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
Solution:
Install virtualenv: pip install virtualenv
pip install lxml
Running the script again, we get:
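To get a feel for find and find_all without depending on Baidu's markup, here is a small offline sketch on a made-up HTML snippet (the tags, classes, and text below are invented for illustration):

```python
from bs4 import BeautifulSoup

# A tiny HTML snippet to practice on (made up for illustration).
html = """
<html><head><title>Demo Page</title></head>
<body>
  <a class="link" href="/a">First</a>
  <a class="link" href="/b">Second</a>
</body></html>
"""

soup = BeautifulSoup(html, "lxml")
print(soup.find("title").text)               # the page title text
for a in soup.find_all("a", class_="link"):  # every matching <a> tag
    print(a["href"], a.text)                 # attribute access + node text
```

find returns the first matching node, find_all returns them all; attributes are read with dictionary-style indexing.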