Install the network request module: Requests
pip install requests
Does that ring a bell? It's just like installing a package in NodeJS.
Simple test:
Import the Requests module:
import requests
GET request:
response = requests.get("https://www.baidu.com")
print(response)
Results:
The request succeeded. We can inspect what the response object contains in the editor:
Printing response.text:
This is the Baidu home page content, but it is garbled. Don't worry, just add this step:
response = requests.get("https://www.baidu.com")
response.encoding = response.apparent_encoding
print(response.text)
OK! Simple, isn't it?
Of course, Requests supports not only GET but also POST, PUT, DELETE, and so on:
headers, params, and other options are supported for each of these request methods:
requests.get("https://www.baidu.com", headers=headers, params=params)
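To see how headers and params actually attach to a request without hitting the network, we can build a request and prepare it instead of sending it. This is a minimal sketch; the search path and the User-Agent string are made up for illustration:

```python
import requests

# Build a request without sending it, to inspect how headers
# and params are attached.
req = requests.Request(
    "GET",
    "https://www.baidu.com/s",
    headers={"User-Agent": "my-spider/1.0"},  # custom request header
    params={"wd": "python"},                  # query-string parameter
)
prepared = req.prepare()

print(prepared.url)                    # params are encoded into the URL
print(prepared.headers["User-Agent"])  # headers travel with the request
```

When you actually want to send it, requests.get(url, headers=..., params=...) does all of this in one call.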
This is essential to the network request framework!
Let's take the 360 image API as an example, with a paged request: wallpaper.apc.360.cn/index.php?c…
Of course, we can request the link directly via GET, or we can request it via POST and pass in parameters:
params = {
    'c': 'WallPaperAndroid',
    'a': 'getAppsByCategory',
    'cid': 9,
    'start': 0,
    'count': 10
}
response = requests.post("http://wallpaper.apc.360.cn/index.php", params=params)
print(response.text)
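Note the difference between params and data on a POST: params still go into the query string, while data becomes the form body. A sketch using the prepare() trick again, so nothing is actually sent (the parameter values below just echo the snippet above):

```python
import requests

# With POST, `params` end up in the URL's query string, while
# `data` is form-encoded into the request body.
req = requests.Request(
    "POST",
    "http://wallpaper.apc.360.cn/index.php",
    params={"c": "WallPaperAndroid", "a": "getAppsByCategory"},
    data={"cid": 9},
)
prepared = req.prepare()

print(prepared.url)   # query string carries the params
print(prepared.body)  # form-encoded body carries the data
```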
Request result (JSON format):
Parsing json:
import json

json_data = json.loads(response.text)
print('errno=%s,errmsg=%s' % (json_data['errno'], json_data['errmsg']))
data_list = json_data['data']  # avoid shadowing the built-in `list`
print("count=" + str(len(data_list)))
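Since the real call needs a network request, here is an offline sketch of the same parsing step, using a made-up payload shaped like the response above (the field names match the snippet; the values are illustrative only). Requests also offers a response.json() shortcut that does the json.loads step for you:

```python
import json

# Made-up payload shaped like the 360 API response above.
raw = '{"errno": "0", "errmsg": "ok", "data": [{"id": 1}, {"id": 2}]}'

json_data = json.loads(raw)
print('errno=%s,errmsg=%s' % (json_data['errno'], json_data['errmsg']))
wallpapers = json_data['data']          # the list of wallpaper entries
print("count=" + str(len(wallpapers)))
```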
Results:
Print log (string + concatenation)
OK, that's JSON parsing done. What if I want to parse a web page? Back when I parsed web pages in Java, I used a tool called Jsoup. I'm sure many of you have used it: it parses the document like XML, giving you nodes, elements, and so on…
Python also has a similarly powerful web page parsing tool: BeautifulSoup. (Note: Python ships with built-in XML SAX and DOM parsers, but you need to know how to use them!)
BeautifulSoup usage documentation
Advantages and disadvantages of BeautifulSoup:
Among the parsers it supports, BeautifulSoup(markup, "lxml") (the second one listed) is what we usually use when parsing web data.
Install BeautifulSoup: pip install bs4
Simple test, take Baidu home page as an example:
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.baidu.com")
response.encoding = response.apparent_encoding
print(response.text)
soup = BeautifulSoup(response.text, "lxml")
title = soup.find(name='title').text  # same as: soup.find('title').text
print(title)
Execution error:
Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
Solution:
Install virtualenv: pip install virtualenv
pip install lxml
Running the script again, we get:
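To get a feel for find and find_all without depending on Baidu's markup, here is a small offline sketch on a made-up HTML snippet (the tags, classes, and text below are invented for illustration):

```python
from bs4 import BeautifulSoup

# A tiny HTML snippet to practice on (made up for illustration).
html = """
<html><head><title>Demo Page</title></head>
<body>
  <a class="link" href="/a">First</a>
  <a class="link" href="/b">Second</a>
</body></html>
"""

soup = BeautifulSoup(html, "lxml")
print(soup.find("title").text)               # the page title text
for a in soup.find_all("a", class_="link"):  # every matching <a> tag
    print(a["href"], a.text)                 # attribute access + node text
```

find returns the first matching node, find_all returns them all; attributes are read with dictionary-style indexing.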