I. Project Background
Qiongyou.com (qyer.com) provides original, practical outbound travel guides and guidebooks, a travel community and Q&A platform, and intelligent trip-planning tools. It also offers online value-added services such as visas, insurance, air tickets, hotel booking, and car rental. Qyer "encourages and helps Chinese travelers to experience the world in their own perspective and way".
Today we will show you how to scrape city information from Qyer.com with Python and write the data to a CSV file.
II. Project Objectives
Obtain each city's name, image link, and popularity, and batch-save them to a CSV document.
III. Libraries and Website Involved
1. The website is as follows:
https://place.qyer.com/south-korea/citylist-0-0-{}
2. Libraries involved: requests, lxml, fake_useragent, time, csv
IV. Project Analysis
First, you need to figure out how to request the next page url. You can click the button on the next page and observe the changes to the site as shown below:
https://place.qyer.com/south-korea/citylist-0-0-1
https://place.qyer.com/south-korea/citylist-0-0-2
https://place.qyer.com/south-korea/citylist-0-0-3
Only the trailing page number changes, so the URL can be written as citylist-0-0-{} and filled in with the page number.
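Since only the trailing page number varies, the page URLs can be generated with `str.format`. A minimal sketch (the helper name `page_urls` is ours, not from the article):

```python
# The page number is the final path segment of the listing URL,
# so each page can be produced by filling in the {} placeholder.
base_url = "https://place.qyer.com/south-korea/citylist-0-0-{}"

def page_urls(start, end):
    """Return the listing URLs for pages start..end inclusive."""
    return [base_url.format(page) for page in range(start, end + 1)]
```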
V. Project Implementation
1. Define a class that inherits from object, with an __init__ method and a main method, and prepare the URL template.
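Step 1 can be sketched as a class skeleton. The class name `CitySpider` and the attribute name `self.url` are assumptions; the article does not give them:

```python
class CitySpider(object):
    """Skeleton of the crawler class from step 1 (name is assumed)."""

    def __init__(self):
        # URL template with a placeholder for the page number.
        self.url = "https://place.qyer.com/south-korea/citylist-0-0-{}"

    def main(self):
        # Later steps fill this in: request each page, parse it,
        # and append the extracted rows to a CSV file.
        pass
```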
2. Randomly generate a User-Agent.
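The article uses the fake_useragent library for this (`UserAgent().random`); the stdlib stand-in below avoids that extra dependency. The UA strings and the helper name `random_headers` are illustrative assumptions:

```python
import random

# A small pool of desktop User-Agent strings; fake_useragent draws
# from a much larger, regularly updated pool.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
]

def random_headers():
    """Build request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(USER_AGENTS)}
```

Rotating the User-Agent on each request makes the crawler look less like a single automated client.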
3. Multi-page requests.
4. Define a get_page method to request the data.
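A sketch of the get_page method from step 4, using the requests library listed above. The error handling, status check, and timeout are our assumptions, not taken from the article:

```python
import requests

def get_page(url, headers=None, timeout=10):
    """Fetch one listing page and return its HTML, or None on failure."""
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()
        # Let requests guess the encoding from the page content.
        response.encoding = response.apparent_encoding
        return response.text
    except requests.RequestException:
        # Network error or bad HTTP status: signal failure to the caller.
        return None
```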
5. Define a page_page method that parses the data with XPath, and use a for loop to iterate over the resulting array.
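A sketch of the parsing step using lxml. The sample HTML below, the class names (`plcCitylist`, `beento`), and the exact XPath expressions are assumptions; the real markup on place.qyer.com must be checked in Developer Tools:

```python
from lxml import etree

# Illustrative fragment mimicking a city listing; the live page's
# structure and class names may differ.
SAMPLE_HTML = """
<ul class="plcCitylist">
  <li>
    <h3><a href="https://place.qyer.com/seoul/">Seoul</a></h3>
    <p class="beento">120000 people have been here</p>
    <p class="pics"><img src="https://pic.qyer.com/seoul.jpg"/></p>
  </li>
</ul>
"""

def parse_page(html):
    """Extract (city, image link, popularity) tuples from one page."""
    tree = etree.HTML(html)
    rows = []
    for li in tree.xpath('//ul[@class="plcCitylist"]/li'):
        city = li.xpath('.//h3/a/text()')[0].strip()
        image = li.xpath('.//img/@src')[0]
        hot = li.xpath('.//p[@class="beento"]/text()')[0].strip()
        rows.append((city, image, hot))
    return rows
```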
In Chrome, open Developer Tools (press F12, or right-click and choose Inspect) and use XPath to locate the corresponding fields, as shown in the figure below.
6. Define a CSV file to save the data and write the rows to the document.
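Step 6 can be sketched with the stdlib csv module. The column names and the append-with-header behavior are assumptions:

```python
import csv
import os

def save_rows(path, rows):
    """Append scraped rows to a CSV file, writing a header row first
    when the file does not exist yet."""
    new_file = not os.path.exists(path)
    # utf-8-sig lets Excel open files containing Chinese city names.
    with open(path, "a", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["city", "image_link", "popularity"])
        writer.writerows(rows)
```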
7. Call the main method.
8. Use the time module to set a delay between requests.
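Steps 3, 7, and 8 can be wired together in one driver loop. Passing `fetch`, `parse`, and `save` as parameters (instead of calling methods on the class) and the delay range are our simplifications:

```python
import random
import time

def crawl(start, end, fetch, parse, save, delay=(1, 3)):
    """Request each page, parse it, save the rows, then pause
    politely before the next request."""
    url_tpl = "https://place.qyer.com/south-korea/citylist-0-0-{}"
    for page in range(start, end + 1):
        html = fetch(url_tpl.format(page))
        if html:
            save(parse(html))
        # Randomized delay to reduce load on the server.
        time.sleep(random.uniform(*delay))
```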
VI. Effect Display
1. Click Run, then enter the start page and the end page.
2. The console displays a success message for each download.
3. Save the CSV file.
VII. Summary
1. Do not grab too much data; it puts unnecessary load on the server.
2. I hope this project can help you to have a further understanding of CSV document processing.
3. This article uses Python crawler libraries to scrape Qyer.com. There will always be all kinds of problems during implementation; avoid aiming high while practicing little, and get your hands dirty often, so your understanding deepens.
4. You can choose your favorite city according to your own needs to get the effective information you want.
This article is reprinted; copyright belongs to the original author. In case of infringement, contact the editor for removal.
Original address: www.tuicool.com/articles/ze…