I am taking part in the Mid-Autumn Festival Creative Submission Contest; see the contest page for details.
Data acquisition target
Website: Meituan Hotel
Results
Please give me a thumbs up if you think it's OK ~ thanks to everyone who reads this ❤
Tools used
Development tool: PyCharm
Development environment: Python 3.7, Windows 10
Library: requests
Project approach
First, pick the place you want to travel to and collect data on what is around it. For example, if you plan to visit Changsha over the Mid-Autumn Festival and National Day holidays, the data you want is the hotels in Changsha.
Once you have found the data on the page, determine whether it is loaded statically or dynamically. The hotel list on this page is loaded dynamically, so the request that returns it has to be found through packet capture.
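One quick way to make the static-versus-dynamic check is to look in the raw HTML for a value you can see on the rendered page, such as a hotel name. The sketch below assumes you already have the HTML as a string; the function name `appears_in_raw_html` and the sample strings are made up for illustration.

```python
def appears_in_raw_html(raw_html: str, visible_text: str) -> bool:
    """Return True if text seen in the browser is present in the raw HTML.

    True suggests the data is static (rendered by the server); False
    suggests it is loaded later by a separate request you need to capture.
    """
    return visible_text in raw_html


# Hypothetical page sources, for illustration only.
static_page = "<ul><li class='hotel'>Example Hotel</li></ul>"
dynamic_page = "<div id='app'></div><script src='bundle.js'></script>"

print(appears_in_raw_html(static_page, "Example Hotel"))   # True
print(appears_in_raw_html(dynamic_page, "Example Hotel"))  # False
```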
Once you have identified the request that returns the data, you can start writing code and send a network request to its URL.
    import requests

    # Page index; the offset parameter advances by 20 per page.
    i = 0
    url = (
        'https://ihotel.meituan.com/hbsearch/HotelSearch?utm_medium=pc&version_name=999.9'
        '&cateId=20&attr_28=129'
        '&uuid=73B01E7C1051ACFB5730B1C1CD456776945CB50F887E09211F020EB1F6C89996%401631623416857'
        '&cityId=198&offset={}&limit=20&startDay=20210914&endDay=20210914&q=&sort=defaults'
        '&X-FOR-WITH=4ArIbixslGNtC2oCAtBb2cXDcp3jPZ5xX01C1%2FuNnCXQqh0Edqc3Dkag7qJcicwPAPLY%2F'
        'ljJLm6wQMNIxLvp9b%2BYD0zfZCkSVgXL0zJuhuGKUZIaSNcfRtkjSISQqvXCOBdIJU9o2Kiz1YxsEqKX%2B'
        'lSNhmge6otjb%2B%2FQSr5lMWEicjgDCcQNg0jLrkAO1WXcFHMYZO40i6QdyAWbmxLV6TnJetfiLBxM0oQEvvcnOyA%3D'
    ).format(i * 20)

    # The uuid, token and cookie values come from a captured browser session;
    # replace them with the ones from your own packet capture.
    headers = {
        'Host': 'ihotel.meituan.com',
        'Origin': 'https://hotel.meituan.com',
        'Referer': 'https://hotel.meituan.com/',
        'Cookie': (
            'uuid=3c6a1ffa63c44f609095.1631265937.1.0.0; '
            '_lxsdk_cuid=178d04766dac8-08045ec98b749f-3f356b-1fa400-178d04766dbc8; '
            'mtcdn=K; IJSESSIONID=node01wdulauewgz8610kyb50ik2ee51588346; '
            'iuuid=73B01E7C1051ACFB5730B1C1CD456776945CB50F887E09211F020EB1F6C89996; '
            '_lxsdk=73B01E7C1051ACFB5730B1C1CD456776945CB50F887E09211F020EB1F6C89996; '
            'backurl=http://i.meituan.com/awp/h5/hotel/list/list.html?cityId=96'
            '&accommodationType=1&checkIn=2021-09-14&checkOut=2021-09-15; '
            'i_extend=Gempty; ci=1; rvct=1%2C96; cityname=%E5%8C%97%E4%BA%AC; '
            '_lx_utm=utm_source%3DBaidu%26utm_medium%3Dorganic; '
            '_lxsdk_s=17be4558492-2b-59D-9d2%7C%7C2'
        ),
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36',
    }

    response = requests.get(url, headers=headers)
    print(response.text)
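The `offset` value in the URL advances by `limit` (20) for each page, which is what `.format(i * 20)` computes. A minimal sketch of that paging arithmetic follows. `page_query` is a hypothetical helper, only a few of the query parameters from the URL above are included, and the standard library's `urlencode` stands in for string formatting.

```python
from urllib.parse import urlencode

BASE_URL = "https://ihotel.meituan.com/hbsearch/HotelSearch"


def page_query(page: int, limit: int = 20) -> str:
    """Build the query string for a zero-based page index."""
    params = {
        "cityId": 198,            # taken from the URL above
        "offset": page * limit,   # page 0 -> 0, page 1 -> 20, ...
        "limit": limit,
        "startDay": "20210914",
        "endDay": "20210914",
    }
    return urlencode(params)


# Print the (partial) URLs for the first three pages.
for page in range(3):
    print(f"{BASE_URL}?{page_query(page)}")
```

With requests, a dict like this can be passed as `params=` to `requests.get` instead of building the string by hand. A real request would also need the remaining parameters (uuid, X-FOR-WITH and so on) captured from your own session.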
Remember to send the matching request headers when fetching the JSON data.
Then pick the fields you want from the data.
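If the headers are missing or rejected, the server may answer with something other than the expected JSON, so it can help to parse defensively. This is a small sketch rather than the code used in this post: `parse_json_body` is a hypothetical helper that takes the response text (for example `response.text`) and returns `None` when it is not valid JSON.

```python
import json
from typing import Any, Optional


def parse_json_body(body: str) -> Optional[Any]:
    """Parse a response body as JSON; return None if it is not JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


# Hypothetical bodies, for illustration only.
print(parse_json_body('{"status": 0, "data": []}'))
print(parse_json_body("<html>Access denied</html>"))
```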
    # `data` is one hotel record from the parsed JSON response.
    item = {}
    item['historySaleCount'] = data['historySaleCount']
    item['cityName'] = data['cityName']
    item['areaName'] = data['areaName']
    item['name'] = data['name']
    item['scoreIntro'] = data['scoreIntro']
    # item['positionDescList'] = [i['text'] for i in data['forward'].get('positionDescList', [])]
    item['poiTagList'] = [i['text'] for i in data['poiTagList']]
    print(item)
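Indexing with `data['key']` raises `KeyError` if a record lacks a field. A more forgiving variant reads the fields with `dict.get`; it is sketched here on the assumption that each record is a plain dict with the keys used above. The helper name `extract_item` and the sample record are made up, and real records may be shaped differently.

```python
FIELDS = ["historySaleCount", "cityName", "areaName", "name", "scoreIntro"]


def extract_item(data: dict) -> dict:
    """Copy the wanted fields out of one hotel record, tolerating gaps."""
    # Missing fields become None instead of raising KeyError.
    item = {field: data.get(field) for field in FIELDS}
    # Each tag is assumed to be a dict with a 'text' key, as in the code above.
    item["poiTagList"] = [tag["text"] for tag in data.get("poiTagList") or []]
    return item


# Hypothetical record, for illustration only.
sample = {
    "name": "Example Hotel",
    "cityName": "Changsha",
    "poiTagList": [{"text": "Free Wi-Fi"}, {"text": "Parking"}],
}
print(extract_item(sample))
```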
Which fields you extract is up to you; the data is then saved in CSV format.
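The CSV step is not shown in this post; one possible way to do it with the standard library's `csv.DictWriter` is sketched here. The file name `hotels.csv`, the helper `save_items` and the sample row are assumptions for illustration, and the tag list is joined into a single cell with `|`.

```python
import csv

FIELDNAMES = ["name", "cityName", "areaName", "scoreIntro",
              "historySaleCount", "poiTagList"]


def save_items(items, path="hotels.csv"):
    """Write a list of item dicts to a CSV file with a header row."""
    # utf-8-sig helps Excel on Windows detect the encoding of Chinese text.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, extrasaction="ignore")
        writer.writeheader()
        for item in items:
            row = dict(item)
            # Flatten the list of tags into one cell.
            row["poiTagList"] = "|".join(item.get("poiTagList", []))
            writer.writerow(row)


# Hypothetical row, for illustration only.
save_items([
    {"name": "Example Hotel", "cityName": "Changsha", "areaName": "Example Area",
     "scoreIntro": "Good", "historySaleCount": "100", "poiTagList": ["Parking"]},
])
```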
That’s it. Where do you want to go? But wherever you go, wash your hands and wear a mask!
I'm White and White i, a programmer who loves sharing knowledge ❤️ If you are new to this area of programming and, after reading this post, find you can't get it working or want to learn more, leave a comment or send me a private message ~ [Many thanks for your likes, favorites, follows and comments!]