On the afternoon of May 20th, Beike (Shell House) announced that Mr. Zuo Hui, the company's founder and chairman, had died of lung cancer on May 20th, 2021. Lung cancer is one of the deadliest threats to people's health, so we all need to take good care of our bodies.

We all know he was a real estate brokerage magnate; Beike and Lianjia (Homelink) are everywhere in our daily lives. I remember the company once had a project to collect all kinds of second-hand housing data, including from Beike and Homelink. Today we will use Python to collect some of that data and share it with you; it may also be useful for people who want to buy a second-hand home.

A good way to keep a crawler running is to use proxies. In the past, many people fetched proxy IPs from an API and then had to manage the IP pool themselves, which wastes time for anyone who needs crawled data on a tight schedule. Recently we found a new approach, the dynamic forwarding mode of a crawler proxy, which I was introduced to through Yiniuyun (16yun). After using it for a while, it feels much faster than the old way, so I'm sharing it here for everyone to try.
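To see why the old way is tedious, here is a minimal sketch of a self-managed IP pool (the class, method names, and sample addresses are all illustrative, not from any real provider): you must rotate IPs yourself, discard banned ones, and re-fetch the whole batch when it expires. The dynamic forwarding mode hides all of this behind a single gateway endpoint.

```python
import random
import time

class ProxyPool:
    """A minimal self-managed proxy pool (illustrative sketch)."""

    def __init__(self, proxies, ttl_seconds=300):
        # proxies: list of "host:port" strings fetched from a provider API
        self.ttl = ttl_seconds
        self.refreshed_at = time.time()
        self.proxies = list(proxies)

    def expired(self):
        # The whole batch must be re-fetched once its lease runs out
        return time.time() - self.refreshed_at > self.ttl

    def pick(self):
        # Rotate randomly so requests don't hammer a single exit IP
        return random.choice(self.proxies)

    def discard(self, proxy):
        # Drop an IP that failed or got banned by the target site
        if proxy in self.proxies:
            self.proxies.remove(proxy)

pool = ProxyPool(["1.2.3.4:8080", "5.6.7.8:8080"])
print(pool.pick() in pool.proxies)
```

Every one of these chores disappears with dynamic forwarding: you always connect to one fixed gateway, and the provider swaps the exit IP behind it.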

Next, let's share a complete code sample for collecting Lianjia data:

```python
#! -*- encoding: utf-8 -*-
import requests
import random

# Target page
targetUrl = "https://www.lianjia.com"
# Target HTTPS page to test the proxy
# targetUrl = "https://httpbin.org/ip"

# Proxy server (product website www.16yun.cn)
proxyHost = "t.16yun.cn"
proxyPort = "port"      # fill in the port assigned by the proxy provider
proxyUser = "username"
proxyPass = "password"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

# Route both HTTP and HTTPS traffic through the HTTP proxy
proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

# A random Proxy-Tunnel value asks the gateway to assign a new exit IP
tunnel = random.randint(1, 10000)
headers = {"Proxy-Tunnel": str(tunnel)}

resp = requests.get(targetUrl, proxies=proxies, headers=headers)
print(resp.status_code)
print(resp.text)
```
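To collect more than the front page, you can walk through the second-hand listing pages and send a fresh Proxy-Tunnel value with each request so every page is fetched from a different exit IP. A short sketch follows; note that the `/ershoufang/pgN/` pagination pattern is an assumption about Lianjia's URL scheme and should be verified against the live site before relying on it.

```python
import random

# Assumed base path for Lianjia's second-hand listings — verify on the live site
BASE = "https://www.lianjia.com/ershoufang"

def listing_page_url(page):
    # Assumed pattern: page 1 is /ershoufang/, later pages are /ershoufang/pg2/, pg3/, ...
    return BASE + "/" if page == 1 else "%s/pg%d/" % (BASE, page)

def tunnel_headers():
    # A fresh random Proxy-Tunnel per request asks the gateway
    # to switch the exit IP, so each page comes from a new address
    return {"Proxy-Tunnel": str(random.randint(1, 10000))}

for page in range(1, 4):
    url = listing_page_url(page)
    headers = tunnel_headers()
    # Feed url and headers into requests.get(url, proxies=proxies, headers=headers)
    print(page, url, headers["Proxy-Tunnel"])
```

Each `url`/`headers` pair plugs directly into the `requests.get` call from the sample above, reusing the same `proxies` dict.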