Crawler Practical tutorial

It is better to teach a man to fish than to give him fish

There are thousands of crawler tutorials, but I always feel that the ones on the market rarely teach the essentials. In this installment, I will walk through logging in by scanning a QR code locally and obtaining the resulting Session.

Hands-on practice

Preparation

Our goal is to run a QQ Music QR code login locally. That is, save the login QR code to the local disk, pop it up for scanning, delete it once the login succeeds, and keep the login information.

We first write three functions: one to display the QR code, one to delete it, and one to save it.

Write the code

import os
import sys
import subprocess


def showImage(img_path):
    """Open the QR code image with the platform's default viewer."""
    try:
        if sys.platform.find('darwin') >= 0:
            subprocess.call(['open', img_path])
        elif sys.platform.find('linux') >= 0:
            subprocess.call(['xdg-open', img_path])
        else:
            os.startfile(img_path)
    except Exception:
        # Fall back to Pillow if no system viewer is available.
        from PIL import Image
        img = Image.open(img_path)
        img.show()
        img.close()


def removeImage(img_path):
    """When verification is complete, close the viewer and remove the image."""
    if sys.platform.find('darwin') >= 0:
        os.system("osascript -e 'quit app \"Preview\"'")
    os.remove(img_path)


def saveImage(img, img_path):
    """Write the raw image bytes to img_path, replacing any existing file."""
    if os.path.isfile(img_path):
        os.remove(img_path)
    with open(img_path, 'wb') as fp:
        fp.write(img)

Grab the QR code download link

After entering QQ space, press F12 to open the developer tools and click the login button to bring up the login box.

Let's get the image information first: click the Img filter and scroll down to find the URL of the QR code.

Click Headers to see what the request for the image looks like:

  • First, it is a GET request (see Request Method)

  • Second, the URL is ssl.ptlogin2.qq.com/ptqrshow (question mark…

Take a look at the parameters the QR code URL requires:

  • appid: 716027609

  • e: 2

  • l: M

  • s: 3

  • d: 72

  • v: 4

  • t: 0.07644951044008197

  • daid: 383

  • pt_3rd_aid: 100497308

To make sure this is right, we refresh several times and compare:

  • appid: 716027609

  • e: 2

  • l: M

  • s: 3

  • d: 72

  • v: 4

  • t: 0.7970151752745949

  • daid: 383

  • pt_3rd_aid: 100497308

Only the t parameter changes between requests, so let's check whether the URL still works with an arbitrary t. Open Postman, create a request with the URL and these params, and it returns a valid QR code.

So we assume that t is not an encrypted parameter but simply a random number between 0 and 1. In Python, that is random.random().
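As a quick check of that assumption, here is a minimal sketch that builds the ptqrshow request URL with a fresh random t on every call, without touching the network. The helper name build_qr_params is hypothetical; the parameter values are the ones captured above.

```python
import random
from urllib.parse import urlencode

# Endpoint captured in the Network panel.
PTQRSHOW_URL = 'https://ssl.ptlogin2.qq.com/ptqrshow'


def build_qr_params():
    """Return the QR code query parameters with a fresh random t."""
    return {
        'appid': '716027609',
        'e': '2',
        'l': 'M',
        's': '3',
        'd': '72',
        'v': '4',
        # Assumption: t is just a random float in [0, 1), not a signed value.
        't': str(random.random()),
        'daid': '383',
        'pt_3rd_aid': '100497308',
    }


if __name__ == '__main__':
    # Two calls differ only in t, just like two refreshes in the browser.
    print(PTQRSHOW_URL + '?' + urlencode(build_qr_params()))
    print(PTQRSHOW_URL + '?' + urlencode(build_qr_params()))
```

Pasting either printed URL into Postman is one way to confirm that the server accepts any t in that range.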

Write the code

# Pseudocode: these lines run inside the login class shown in the complete code.
self.cur_path = os.getcwd()
params = {
    'appid': '716027609',
    'e': '2',
    'l': 'M',
    's': '3',
    'd': '72',
    'v': '4',
    't': str(random.random()),
    'daid': '383',
    'pt_3rd_aid': '100497308',
}
response = self.session.get(self.ptqrshow_url, params=params)
saveImage(response.content, os.path.join(self.cur_path, 'qrcode.jpg'))
showImage(os.path.join(self.cur_path, 'qrcode.jpg'))

Preparing to log in and capture packets

To avoid wading through too many requests, clear the ones captured so far and switch back to the All tab.

Now log in, but keep an eye on the capture state: a successful login triggers a 302 redirect, and if capturing is not stopped in time, the captured requests are cleared after the redirect.

First we need to understand what the two red buttons do:

  • The button at the upper left controls packet capture. When it is gray, the browser has stopped capturing and the requests already captured are not cleared.

  • The second control changes the browser's network speed. If requests go by too fast to stop the capture in time, throttle the connection to Slow 3G so that there is time to click stop. (This is for people with slow hands, like me.)

We stop the capture and go through the login requests one by one to find the main one needed for login. The login relies on just one request, to the URL ssl.ptlogin2.qq.com/ptqrlogin, with the parameters…

  • u1: graph.qq.com/oauth2.0/lo…

  • ptqrtoken: 1506487176

  • ptredirect: 0

  • h: 1

  • t: 1

  • g: 1

  • from_ui: 1

  • ptlang: 2052

  • action: 1-0-1607136616096

  • js_ver: 20102616

  • js_type: 1

  • login_sig:

  • pt_uistyle: 40

  • aid: 716027609

  • daid: 383

  • pt_3rd_aid: 100497308

After several visits, we find that ptqrtoken, action, and login_sig are the parameters that change. The third field of action looks like a multiple of a timestamp: paste it into any online timestamp converter and you will find it is the Unix timestamp multiplied by 1000, that is, the time in milliseconds. In Python, the action parameter is written as 'action': '0-0-%s' % int(time.time() * 1000)
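The timestamp reading can be checked locally instead of with an online converter. The sketch below builds an action value the same way and decodes the third field of the captured one; the helper names make_action and action_timestamp are hypothetical.

```python
import time
from datetime import datetime, timezone


def make_action(now=None):
    """Build the action parameter: '0-0-' plus the time in milliseconds."""
    if now is None:
        now = time.time()
    return '0-0-%s' % int(now * 1000)


def action_timestamp(action):
    """Convert the third field of an action value back to a UTC datetime."""
    millis = int(action.split('-')[2])
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


if __name__ == '__main__':
    print(make_action())
    # The value captured above decodes to a moment in December 2020.
    print(action_timestamp('1-0-1607136616096'))
```

Dividing the third field by 1000 gives a plausible capture date, which supports reading it as a millisecond timestamp rather than an encrypted value.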

The tricky variable encryption parameters

The first parameter

We open the developer tools as usual and look for where the encrypted parameters are produced.

Click the Initiator column, which shows where each request originates, and go straight into the first loadScript entry.

We find a string of unformatted JavaScript code. Paste it into any online formatter, then search the formatted code for the encrypted parameters to see how they are produced.

params.ptqrtoken=$.str.hash33($.cookie.get("qrsig"))
pt.ptui.login_sig=pt.ptui.login_sig||$.cookie.get("pt_login_sig");

We now have the source of the two encrypted parameters, and both appear to come from cookies.

  • The ptqrtoken parameter takes the value of the qrsig key in the cookie and passes it through hash33.

  • The login_sig parameter takes the value of the pt_login_sig key in the cookie.
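The two JavaScript assignments can be mirrored in Python roughly as follows. This is a sketch under assumptions: cookies are passed as a plain dict, the helper name derive_login_params is hypothetical, and hash33 is a port of the page's hash function, whose JavaScript source appears later in this article.

```python
def hash33(text):
    """Python port of $.str.hash33 from the page's JavaScript."""
    value = 0
    for char in text:
        value += (value << 5) + ord(char)
    return value & 2147483647


def derive_login_params(cookies, login_sig=''):
    """Mirror the two JavaScript assignments using a dict of cookie values."""
    return {
        # params.ptqrtoken = $.str.hash33($.cookie.get("qrsig"))
        'ptqrtoken': hash33(cookies.get('qrsig', '')),
        # pt.ptui.login_sig = pt.ptui.login_sig || $.cookie.get("pt_login_sig")
        'login_sig': login_sig or cookies.get('pt_login_sig', ''),
    }


if __name__ == '__main__':
    # Made-up cookie values, for illustration only.
    print(derive_login_params({'qrsig': 'ab', 'pt_login_sig': 'xyz'}))
```

Python's `or` plays the role of JavaScript's `||` here: the cookie is consulted only when no login_sig value is already set.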

Now that we know how the parameters are produced, let's find the cookies. There are not many places where these two values can be set, so we don't need to inspect every response.

  • One is when the login button is clicked and the pop-up appears.

  • The other is when the QR code or the QQ login information is loaded.

After refreshing, find the response for the pop-up login box. It is a GET request with the URL xui.ptlogin2.qq.com/cgi-bin/xlo…

The parameters are as follows:

  • appid: 716027609

  • daid: 383

  • style: 33

  • login_text: authorize and log in

  • hide_title_bar: 1

  • hide_border: 1

  • target: self

  • s_url: graph.qq.com/oauth2.0/lo…

  • pt_3rd_aid: 100497308

  • pt_feedback_link: support.qq.com/products/77…

To be sure, refresh several times to see whether any other parameters change. Fortunately, all of them are fixed values, so the URL can be requested directly.

Write the code

import requests

session = requests.Session()
params = {
    'appid': '716027609',
    'daid': '383',
    'style': '33',
    'login_text': 'authorize and log in',
    'hide_title_bar': '1',
    'hide_border': '1',
    'target': 'self',
    's_url': 'https://graph.qq.com/oauth2.0/login_jump',
    'pt_3rd_aid': '100497308',
    'pt_feedback_link': 'https://support.qq.com/products/77942?customInfo=.appid100497308',
}
response = session.get('https://xui.ptlogin2.qq.com/cgi-bin/xlogin?', params=params)
cookie = session.cookies
print(cookie)
# -> <RequestsCookieJar[
# -> <Cookie pt_clientip=1c1e24098914080000b07d1bd433ca8b619275ff for .ptlogin2.qq.com/>,
# -> <Cookie pt_guid_sig=f1d1eef00c25d5c6c6d8e2e991cb8b4f64bf619e97d242388d48887e4f0f93bf for .ptlogin2.qq.com/>,
# -> <Cookie pt_local_token=49508773 for .ptlogin2.qq.com/>,
# -> <Cookie pt_login_sig=BHH8t2gdwTlUjkRWg9xJ*vKp2v2-okQSrOV1q1QEyg*Z2uAbsqi18eiy*af*rvsb for .ptlogin2.qq.com/>,
# -> <Cookie pt_serverip=8b6a647434394161 for .ptlogin2.qq.com/>,
# -> <Cookie uikey=577ec007b515f37b7134decd61590dac2f03d036848870f20fe81c87cf7d7a95 for .ptlogin2.qq.com/>]>

After running it, we find the pt_login_sig value in the cookies and can save it in a variable for use in the params dictionary.

The second parameter: obtaining qrsig

Since the first parameter came from the login box, a blind guess is that the second one is set when the QR code is fetched. We already have the code for fetching the QR code, so without further ado, just grab the cookie.

Write the code

import random
import requests

session = requests.Session()
params = {
    'appid': '716027609',
    'e': '2',
    'l': 'M',
    's': '3',
    'd': '72',
    'v': '4',
    't': str(random.random()),
    'daid': '383',
    'pt_3rd_aid': '100497308',
}
response = session.get('https://ssl.ptlogin2.qq.com/ptqrshow?', params=params)
cookie = session.cookies
print(cookie)
# -> <RequestsCookieJar[<Cookie qrsig=4tlVhzwYo0FHzGeuen5Y-h5reR5cO*HjDyRQXcPedS*7MmOIYRENCN*BwY9JY1dD for .ptlogin2.qq.com/>]>

The qrsig cookie is returned along with the QR code, so we save it in a variable as well.

The second parameter: hash33 encryption

The qrsig value cannot be passed into the request directly; we need the result of running it through hash33. Click Search and look for hash33. There is only one hit, so open it and find the code.

Hash33 encryption algorithm

hash33: function hash33(str) {
    var hash = 0;
    for (var i = 0, length = str.length; i < length; ++i) {
        hash += (hash << 5) + str.charCodeAt(i)
    }
    return hash & 2147483647
}

Write as a Python program:

def __decryptQrsig(self, qrsig):
    # Python port of hash33: e = e * 33 + ord(c), masked to 31 bits at the end.
    e = 0
    for c in qrsig:
        e += (e << 5) + ord(c)
    return 2147483647 & e
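One thing worth checking is whether the Python port really matches the JavaScript, since JavaScript's << works on 32-bit signed integers while Python integers never overflow. The sketch below compares the straight port with an emulation of the 32-bit behaviour. The helper names to_int32 and js_hash33 are hypothetical, and the comparison assumes ASCII input such as the qrsig cookie.

```python
def hash33(text):
    """Straight Python port: arbitrary-precision ints, masked at the end."""
    value = 0
    for char in text:
        value += (value << 5) + ord(char)
    return value & 2147483647


def to_int32(number):
    """Emulate the 32-bit signed conversion JavaScript applies for << and &."""
    number &= 0xFFFFFFFF
    return number - 0x100000000 if number >= 0x80000000 else number


def js_hash33(text):
    """Emulate the JavaScript version, including 32-bit wraparound in <<."""
    value = 0
    for char in text:
        value += to_int32(to_int32(value) << 5) + ord(char)
    return to_int32(value) & 2147483647


if __name__ == '__main__':
    sample = '4tlVhzwYo0FHzGeuen5Y-h5reR5cO*HjDyRQXcPedS*7MmOIYRENCN*BwY9JY1dD'
    print(hash33(sample) == js_hash33(sample))
    print(hash33('ab'))  # 97 * 33 + 98 = 3299
```

Both versions agree because the final mask keeps only the low 31 bits, and the wraparound in << only affects bits above that.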

At this point all the encrypted parameters have been obtained, and the session information can be retrieved by requesting the login URL.

The complete code

import os
import re
import sys
import time
import random
import subprocess

import requests


def showImage(img_path):
    """Open the QR code image with the platform's default viewer."""
    try:
        if sys.platform.find('darwin') >= 0:
            subprocess.call(['open', img_path])
        elif sys.platform.find('linux') >= 0:
            subprocess.call(['xdg-open', img_path])
        else:
            os.startfile(img_path)
    except Exception:
        from PIL import Image
        img = Image.open(img_path)
        img.show()
        img.close()


def removeImage(img_path):
    """Close the viewer once verification is complete and delete the image."""
    if sys.platform.find('darwin') >= 0:
        os.system("osascript -e 'quit app \"Preview\"'")
    os.remove(img_path)


def saveImage(img, img_path):
    """Write the raw image bytes to img_path, replacing any existing file."""
    if os.path.isfile(img_path):
        os.remove(img_path)
    with open(img_path, 'wb') as fp:
        fp.write(img)


class qqmusicScanqr():
    is_callable = True

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)
        self.info = 'login in qqmusic in scanqr mode'
        self.cur_path = os.getcwd()
        self.session = requests.Session()
        self.__initialize()

    def login(self, username='', password='', crack_captcha_func=None, **kwargs):
        """Log in by scanning the QR code and return the logged-in session."""
        # Set proxies
        self.session.proxies.update(kwargs.get('proxies', {}))
        # Get pt_login_sig
        params = {
            'appid': '716027609',
            'daid': '383',
            'style': '33',
            'login_text': 'authorize and log in',
            'hide_title_bar': '1',
            'hide_border': '1',
            'target': 'self',
            's_url': 'https://graph.qq.com/oauth2.0/login_jump',
            'pt_3rd_aid': '100497308',
            'pt_feedback_link': 'https://support.qq.com/products/77942?customInfo=.appid100497308',
        }
        response = self.session.get(self.xlogin_url, params=params)
        pt_login_sig = self.session.cookies.get('pt_login_sig')
        # Get the QR code and the qrsig cookie
        params = {
            'appid': '716027609',
            'e': '2',
            'l': 'M',
            's': '3',
            'd': '72',
            'v': '4',
            't': str(random.random()),
            'daid': '383',
            'pt_3rd_aid': '100497308',
        }
        response = self.session.get(self.ptqrshow_url, params=params)
        saveImage(response.content, os.path.join(self.cur_path, 'qrcode.jpg'))
        showImage(os.path.join(self.cur_path, 'qrcode.jpg'))
        qrsig = self.session.cookies.get('qrsig')
        ptqrtoken = self.__decryptQrsig(qrsig)
        # Poll the QR code status until it is scanned or expires
        while True:
            params = {
                'u1': 'https://graph.qq.com/oauth2.0/login_jump',
                'ptqrtoken': ptqrtoken,
                'ptredirect': '0',
                'h': '1',
                't': '1',
                'g': '1',
                'from_ui': '1',
                'ptlang': '2052',
                'action': '0-0-%s' % int(time.time() * 1000),
                'js_ver': '20102616',
                'js_type': '1',
                'login_sig': pt_login_sig,
                'pt_uistyle': '40',
                'aid': '716027609',
                'daid': '383',
                'pt_3rd_aid': '100497308',
                'has_onekey': '1',
            }
            response = self.session.get(self.ptqrlogin_url, params=params)
            print(response.text)
            # NOTE: ptqrlogin replies with Chinese status text. The English
            # literals below are placeholders (the second is an assumption);
            # match them against the text the server actually returns.
            if 'QR code not invalid' in response.text or 'QR code being authenticated' in response.text:
                pass
            elif 'QR code invalid' in response.text:
                raise RuntimeError('Fail to login, qrcode has expired')
            else:
                break
            time.sleep(0.5)
        removeImage(os.path.join(self.cur_path, 'qrcode.jpg'))
        # Login succeeded
        qq_number = re.findall(r'&uin=(.+?)&service', response.text)[0]
        url_refresh = re.findall(r"'(https:.*?)'", response.text)[0]
        response = self.session.get(url_refresh, allow_redirects=False, verify=False)
        print('Account %s logged in successfully' % qq_number)
        return self.session

    def __decryptQrsig(self, qrsig):
        """Convert qrsig to ptqrtoken (the hash33 function)."""
        e = 0
        for c in qrsig:
            e += (e << 5) + ord(c)
        return 2147483647 & e

    def __initialize(self):
        """Set the request headers and the three login URLs."""
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36',
        }
        self.ptqrshow_url = 'https://ssl.ptlogin2.qq.com/ptqrshow?'
        self.xlogin_url = 'https://xui.ptlogin2.qq.com/cgi-bin/xlogin?'
        self.ptqrlogin_url = 'https://ssl.ptlogin2.qq.com/ptqrlogin?'
        self.session.headers.update(self.headers)


qq_login = qqmusicScanqr()
session = qq_login.login()
