Today I continue the Python crawler tutorial series. This installment covers reverse-engineering an app and capturing its data. Crawling an app by reverse analysis is more troublesome than crawling a website: the app first has to be checked for a packer (shell), then unpacked and decompiled.
Below I walk through how to reverse an app and capture its data, as a set of ideas for reference:
Equipment and environment required:
Device: Android phone
Packet capture: Fiddler + Xposed + JustTrustMe
Packer detection: ApkScan-PKID
Unpacking: frida-dexdump
Decompilation: jadx-gui
Hook: Frida
Packet capture
Install the app on the phone, set up the proxy, open Fiddler, and start capturing. It turns out that the app performs certificate verification: once Fiddler is running, the app reports that it cannot connect to the server:
The fix is to install the Xposed framework together with its JustTrustMe module. The module works by hooking: it hooks the classes that perform certificate verification so that the check is bypassed.
After reopening the app, you can see that the packets are captured successfully:
Both the formdata in the request body and the response content are ciphertext. The request and response contain very little useful information, so there is not even an obvious keyword to search for in jadx-gui. The only clue is that the formdata ends with two equal signs, which suggests Base64 encoding.
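The Base64 guess can be checked before any decompilation. The following is a minimal sketch, not part of the original workflow: it tests whether a captured value decodes as Base64 and whether the decoded length is a multiple of 16 bytes, which would be consistent with a block cipher such as AES. The function name `inspect_formdata` and the sample value are hypothetical; the real formdata from the capture is not reproduced here.

```python
import base64
import binascii

AES_BLOCK_SIZE = 16  # bytes


def inspect_formdata(value: str) -> dict:
    """Report whether `value` is valid Base64 and whether the decoded
    length is a whole number of 16-byte blocks."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return {"is_base64": False, "length": None, "block_aligned": False}
    return {
        "is_base64": True,
        "length": len(raw),
        "block_aligned": len(raw) > 0 and len(raw) % AES_BLOCK_SIZE == 0,
    }


# Hypothetical sample, NOT taken from the real capture:
# 16 raw bytes encode to 24 Base64 characters ending in "==".
sample = base64.b64encode(bytes(range(16))).decode("ascii")
print(sample, inspect_formdata(sample))
```

If the value was copied from a URL-encoded form body, it may need to be URL-decoded first (for example with `urllib.parse.unquote`) before this check is meaningful.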
Decompilation
Before decompiling the app, use a packer detection tool to check whether it has been hardened. Open ApkScan-PKID and drag the app in:
The app turns out to be hardened with 360 Jiagu, which adds another layer of protection. Download the frida-dexdump source code from GitHub. When the download finishes, open the project folder and, with the app running on the phone, run the following command from there:
python main.py
Wait for the unpacking to finish. A corresponding folder is generated in the project directory, containing a number of dex files:
Next, open the dex files with jadx-gui, usually starting from the largest file and working down, and search for keywords. Since the packet capture yielded so little information, the only lead is Base64: Java code that uses Base64 often references the BASE64Encoder class, so that is the keyword to search for. The search hit in the fourth dex file and turned up the suspected encryption code:
It shows that AES encryption is used and that the key is a fixed string.
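The "largest file first" search can be narrowed down before opening anything in jadx-gui. This is a rough sketch under one assumption: that the keyword appears as plain bytes in the dex string table, which is normally the case for ASCII class names. The helper name `find_keyword_in_dex` is made up for this example, and it only tells you which files are worth opening; it does not replace the search inside jadx-gui.

```python
from pathlib import Path


def find_keyword_in_dex(folder, keyword="BASE64Encoder"):
    """Return the .dex files in `folder` whose raw bytes contain `keyword`,
    ordered from largest to smallest file."""
    needle = keyword.encode("utf-8")
    dex_files = sorted(
        Path(folder).glob("*.dex"),
        key=lambda p: p.stat().st_size,
        reverse=True,
    )
    # Reading each file fully is fine here: dumped dex files are small
    # enough to fit in memory.
    return [p for p in dex_files if needle in p.read_bytes()]
```

For example, `for p in find_keyword_in_dex("path/to/dump/folder"): print(p.name)` lists the candidate files, where the folder path is whatever frida-dexdump created for the app.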
Frida Hook
Use Frida to hook the input and output parameters of the encrypt function:
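A hook of this kind could be sketched with Frida's Python bindings as follows. This is an illustration, not the script used in this article: `TARGET_PROCESS` and `TARGET_CLASS` are hypothetical placeholders that must be replaced with the real process name and the class found in jadx-gui, the method is assumed to have a single overload, and the script assumes a Frida setup in which the `Java` global is available to scripts loaded through `create_script`.

```python
import sys

# Hypothetical placeholders: take the real values from the device and
# from the decompiled code in jadx-gui.
TARGET_PROCESS = "Example App"
TARGET_CLASS = "com.example.app.util.AesUtil"


def build_hook_script(class_name, method="encrypt"):
    """Return Frida JavaScript that reports the arguments and the return
    value of `class_name.method` back to Python."""
    return """
Java.perform(function () {
    var cls = Java.use("%(cls)s");
    // Assumes a single overload; otherwise pick one with .overload(...).
    cls.%(m)s.implementation = function () {
        var args = Array.prototype.slice.call(arguments).map(String);
        var result = this.%(m)s.apply(this, arguments);
        send({method: "%(m)s", args: args, result: String(result)});
        return result;
    };
});
""" % {"cls": class_name, "m": method}


def on_message(message, data):
    """Print payloads sent by the script; show errors on stderr."""
    if message["type"] == "send":
        print(message["payload"])
    else:
        print(message, file=sys.stderr)


def main():
    import frida  # third-party: pip install frida

    device = frida.get_usb_device()
    session = device.attach(TARGET_PROCESS)
    script = session.create_script(build_hook_script(TARGET_CLASS, "encrypt"))
    script.on("message", on_message)
    script.load()
    sys.stdin.read()  # keep the hook alive until stdin is closed
```

Calling `main()` with the phone connected over USB and frida-server running prints one message per call to the hooked method. Passing `"decrypt"` as the method name to `build_hook_script` hooks the decrypt function in the same way.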
Capture packets at the same time for comparison:
Here is the plaintext input of the request data:
pageIndex: the current page number. pageSize: the number of data items per page.
typeId and source are fixed. Next, hook the decrypt function and use it to decrypt a captured response.
The results match, so the reverse analysis is done.
To sum up: the request parameters are encrypted by the encrypt function before being placed in the request body, and changing pageIndex fetches each page of data. The response is decrypted by the decrypt function before being displayed. All that remains is to implement the same AES encryption and decryption in Python. The decompiled Java code shows that the key is fixed, wxTdefGABcdawn12, and that no IV is used.
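A Python version of the two functions might look like the sketch below. The key comes from the decompiled code. Everything else is an assumption to verify against the Java source: ECB mode is inferred from the absence of an IV, PKCS#7 padding is assumed because it matches Java's PKCS5Padding, and the ciphertext is assumed to be Base64-encoded as seen in the capture. The cipher itself comes from the third-party pycryptodome package.

```python
import base64

KEY = b"wxTdefGABcdawn12"  # fixed 16-byte key from the decompiled code (AES-128)
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data: bytes) -> bytes:
    """Pad to a multiple of 16 bytes; a full block is added if already aligned."""
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def pkcs7_unpad(data: bytes) -> bytes:
    """Remove PKCS#7 padding, rejecting malformed input."""
    if not data:
        raise ValueError("empty input")
    n = data[-1]
    if not 1 <= n <= BLOCK or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-n]


def _cipher():
    from Crypto.Cipher import AES  # third-party: pip install pycryptodome

    # Assumption: ECB mode, inferred from "no IV" in the Java code.
    return AES.new(KEY, AES.MODE_ECB)


def encrypt(plaintext: str) -> str:
    """Encrypt a UTF-8 string and return Base64 ciphertext."""
    ciphertext = _cipher().encrypt(pkcs7_pad(plaintext.encode("utf-8")))
    return base64.b64encode(ciphertext).decode("ascii")


def decrypt(ciphertext_b64: str) -> str:
    """Decrypt Base64 ciphertext back to a UTF-8 string."""
    padded = _cipher().decrypt(base64.b64decode(ciphertext_b64))
    return pkcs7_unpad(padded).decode("utf-8")
```

A quick sanity check is to feed `encrypt` the plaintext seen in the Frida hook and compare the result with the captured formdata. If they differ, the mode or padding assumption is the first thing to revisit.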
Request
Straight to the code:
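A request loop along these lines can be sketched as follows. It is a sketch under stated assumptions, not the exact script: `API_URL`, the form field name `data`, the page size, and the `typeId` and `source` values are all hypothetical placeholders to be replaced with what the packet capture shows. The `encrypt` and `decrypt` arguments are the AES functions described above, passed in as callables so that the flow can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://example.invalid/api/list"  # hypothetical placeholder
FORM_FIELD = "data"  # hypothetical name of the encrypted form field


def build_params(page_index, page_size=20):
    """Plaintext request parameters; typeId and source are fixed values
    that must be copied from the hook output (placeholders here)."""
    return {
        "pageIndex": page_index,
        "pageSize": page_size,
        "typeId": "<fixed value>",
        "source": "<fixed value>",
    }


def http_post(url, form):
    """Send a form-encoded POST and return the response body as text."""
    body = urllib.parse.urlencode(form).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def fetch_page(page_index, encrypt, decrypt, post=http_post):
    """Encrypt the parameters, send them, and decrypt the JSON response."""
    ciphertext = encrypt(json.dumps(build_params(page_index)))
    raw = post(API_URL, {FORM_FIELD: ciphertext})
    return json.loads(decrypt(raw))
```

Looping `fetch_page(i, encrypt, decrypt)` over increasing values of `i` walks through the pages. Whether the real endpoint also expects particular headers is something only the captured request can tell.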
Run the code, and the data is retrieved successfully:
Data encryption is now very common: even a small app has several data protection mechanisms. This time only Java-layer encryption was involved. Next time I will cover how to hook native-layer encryption, active calls through Frida RPC, and the use of the reverse-engineering tool Inspeckage.
Finally, the above content is for learning and communication only.