Abstract: This article briefly introduces the methods and steps of Sechunter's mobile application privacy compliance detection, and how object detection technology is applied in it.

This article is shared from the Huawei Cloud community post “Introduction to Mobile Application Privacy Compliance Detection and Application of Target Detection Technology”, by Wolfrevo.

Summary:

With the widespread use of mobile devices, mobile applications have boomed in recent years. Drawing on the various sensors built into mobile devices, developers have created many feature-rich applications that gather large amounts of high-value private user data, including identity information, geographic location, account information and so on. While users enjoy the convenience these applications bring, their privacy is also seriously threatened. Against this background, mobile application privacy compliance detection has emerged. This article briefly introduces the methods and steps of Sechunter's mobile application privacy compliance detection and how object detection technology is applied in it.

1. Background of mobile application privacy compliance detection

In terms of technique, privacy compliance detection for mobile applications can be divided into static detection and dynamic detection. Each is introduced briefly below.

1.1 Static Detection

Static detection decompiles the application's installation package and then applies static data flow analysis, control flow analysis and related techniques to find possible privacy leaks in the application. The following tools are commonly used in this area:

  • Apktool [1]: decodes the resources and code of an Android APK, and repackages the APK after changes are made

  • Dex2jar [2]: converts the classes.dex file in an APK into a jar of Java class files, which a Java decompiler can then turn back into source

  • Soot [3]: originally a Java optimization framework, now widely used to analyze, optimize and visualize Java and Android applications

  • FlowDroid [4]: a static taint analysis framework for Android based on the IFDS algorithm

Building on the tools above, developers can implement detection items that correspond to the applicable compliance requirements and so find potential privacy leaks in applications.
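As a rough illustration of what a simple detection item could look like, the sketch below scans decompiled smali text (the format Apktool produces) for calls to a small watch list of sensitive APIs. The watch list, the sample smali snippet and the function name scan_smali are assumptions made for this example. It is plain pattern matching, not the data flow or taint analysis that Soot and FlowDroid perform.

```python
import re

# Hypothetical watch list: smali method references mapped to the data they expose.
SENSITIVE_APIS = {
    "Landroid/telephony/TelephonyManager;->getDeviceId": "device identifier",
    "Landroid/location/LocationManager;->getLastKnownLocation": "location",
    "Landroid/accounts/AccountManager;->getAccounts": "account information",
}

# Matches a smali invoke instruction and captures "Lclass;->method" before the "(".
INVOKE = re.compile(r"^\s*invoke-\w+(?:/range)?\s+\{[^}]*\},\s*(\S+?)\(")


def scan_smali(text):
    """Return (line_number, api, data_kind) for every call to a watched API."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = INVOKE.match(line)
        if match and match.group(1) in SENSITIVE_APIS:
            api = match.group(1)
            findings.append((number, api, SENSITIVE_APIS[api]))
    return findings


# Made-up smali fragment used only to exercise the scanner.
SAMPLE = """\
.method public onCreate()V
    invoke-virtual {v0}, Landroid/telephony/TelephonyManager;->getDeviceId()Ljava/lang/String;
    invoke-virtual {v1, v2}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    invoke-virtual {v3, v4}, Landroid/location/LocationManager;->getLastKnownLocation(Ljava/lang/String;)Landroid/location/Location;
.end method
"""

findings = scan_smali(SAMPLE)
for number, api, kind in findings:
    print(f"line {number}: {api} ({kind})")
```

A scan like this only shows that a sensitive API is referenced. Deciding whether the returned data actually leaves the device requires the data flow analysis described above.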

1.2 Dynamic Detection

Dynamic detection runs the application on a real phone or in an emulator sandbox, monitors its access to sensitive system resources, and analyzes its privacy policy statement to determine whether it contains privacy violations. The application can be driven manually or by UI automation.

1.2.1 Monitoring of sensitive behaviors

Runtime sensitive-behavior monitoring tracks, in real time, an application's access to privacy-sensitive data. There are two ways to implement it. The first adds monitoring code directly to the source code, for example inserting code into getLastLocation in the AOSP source to record access to that API. The second uses hooks: the source code is not modified; instead, hook logic is attached when the system runs the app. When the app calls a particular sensitive API, execution first jumps to the hook function, which records the API access, and then returns to call the original sensitive API.
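The hook idea can be sketched in a few lines of Python. This is only a conceptual illustration, not an Android hooking framework: the LocationManager class, its get_last_location method and the access_log list are hypothetical stand-ins for a system class, a sensitive API and the monitoring record.

```python
import functools

access_log = []  # one (api_name, arguments) entry per hooked call


def hook(owner, name):
    """Replace owner.name with a wrapper that logs the call, then runs the original."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        # args[0] is the instance (self); record only the caller-supplied arguments.
        access_log.append((f"{owner.__name__}.{name}", args[1:]))
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    return original  # returned so the hook can be removed later


class LocationManager:
    """Hypothetical stand-in for a system class exposing a sensitive API."""

    def get_last_location(self, provider):
        return (31.23, 121.47)  # dummy coordinates


hook(LocationManager, "get_last_location")
result = LocationManager().get_last_location("gps")
print(result, access_log)
```

The caller still receives the original return value, so the application behaves as before while every access is recorded.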

1.2.2 UI automation

Mobile application UI automation is the process of programmatically driving an application's UI interactions. A typical tool is monkey [5], which generates random UI clicks and system-level events. Third-party tools such as UIAutomator2 [6] and AndroidViewClient [7], both built on the system tool UIAutomator, support scripted basic UI test automation.
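UIAutomator-based tools work from an XML dump of the UI hierarchy, in which each node carries attributes such as text, clickable and bounds (written as [x1,y1][x2,y2]). As a minimal sketch, the function below finds the click point of a button by its label in such a dump. The sample XML is invented for illustration and find_click_point is a hypothetical helper. On a real device the dump would come from the automation tool itself, for example uiautomator2's dump_hierarchy().

```python
import re
import xml.etree.ElementTree as ET

# The bounds attribute has the form "[x1,y1][x2,y2]".
BOUNDS = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def find_click_point(hierarchy_xml, label):
    """Return the center (x, y) of the first clickable node whose text equals label."""
    root = ET.fromstring(hierarchy_xml)
    for node in root.iter("node"):
        if node.get("clickable") == "true" and node.get("text") == label:
            match = BOUNDS.fullmatch(node.get("bounds", ""))
            if match:
                x1, y1, x2, y2 = map(int, match.groups())
                return ((x1 + x2) // 2, (y1 + y2) // 2)
    return None  # no matching node: the layout does not expose this control


# Invented dump of a privacy consent dialog.
SAMPLE_DUMP = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" clickable="false" bounds="[0,0][1080,1920]">
    <node class="android.widget.Button" text="Disagree" clickable="true" bounds="[100,1500][500,1620]" />
    <node class="android.widget.Button" text="Agree" clickable="true" bounds="[580,1500][980,1620]" />
  </node>
</hierarchy>"""

print(find_click_point(SAMPLE_DUMP, "Agree"))  # (780, 1560)
```

When the function returns None, the control is not visible in the layout dump, which is the situation that image-based recognition is meant to handle.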

2. Application of object detection technology in privacy compliance detection

Object detection in deep learning is mainly used to determine the categories and positions of objects in an image, as shown in the figure below. Three major families of deep learning detection algorithms are currently used in industry: YOLO [8], SSD [9] and R-CNN [10].

Take Faster R-CNN as an example. It evolved from the R-CNN algorithm. Structurally, Faster R-CNN integrates feature extraction, proposal extraction, bounding box regression (box refinement) and classification into a single network, which greatly improves overall performance, especially detection speed. Faster R-CNN consists of four main parts:

1. Conv layers. As a CNN-based object detection method, Faster R-CNN first uses a stack of basic conv + ReLU + pooling layers to extract feature maps from the image. These feature maps are shared by the subsequent RPN and fully connected layers.

2. Region Proposal Network (RPN). The RPN generates region proposals. It uses softmax to classify anchors as positive or negative, and then applies bounding box regression to refine the anchors into accurate proposals.

3. RoI pooling. This layer takes the feature maps and the proposals as input, combines them to extract a feature map for each proposal, and passes these to the subsequent fully connected layers to determine the object category.

4. Classification. The proposal feature maps are used to compute each proposal's category, and bounding box regression is applied once more to obtain the final, precise position of the detection box.
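Bounding box regression appears in both step 2 and step 4. The sketch below shows the standard parameterization from the Faster R-CNN paper: the network predicts offsets (dx, dy, dw, dh) relative to an anchor, and these are decoded into a refined box. The function name decode_box and the corner format (x1, y1, x2, y2) are choices made for this example.

```python
import math


def decode_box(anchor, deltas):
    """Apply regression offsets (dx, dy, dw, dh) to an anchor given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = anchor
    dx, dy, dw, dh = deltas

    # Anchor width, height and center.
    aw, ah = ax2 - ax1, ay2 - ay1
    acx, acy = ax1 + aw / 2, ay1 + ah / 2

    # The center shifts in proportion to the anchor size; the size scales exponentially.
    cx = acx + dx * aw
    cy = acy + dy * ah
    w = aw * math.exp(dw)
    h = ah * math.exp(dh)

    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Zero offsets leave the anchor unchanged.
print(decode_box((0, 0, 100, 50), (0.0, 0.0, 0.0, 0.0)))
```

Predicting offsets relative to the anchor size, rather than absolute coordinates, keeps the regression targets in a similar range for small and large boxes.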

2.1 Application scenario

In UI automation, there are often UI layouts that UIAutomator-based tools cannot recognize. There are two main reasons: 1. The UI content is rendered as a single image. 2. Some custom UI controls do not support accessibility services, so UIAutomator cannot obtain their layout. In these cases, object recognition on a screenshot of the UI can determine the effective clickable areas.

As shown in the figure above, in Sechunter's UI automation we need to locate the link to the application's privacy statement and the corresponding “agree” and “disagree” buttons. When UIAutomator cannot obtain the UI layout, object recognition can be run on the screenshot to obtain the clickable positions, allowing the automated UI test to continue.

2.2 Application of object detection technology

In model training, the main difficulty lies in collecting the dataset. Sechunter's solution is to first gather a preliminary dataset with a traditional image processing approach, namely salient region detection. The key to this process is a verification module. In the case of the privacy statement link, this means verifying that the page reached after clicking the region is indeed a privacy statement; we use an LDA topic model to determine whether the text is a privacy policy. Verified samples are added to the dataset, and the resulting annotated data is used to train the first version of the object recognition model.
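The collect-then-verify loop can be sketched as follows. To keep the example self-contained, the verification step uses a simple keyword ratio as a stand-in for the LDA topic model; it is not an LDA implementation. The keyword list, the threshold, the simulated pages and all function names are assumptions for illustration.

```python
# Hypothetical vocabulary for the stand-in verifier.
PRIVACY_TERMS = {"privacy", "personal", "information", "collect", "consent", "data"}


def looks_like_privacy_policy(text, threshold=0.1):
    """Keyword-ratio stand-in for the topic-model check (NOT an LDA model)."""
    words = [w.strip(".,;:!()").lower() for w in text.split()]
    if not words:
        return False
    hits = sum(1 for w in words if w in PRIVACY_TERMS)
    return hits / len(words) >= threshold


def collect_verified(candidates, open_region):
    """Keep only candidate regions whose landing page passes verification.

    candidates: list of (screenshot_id, box) pairs proposed by image processing.
    open_region: callable that taps the box and returns the text of the next page.
    """
    dataset = []
    for screenshot_id, box in candidates:
        page_text = open_region(screenshot_id, box)
        if looks_like_privacy_policy(page_text):
            dataset.append({"image": screenshot_id, "box": box, "label": "privacy_link"})
    return dataset


# Simulated device: maps each candidate to the text that tapping it would reveal.
FAKE_PAGES = {
    ("shot1", (40, 900, 400, 950)):
        "Privacy policy: we collect personal information and data with your consent.",
    ("shot1", (40, 1000, 400, 1050)):
        "Welcome back! Check out the latest deals in our store today.",
}


def fake_open(screenshot_id, box):
    return FAKE_PAGES[(screenshot_id, box)]


dataset = collect_verified(list(FAKE_PAGES), fake_open)
print(dataset)
```

Because only verified regions are kept, the labels in the resulting dataset need no manual annotation, at the cost of discarding every region the image processing step failed to propose.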

This first model learns only from images that traditional image processing recognized successfully. For the images it failed on, we additionally use OCR, which identifies text and its position in the image. Fusing the OCR results with those of the first-stage recognition model yields more accurate clickable regions, and at this point the fused scheme becomes usable. As the dataset grows, the detection model's results become more accurate, and eventually only the object recognition model is used.

3. Summary

Mobile application privacy compliance detection plays an important role in protecting personal information security. However, the automated detection ability of tools on the market is generally limited. Sechunter is actively exploring automated privacy compliance detection and has conducted a number of cross-domain technology studies. This article introduced object recognition technology, which helps automated tools identify clickable areas of the UI more quickly and accurately.
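One possible way to fuse the two sources is sketched below. The article does not describe the actual fusion rule, so the preference order used here (a detector box confirmed by overlapping OCR text, then an OCR box alone, then the best detector box) and the IoU threshold are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(detections, ocr_results, keyword, min_iou=0.3):
    """Pick a click box for keyword from detector and OCR outputs.

    detections: list of (box, score) pairs from the detection model.
    ocr_results: list of (box, text) pairs from OCR.
    """
    wanted = keyword.strip().lower()
    text_boxes = [box for box, text in ocr_results if text.strip().lower() == wanted]

    # 1. Detector boxes that overlap a matching OCR text box.
    confirmed = [
        (score, box)
        for box, score in detections
        if any(iou(box, t) >= min_iou for t in text_boxes)
    ]
    if confirmed:
        return max(confirmed)[1]
    # 2. OCR found the text but the detector missed it.
    if text_boxes:
        return text_boxes[0]
    # 3. No OCR match: fall back to the highest-scoring detection.
    if detections:
        return max(detections, key=lambda d: d[1])[0]
    return None


detections = [((100, 100, 300, 160), 0.6), ((500, 500, 700, 560), 0.9)]
ocr_results = [((110, 105, 290, 155), "Agree")]
print(fuse(detections, ocr_results, "Agree"))  # (100, 100, 300, 160)
```

In the example, the lower-scoring detection wins because OCR confirms that it contains the wanted text, which is the kind of correction fusion is meant to provide while the model is still immature.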

References:

[1] Apktool: ibotpeaches.github.io/Apktool/

[2] Dex2jar: github.com/pxb1988/dex…

[3] Soot: soot-oss.github.io/soot/

[4] FlowDroid: blogs.uni-paderborn.de/sse/tools/f…

[5] Monkey: developer.android.com/studio/test…

[6] uiautomator2: github.com/openatx/uia…

[7] AndroidViewClient: github.com/dtmilano/An…

[8] YOLO3: github.com/ultralytics…

[9] SSD: github.com/amdegroot/s…

[10] Faster R-CNN: arxiv.org/abs/1506.01…