WeChat official account: Byte Array

Hope this helps 🤣🤣

Over the past two years, the Ministry of Industry and Information Technology has paid increasing attention to privacy, compliance and security issues in applications. Control over the Android platform is much stricter than over iOS, and many non-compliant applications have been taken off the shelves pending rectification.

To avoid privacy compliance and security problems, the following two points matter most:

  • Private data must not be collected before the user agrees to the privacy agreement. For example, the Android ID, Device ID and MAC address must not be read without the user’s consent.
  • After the user agrees to the privacy agreement, private data must not be collected more often than the service scenario strictly requires. For example, some applications send the Android ID of the current device as a header on every network request. If the Android ID is not cached, it is read on every request, and this excessive collection frequency is itself a compliance issue.
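The two rules above can be combined in a small wrapper. The following is a minimal sketch rather than code from the project: the class name `ConsentGatedValue` and both lambdas are hypothetical, and a real app would supply its own consent flag and its own Android ID read.

```kotlin
/**
 * Hypothetical helper: reads a privacy-sensitive value at most once,
 * and only after the user has agreed to the privacy agreement.
 */
class ConsentGatedValue(
    private val hasConsent: () -> Boolean, // assumption: reads a persisted "agreed" flag
    private val loader: () -> String       // assumption: the real sensitive read, e.g. Android ID
) {
    private var cached: String? = null

    @Synchronized
    fun get(): String {
        if (!hasConsent()) return ""   // never touch the sensitive API before consent
        cached?.let { return it }      // reuse the cached value instead of reading again
        return loader().also { cached = it }
    }
}
```

A network layer that adds the Android ID to every request header would call `get()` each time; the underlying read then happens once per process, and not at all before consent.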

These two problems call for different solutions:

  • For the first point, we need an inventory of all code in the project that performs privacy-related behavior, and then, following the business flow, we need to judge whether each behavior is reasonable and whether it can be triggered before the user agrees to the privacy agreement. This requires a static scan of the entire project.
  • For the second point, we need to record, while the application is running, the time at which each privacy behavior is triggered together with its call chain. The trigger times show whether the behavior happens too often, and the call chain helps identify which business logic is reading the private data. This requires dynamic recording inside the application.

Relying on developers to find this code by eye and add the logging by hand would be a huge and unrealistic amount of work. A large project usually pulls in many dependency libraries and third-party SDKs. We can regulate our own code, but we cannot modify or effectively constrain external dependencies, and it is hard to get a clear picture of their internal logic and call chains. In addition, whenever a privacy behavior is detected, a corresponding log report should be produced so that developers can troubleshoot problems during development.

For non-invasive requirements like these, we can rely on ASM + Transform. This article shows how to use ASM + Transform to help pass a privacy compliance and security audit.

Static scan

Before scanning, we first need to define which operations count as private. Here, reading the Device ID and Build.BRAND are used as examples. In the Activity, reading private data is simulated by clicking a button. The dynamic recording described later needs to capture the call chain from the button click down to DeviceUtils.getDeviceId.

package github.leavesc.asm.privacy_sentry

import android.app.Service
import android.content.Context
import android.os.Build
import android.os.Bundle
import android.telephony.TelephonyManager
import android.widget.Button
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity

class PrivacySentryActivity : AppCompatActivity() {

    private val btnGetDeviceId by lazy {
        findViewById<Button>(R.id.btnGetDeviceId)
    }

    private val btnGetDeviceBrand by lazy {
        findViewById<Button>(R.id.btnGetDeviceBrand)
    }

    private val tvLog by lazy {
        findViewById<TextView>(R.id.tvLog)
    }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_privacy_sentry)
        // Each button simulates one privacy-sensitive read
        btnGetDeviceId.setOnClickListener {
            tvLog.append("\n" + "deviceId: " + DeviceUtils.getDeviceId(this))
        }
        btnGetDeviceBrand.setOnClickListener {
            tvLog.append("\n" + "brand: " + DeviceUtils.getBrand())
        }
    }

}

object DeviceUtils {

    // Method call: TelephonyManager.getDeviceId()
    fun getDeviceId(context: Context): String {
        return try {
            val telephonyManager =
                context.getSystemService(Service.TELEPHONY_SERVICE) as? TelephonyManager
            telephonyManager?.deviceId ?: ""
        } catch (e: Throwable) {
            e.printStackTrace()
            ""
        }
    }

    // Field access: Build.BRAND
    fun getBrand(): String {
        return Build.BRAND
    }

}

At the bytecode level, these two privacy behaviors correspond to different kinds of instruction, which ASM's tree API represents as different node types:

  • MethodInsnNode. An instruction that invokes a method, corresponding to TelephonyManager.getDeviceId()
  • FieldInsnNode. An instruction that accesses a field, corresponding to Build.BRAND

The static scan iterates over all Class files during compilation. Whenever one of these two instructions is found in a file, it records the path of the class containing the private operation, the method signature of the caller, and the signature of the private operation itself to a text file.

A target instruction is identified by comparing signature information. For example, the owner class of Build.BRAND is android/os/Build (that is, android.os.Build), its type is String, whose descriptor is Ljava/lang/String;, and the field name is BRAND. Comparing these three attributes is enough to identify the target instruction.

private fun AbstractInsnNode.isHookPoint(): Boolean {
    when (this) {
        is MethodInsnNode -> {
            // TelephonyManager.getDeviceId()
            if (owner == "android/telephony/TelephonyManager"
                && name == "getDeviceId"
                && desc == "()Ljava/lang/String;"
            ) {
                return true
            }
        }
        is FieldInsnNode -> {
            // Build.BRAND
            if (owner == "android/os/Build"
                && name == "BRAND"
                && desc == "Ljava/lang/String;"
            ) {
                return true
            }
        }
    }
    return false
}
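As the list of privacy-sensitive APIs grows, the owner/name/descriptor comparison can be driven by data instead of a chain of conditions. The following is a minimal sketch that does not depend on ASM. `HookTarget`, `hookTargets` and `isHookTarget` are hypothetical names introduced for illustration; only the two entries themselves come from the article.

```kotlin
// Hypothetical model of a hook target. owner is the JVM internal name (with
// slashes) and desc is the JVM descriptor, in the form ASM reports them.
data class HookTarget(val owner: String, val name: String, val desc: String)

// The two privacy operations used as examples in this article.
val hookTargets: Set<HookTarget> = setOf(
    HookTarget("android/telephony/TelephonyManager", "getDeviceId", "()Ljava/lang/String;"),
    HookTarget("android/os/Build", "BRAND", "Ljava/lang/String;")
)

// True only when all three attributes match an entry exactly.
fun isHookTarget(owner: String, name: String, desc: String): Boolean =
    HookTarget(owner, name, desc) in hookTargets
```

Method and field entries can share one set because a method descriptor always starts with "(" and a field descriptor never does, so the two kinds cannot collide.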

Once a target instruction is found, the path of the class containing the private operation, the method signature of the caller, and the signature of the private operation are concatenated into one log entry.

private fun getLintLog(
    classNode: ClassNode,
    methodNode: MethodNode,
    hookInstruction: AbstractInsnNode
): StringBuilder {
    // Where the private operation was found
    val classPath = classNode.name
    val methodName = methodNode.name
    val methodDesc = methodNode.desc
    // The private operation itself
    val owner: String
    val desc: String
    val name: String
    when (hookInstruction) {
        is MethodInsnNode -> {
            owner = hookInstruction.owner
            desc = hookInstruction.desc
            name = hookInstruction.name
        }
        is FieldInsnNode -> {
            owner = hookInstruction.owner
            desc = hookInstruction.desc
            name = hookInstruction.name
        }
        else -> {
            throw RuntimeException("Illegal instruction")
        }
    }
    val lintLog = StringBuilder()
    lintLog.append(classPath)
    lintLog.append("->")
    lintLog.append(methodName)
    lintLog.append("->")
    lintLog.append(methodDesc)
    lintLog.append("\n")
    lintLog.append(owner)
    lintLog.append("->")
    lintLog.append(name)
    lintLog.append("->")
    lintLog.append(desc)
    return lintLog
}

During the Transform phase, each Class file is scanned for target instructions as it is processed. The collected information is written to a text file that is saved to the desktop.

class PrivacySentryTransform : BaseTransform() {

    private val allLintLog = StringBuffer()

    override fun modifyClass(byteArray: ByteArray): ByteArray {
        val classNode = ClassNode()
        val classReader = ClassReader(byteArray)
        classReader.accept(classNode, ClassReader.EXPAND_FRAMES)
        val methods = classNode.methods
        if (!methods.isNullOrEmpty()) {
            val tempLintLog = StringBuilder()
            for (methodNode in methods) {
                val instructions = methodNode.instructions
                val instructionIterator = instructions?.iterator()
                if (instructionIterator != null) {
                    while (instructionIterator.hasNext()) {
                        val instruction = instructionIterator.next()
                        // Check whether this is a target instruction
                        if (instruction.isHookPoint()) {
                            val lintLog = getLintLog(
                                classNode,
                                methodNode,
                                instruction
                            )
                            tempLintLog.append(lintLog)
                            tempLintLog.append("\n\n")
                        }
                    }
                }
            }
            if (tempLintLog.isNotBlank()) {
                allLintLog.append(tempLintLog)
            }
        }
        // The static scan only reads the class, so the bytes are returned unchanged
        return byteArray
    }

    override fun onTransformEnd() {
        if (allLintLog.isNotEmpty()) {
            // Save the scan results to a text file
            FileUtils.write(generateLogFile(), allLintLog, Charset.defaultCharset())
        }
    }

    private fun generateLogFile(): File {
        val time = SimpleDateFormat(
            "yyyy_MM_dd_HH_mm_ss",
            Locale.CHINA
        ).format(Date(System.currentTimeMillis()))
        // Save the file to the desktop
        return File(
            FileSystemView.getFileSystemView().homeDirectory,
            "PrivacySentry_${time}.txt"
        )
    }

}

When the Transform finishes, a text file is generated on the desktop that records the two privacy behaviors in DeviceUtils. Because the project also depends on androidx.core:core:1.7.0, TelephonyManagerCompat is scanned as well. The report shows that the String getImei(TelephonyManager) method in that class also reads the Device ID.

Note that PrivacySentryTransform supports incremental compilation, so not every build outputs a log file, and only a full build produces a complete report.

Dynamic record

Dynamic recording means that, while the application is running, every privacy behavior is logged with the time it was triggered and its call chain. The trigger times show whether the behavior happens too often, and the call chain helps identify which business logic needs the private data. For this we have to modify the Class files as well as read them, inserting logging and supporting logic.

Taking the getBrand() method as an example, dynamic recording automatically generates a log entry whenever the method is called and writes it to a file inside the application. The entry contains the call time, the signature information of Build.BRAND, and the call chain of the getBrand() method.

object DeviceUtils {

    // Before instrumentation
    fun getBrand(): String {
        return Build.BRAND
    }

    // After instrumentation
    fun getBrand(): String {
        val invokeTime = "xxx"
        val methodDesc = "xxx"
        val stackTrace = "xxx"
        writeToFile(invokeTime + methodDesc + stackTrace)
        return Build.BRAND
    }

}

The signature information for Build.BRAND can be reused from the static scan, and the call time can be taken at the moment the entry is written to the file. Obtaining the call chain of the getBrand() method takes a little more work.

In the example code in this article, getBrand() is triggered directly by a Button in the Activity, so the call chain is shallow. In real projects the call relationships are rarely that simple: A calls B, B calls C, C calls D, and D finally calls DeviceUtils. Chains that pass through many files are common, which is why the call chain of getBrand() has to be captured.

We can rely on Throwable for this. When an exception is thrown, its information includes the exact point of failure and the chain of calls that led there. Following the same idea, we can create a Throwable object inside getBrand() without throwing it and read its stack information, which indirectly gives us the call chain of the method.
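The Throwable trick can be tried in isolation before any bytecode is touched. The following is a minimal sketch using only the JDK. The function names `captureCallChain`, `loadProfile` and `readBrand` are hypothetical, and the two callers exist only to give the chain some depth.

```kotlin
import java.io.PrintWriter
import java.io.StringWriter

// Returns the current call chain as text by creating (not throwing) a Throwable.
fun captureCallChain(): String {
    val writer = StringWriter()
    Throwable().printStackTrace(PrintWriter(writer, true))
    return writer.toString()
}

// Hypothetical callers: loadProfile -> readBrand -> captureCallChain
fun loadProfile(): String = readBrand()

fun readBrand(): String = captureCallChain()
```

Calling `loadProfile()` returns text whose stack frames include both `readBrand` and `loadProfile`, which is the information needed to tell which business path triggered the read. Filling in a stack trace is comparatively expensive, so this kind of recording is better suited to debug or audit builds than to hot paths in production.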

In the Transform phase, a writeToFile() method is automatically generated for DeviceUtils and is executed just before Build.BRAND is accessed, which implements the dynamic recording.

object DeviceUtils {

    // Before instrumentation
    fun getBrand(): String {
        return Build.BRAND
    }

}


object DeviceUtils {

    // After instrumentation
    fun getBrand(): String {
        val methodDesc = "xxx"
        writeToFile(methodDesc, Throwable())
        return Build.BRAND
    }

    private fun writeToFile(log: String, throwable: Throwable) {
        // Render the stack trace of the Throwable into a string
        val byteArrayOutputStream = ByteArrayOutputStream()
        throwable.printStackTrace(PrintStream(byteArrayOutputStream))
        val stackTrace = byteArrayOutputStream.toString()
        val realLog = log + stackTrace
        // Write the realLog to the text file
        PrivacySentryRecord.writeToFile(realLog)
    }

}

With the approach settled, the coding that follows is simple and has two steps:

  • Automatically generate a writeToFile(log: String, throwable: Throwable) method for each class that contains private behavior
  • Insert a method call instruction in front of each bytecode instruction that represents private behavior; the inserted instruction calls the writeToFile method

Building on the static scan, the generateWriteToFileMethod method generates the writeToFile method, and each time a target instruction is found, the insertRuntimeLog method inserts the instructions that call writeToFile.
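The body of insertRuntimeLog is not shown in this article. The following is a minimal sketch of how such an insertion could look with ASM's tree API, not the author's implementation. It assumes that writeToFile is generated as a static method on the class being transformed and that the log text is passed in as a parameter; the function name `insertRuntimeLogSketch` is hypothetical.

```kotlin
import org.objectweb.asm.Opcodes
import org.objectweb.asm.tree.AbstractInsnNode
import org.objectweb.asm.tree.ClassNode
import org.objectweb.asm.tree.InsnList
import org.objectweb.asm.tree.InsnNode
import org.objectweb.asm.tree.LdcInsnNode
import org.objectweb.asm.tree.MethodInsnNode
import org.objectweb.asm.tree.MethodNode
import org.objectweb.asm.tree.TypeInsnNode

// Sketch only. Assumption: writeToFile(String, Throwable) exists as a STATIC
// method on classNode. `log` is the text built during the static scan.
fun insertRuntimeLogSketch(
    classNode: ClassNode,
    methodNode: MethodNode,
    hookInstruction: AbstractInsnNode,
    log: String
) {
    val call = InsnList().apply {
        add(LdcInsnNode(log))                                  // push the log string
        add(TypeInsnNode(Opcodes.NEW, "java/lang/Throwable"))  // allocate a Throwable
        add(InsnNode(Opcodes.DUP))                             // keep a reference for <init>
        add(MethodInsnNode(Opcodes.INVOKESPECIAL, "java/lang/Throwable", "<init>", "()V", false))
        add(
            MethodInsnNode(
                Opcodes.INVOKESTATIC, classNode.name, "writeToFile",
                "(Ljava/lang/String;Ljava/lang/Throwable;)V", false
            )
        )
    }
    // Both pushed values are consumed by the call, so the operand stack seen by
    // the hooked instruction is the same as before the insertion.
    methodNode.instructions.insertBefore(hookInstruction, call)
}
```

Because the inserted sequence temporarily needs extra operand stack slots, the class has to be written with ClassWriter.COMPUTE_MAXS (or COMPUTE_FRAMES) so that the maximum stack size is recalculated.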

class PrivacySentryTransform(private val config: PrivacySentryConfig) : BaseTransform() {

    companion object {

        private const val writeToFileMethodName = "writeToFile"

        private const val writeToFileMethodDesc = "(Ljava/lang/String;Ljava/lang/Throwable;)V"

    }

    private val allLintLog = StringBuffer()

    override fun modifyClass(byteArray: ByteArray): ByteArray {
        val classNode = ClassNode()
        val classReader = ClassReader(byteArray)
        classReader.accept(classNode, ClassReader.EXPAND_FRAMES)
        val methods = classNode.methods
        if (!methods.isNullOrEmpty()) {
            // Insertions are queued here and run after the scan loop has finished
            val taskList = mutableListOf<() -> Unit>()
            val tempLintLog = StringBuilder()
            for (methodNode in methods) {
                val instructions = methodNode.instructions
                val instructionIterator = instructions?.iterator()
                if (instructionIterator != null) {
                    while (instructionIterator.hasNext()) {
                        val instruction = instructionIterator.next()
                        if (instruction.isHookPoint()) {
                            val lintLog = getLintLog(
                                classNode,
                                methodNode,
                                instruction
                            )
                            tempLintLog.append(lintLog)
                            tempLintLog.append("\n\n")

                            if (mRuntimeRecord != null) {
                                taskList.add {
                                    // Insert an instruction calling writeToFile before instruction
                                    insertRuntimeLog(
                                        classNode,
                                        methodNode,
                                        instruction
                                    )
                                }
                            }
                            Log.log(
                                "PrivacySentryTransform $lintLog"
                            )
                        }
                    }
                }
            }
            if (tempLintLog.isNotBlank()) {
                allLintLog.append(tempLintLog)
            }
            if (taskList.isNotEmpty() && mRuntimeRecord != null) {
                taskList.forEach {
                    it.invoke()
                }
                val classWriter = ClassWriter(ClassWriter.COMPUTE_MAXS)
                classNode.accept(classWriter)
                // Generate the writeToFile method
                generateWriteToFileMethod(classWriter, mRuntimeRecord)
                return classWriter.toByteArray()
            }
        }
        // Nothing was instrumented: return the original bytes
        return byteArray
    }

    // ...

}

In the end, each time either of the two methods in DeviceUtils runs, an entry is written to a privacySentry.txt file that is created automatically in the externalCacheDir directory. The file records the exact time and the call chain of each privacy-related call, and with this call chain we can quickly locate which piece of business logic performs the sensitive operation.
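The writer behind PrivacySentryRecord.writeToFile is not shown in this article. The following is a minimal sketch of a timestamped, append-only writer. The class name `RecordWriterSketch` and the timestamp format are assumptions made for illustration and are not taken from the project.

```kotlin
import java.io.File
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale

// Hypothetical stand-in for the record writer; the real PrivacySentryRecord
// may work differently.
class RecordWriterSketch(private val logFile: File) {

    private val timeFormat = SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.US)

    // Appends one timestamped entry. Synchronized because instrumented code
    // may run on any thread.
    @Synchronized
    fun write(log: String, now: Date = Date()) {
        logFile.appendText("${timeFormat.format(now)}\n$log\n\n")
    }
}
```

On Android, the file could be created as File(context.externalCacheDir, "privacySentry.txt"), matching the location described above. Comparing the timestamps of consecutive entries for the same signature then shows how often a given privacy behavior is triggered.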

TODO

Static scanning plus dynamic recording can already identify most privacy compliance issues, but it is not fully safe. As the project changes, new privacy problems can be introduced at any time, and repeating the whole process above before every release is tedious.

The safest approach is to proxy privacy behavior dynamically through hooks. Applications generally record whether the user has agreed to the privacy agreement in a flag. We can check this flag before every read of sensitive data: if the user has not agreed, empty data is returned directly, otherwise the original operation is performed. In this way, privacy compliance can be guaranteed.

object DeviceUtils {

    // Before instrumentation
    fun getBrand(): String {
        return Build.BRAND
    }

    // After instrumentation
    fun getBrand(): String {
        if (!UserCache.serviceAgree) {
            return ""
        }
        return Build.BRAND
    }

}

Source code

Finally, here is the complete source code: ASM_Transform

For more examples of ASM in practice, see:

  • ASM bytecode instrumentation for double-click debouncing
  • ASM bytecode instrumentation for thread governance