1. Play a game

When I was a child, everyone played this game: within a set time, find the differences between two pictures. Whoever spots them all first wins, and the faster the better. Something like this:



This one up here is easy to find, but what about this one down here?



Well, some people will say, “I can!” For example, those freakishly superhuman contestants on the variety show “The Brain” really can.



Anyway, I can’t, and I don’t believe anyone reading this article can either ~ I have the most ordinary of brains.

2. Desire for the strongest brain

The color blocks above are like the application under test: one Rubik’s cube wall is the code on the master branch, the other is the code on the dev branch. What did dev change? Where are the differences? How far does the impact reach? What should we test?

In theory, comprehensive test coverage is something we can surely guarantee. So let’s take the first piece of code: it enumerates all sorts of order statuses, each status has its own business logic behind it, and some of them even interleave. If we design and cover test cases by Cartesian product or orthogonal methods, we get… lots and lots of cases.



  • So do you really have enough time to cover it all?

Developer: I changed a bit of code, please run a full regression for me later. Tester: OK (*** beep~~ ***). What, automation? Are you sure?



Testing has developed to the point where, without automation, you are almost embarrassed to call it testing, and a resume that doesn’t mention automation won’t get you anywhere. But is automation really the silver bullet of testing? No. Anyone who has done it has felt this deeply: automation is a luxury:

  • Initial development cost
  • Maintenance cost
  • How to use it well
  • Rational design of use cases
  • Lag behind new features

  • Again, are you sure you have really covered the code under test? Each piece of code is like a color block on the Rubik’s cube wall. In practice, black-box testing depends largely on the tester’s experience and is highly subjective, so it is easy to miss things, and when a problem surfaces after release, the finger-pointing begins…

  • Some of you may be thinking: what if someone just told us the answer, i.e. pointed out the differences on the Rubik’s cube wall? Wouldn’t that give us the test points to focus on?

Yes, we can ask developers to tell us which methods were changed this time, and if we have access to the code we can even analyze it ourselves. Great!



So the question arises again: is the developer’s description necessarily correct and comprehensive? Even if the developer describes the changes accurately, what about the other areas affected by those changes? The developer can’t easily confirm those either (otherwise why would we need testing~), and they may even quietly change code without telling you.



This is when we long for such a “strongest brain”:

  • It sees the differences at a glance (the logic changed this time)
  • It knows the impact range of those differences (narrowing down what needs to be tested)
  • One more glance tells it which parts the tests have covered (confirming test coverage)



In other words, to achieve a degree of precision in testing.

3. Brain composition

Based on the description above, we can roughly divide this into three dimensions:

  • Diff analysis
  • Call chain analysis
  • Coverage statistics

One by one:

3.1 Diff Analysis

Git will tell you what the difference is when you submit your code

Today we are going to introduce the AST (Abstract Syntax Tree), a tree representation of the abstract syntactic structure of source code, where each node represents a construct in the source.

Different languages have different parsers, which read the source code as a string, parse it, and build a syntax tree; this step is a necessary part of compiling a program.

Let’s take a look at the Java compilation process, focusing on steps 1 and 2:



Here we take a simple Java class, parse it into an AST, and see what it looks like:



Because the full tree has too many layers and is too complex, I pick the user attribute for a simple demonstration, as follows:



Each node contains very complete information, including the name, line number, and so on. For details, you can debug and inspect this on the online site astexplorer.net.

Now that we have all the information about the code, we can compare it to see where the code differs. (Of course, a lot of noise reduction is involved, such as filtering out comments, whitespace, and business-neutral code like get/set methods.)

In practice, it would be more convenient to use JavaParser to generate and manipulate the AST.
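As a rough illustration of that idea, here is a minimal sketch assuming the JavaParser library on the classpath (an example only, not the project’s actual implementation): it extracts method bodies from two versions of a source file and reports the methods whose bodies differ.

import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.MethodDeclaration;

import java.util.HashMap;
import java.util.Map;

public class MethodDiff {

    // Parse one version of a source file and map "method declaration" -> method body text
    static Map<String, String> methodBodies(String source) {
        CompilationUnit cu = StaticJavaParser.parse(source);
        Map<String, String> bodies = new HashMap<>();
        for (MethodDeclaration m : cu.findAll(MethodDeclaration.class)) {
            String key = m.getDeclarationAsString(false, false, false);
            bodies.put(key, m.getBody().map(Object::toString).orElse(""));
        }
        return bodies;
    }

    public static void main(String[] args) {
        String master = "class Demo { String add(int b) { return String.valueOf(1 + b); } }";
        String dev    = "class Demo { String add(int b) { return String.valueOf(2 + b); } }";

        Map<String, String> before = methodBodies(master);
        methodBodies(dev).forEach((decl, body) -> {
            // Report methods whose bodies differ between the two branches (added or changed)
            if (!body.equals(before.get(decl))) {
                System.out.println("Changed or added method: " + decl);
            }
        });
    }
}

In a real tool, the two source strings would come from the master and dev branches, and the noise-reduction rules mentioned above would filter out get/set methods, comments, and formatting-only changes before comparing.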

The general flow logic is as follows

3.2 Call chain analysis

3.2.1 Bytecode

When it comes to call chains, we have to talk about bytecode. Here’s a quick look at Java bytecode.

Java code runs by compiling .java files into .class bytecode files via javac, which the JVM then executes. A bytecode file therefore contains enough metadata to parse all the elements of a class: class name, superclass name, methods, fields, and the Java bytecode (instructions).



Take the following source code for example:

public class AccurateTest {

    private int a = 1;

    public String add(int b) {
        return String.valueOf(a + b);
    }
}

The javac -g AccurateTest.java command compiles AccurateTest.java into a bytecode file, and the javap -verbose AccurateTest.class command then disassembles AccurateTest.class, producing the following output:

Classfile /Users/qinzhen/Documents/My/TrainingProject/calctest/src/test/java/AccurateTest.class
  Last modified 2021-7-15; size 386 bytes
  MD5 checksum e67842e9b540c556d288c28b303298fb
  Compiled from "AccurateTest.java"
public class AccurateTest
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #4.#19         // java/lang/Object."<init>":()V
   #2 = Fieldref           #3.#20         // AccurateTest.a:I
   #3 = Class              #21            // AccurateTest
   #4 = Class              #22            // java/lang/Object
   #5 = Utf8               a
   #6 = Utf8               I
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               LAccurateTest;
  #14 = Utf8               add
  #15 = Utf8               (I)I
  #16 = Utf8               b
  #17 = Utf8               SourceFile
  #18 = Utf8               AccurateTest.java
  #19 = NameAndType        #7:#8          // "<init>":()V
  #20 = NameAndType        #5:#6          // a:I
  #21 = Utf8               AccurateTest
  #22 = Utf8               java/lang/Object
{
  public AccurateTest();                  // constructor
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1              // Method java/lang/Object."<init>":()V
         4: aload_0
         5: iconst_1
         6: putfield      #2              // Field a:I
         9: return
      LineNumberTable:
        line 1: 0
        line 3: 4
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      10     0  this   LAccurateTest;

  public java.lang.String add(int);
    descriptor: (I)Ljava/lang/String;
    Code:                                 // code start
      stack=2, locals=2, args_size=2
         0: aload_0
         1: getfield      #2              // Field a:I
         4: iload_1
         5: iadd
         6: invokestatic  #3              // Method java/lang/String.valueOf:(I)Ljava/lang/String;
         9: ireturn
      LineNumberTable:                    // maps the opcodes above to source line numbers
        line 6: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       7     0  this   LAccurateTest;
            0       7     1     b   I     // local variable
}
SourceFile: "AccurateTest.java"

From the above we can see intuitively that the bytecode contains all the information the JVM needs to run the code. The JVM has strict requirements for bytecode files: they must follow a fixed composition and order, which makes them well suited to being modified with the visitor pattern. That brings us to the core technology of call chain generation: ASM.

3.2.2 ASM

For bytecode manipulation we chose the ASM framework. ASM is a bytecode manipulation framework that lets you read, add, modify, and delete bytecode.

The ASM API is based on the visitor pattern and provides ClassVisitor, MethodVisitor, FieldVisitor, and so on. Whenever ASM scans a class field it calls back visitField; when it scans a class method it calls back a MethodVisitor; when it scans a class annotation it calls back an AnnotationVisitor; and so on.

For the information inside a method body, we can read and insert bytecode through the visitXXXInsn() methods provided by MethodVisitor. For example, when doing call chain analysis we used the visitMethodInsn method to filter and extract the call information inside method bodies.
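To make that concrete, here is a minimal sketch assuming the ASM library (an illustration of the idea, not the project’s actual code): a ClassVisitor whose MethodVisitor overrides visitMethodInsn to print caller -> callee edges for a single class file, with a deliberately simple noise filter.

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

import java.io.IOException;
import java.io.InputStream;

public class CallGraphScanner {

    // Print "caller -> callee" edges found in one class file
    public static void scan(InputStream classFile) throws IOException {
        ClassReader reader = new ClassReader(classFile);
        reader.accept(new ClassVisitor(Opcodes.ASM9) {
            private String owner;

            @Override
            public void visit(int version, int access, String name, String signature,
                              String superName, String[] interfaces) {
                this.owner = name;
            }

            @Override
            public MethodVisitor visitMethod(int access, String name, String descriptor,
                                             String signature, String[] exceptions) {
                String caller = owner + "." + name;
                return new MethodVisitor(Opcodes.ASM9) {
                    @Override
                    public void visitMethodInsn(int opcode, String calleeOwner, String calleeName,
                                                String calleeDescriptor, boolean isInterface) {
                        // Simple noise reduction: skip constructors and get/set calls
                        if (calleeName.equals("<init>") || calleeName.startsWith("get")
                                || calleeName.startsWith("set")) {
                            return;
                        }
                        System.out.println(caller + " -> " + calleeOwner + "." + calleeName);
                    }
                };
            }
        }, ClassReader.SKIP_DEBUG);
    }

    public static void main(String[] args) throws IOException {
        // The class file path is a placeholder for whatever class you want to scan
        try (InputStream in = CallGraphScanner.class.getResourceAsStream("/AccurateTest.class")) {
            scan(in);
        }
    }
}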

By matching and linking the information above, we can obtain the chain of parent (caller) nodes for each method and assemble our method call chain.

Of course, some noise reduction is needed here as well: we exclude get/set, second-party and third-party packages, toString, init, and other methods irrelevant to business analysis, so that the call chain stays focused on core business logic. Otherwise it becomes as tangled as an endless spider web and its practical value drops dramatically ~

The general process logic is as follows:

3.3 Coverage Statistics

Speaking of coverage statistics, let me introduce JaCoCo, the open source tool that currently dominates this space.

Using JaCoCo is basically like putting an elephant into a fridge, three steps:

  • 1. Instrument the bytecode of the application under test
  • 2. Collect and export the coverage data
  • 3. Aggregate the coverage data and generate a report

Let’s break down each of these steps

3.3.1 Bytecode instrumentation

JaCoCo’s instrumentation is itself built on bytecode technology, which again shows how powerful bytecode manipulation is.

Instrumentation is essentially installing monitoring probes. Our lines of code are like roads, the branches in the code (if/else) are like forks in the road, and instrumentation is equivalent to installing a camera at every intersection.



The following is an illustration of inserting probe information into the bytecode:



JaCoCo has two instrumentation modes:

  • on-the-fly mode (instrumentation at runtime)
    • By configuring -javaagent in the startup command, jacoco hooks into the class loading of the application under test and inserts probes into the class files. The probes do not change the behavior of the original methods; they only record whether they have been executed.
    • Advantages: no need to instrument the bytecode in advance, no classpath changes.
    • Disadvantages: requires modifying JVM parameters, which puts higher demands on the environment and is not applicable where the startup command cannot be modified.
  • offline mode (instrumentation at compile time)
    • Before testing, the class files or JAR packages are instrumented; during testing, the instrumented classes write coverage information to a file, which is finally processed to generate a report.
    • Advantages: shields the tool from dependence on the runtime environment.
    • Disadvantages: the code must be instrumented in advance, and coverage cannot be collected in real time; reports can only be generated after the application is stopped once testing is finished.

Our choice: considering our company’s actual usage scenario, coverage needs to be collected in real time, the on-the-fly approach does not require intruding into the application’s startup scripts, and our ops and dev teams can deploy the JavaAgent and configure the JVM startup parameters, so in the end we chose on-the-fly instrumentation.
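For reference, an on-the-fly startup command looks roughly like the following; the agent path, package pattern, and application jar name are placeholders, and output=tcpserver with port 6300 matches the export approach described in the next section.

java -javaagent:/path/to/jacocoagent.jar=output=tcpserver,address=*,port=6300,includes=com.example.* -jar your-app.jar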

3.3.2 Collecting and exporting coverage

Having seen the principle above, coverage collection should be easy to understand. Sticking with the surveillance camera analogy: when we test, the application is like a car driving around the whole road network; every line of code entered is like driving onto a road, and the camera on that road records it, so we know which roads the car has driven.

Similarly, when a line of code is covered, the probe records it, so in the end we know which lines of code have been covered.



As for exporting, coverage data is retrieved through the service port exposed by the agent (6300 by default) and exported as a .exec file, which contains the current coverage information.
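As a small illustration, the dump can be triggered from Java code using classes from jacoco’s org.jacoco.core.tools package; the host, port, and output file name below are just examples.

import org.jacoco.core.tools.ExecDumpClient;
import org.jacoco.core.tools.ExecFileLoader;

import java.io.File;
import java.io.IOException;

public class CoverageDumper {
    public static void main(String[] args) throws IOException {
        // Connect to the jacoco agent's tcpserver (default port 6300) and request a dump
        ExecDumpClient client = new ExecDumpClient();
        client.setDump(true);      // request execution data
        client.setReset(false);    // keep the counters on the agent after dumping

        ExecFileLoader loader = client.dump("127.0.0.1", 6300);

        // Save the execution data to a local .exec file
        loader.save(new File("coverage.exec"), false);
        System.out.println("Coverage data saved to coverage.exec");
    }
}

Setting reset to true instead would clear the agent’s counters after each dump, which can be useful when measuring coverage per test round.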

3.3.3 Statistics and report generation of coverage data

To compute coverage statistics, JaCoCo again relies on bytecode technology, and again it uses ASM.

By parsing the .exec file, jacoco can obtain the probe information for all methods, calculate coverage, color the code, and output a report:

The coloring of the code is as follows:

  • Red: not covered
  • Yellow: partially covered
  • Green: fully covered
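To give a feel for this step, here is a minimal sketch using jacoco’s core analysis API (the .exec path and the classes directory are placeholders); it computes per-class line coverage from an exported .exec file, which is essentially what report generation does before coloring the source.

import org.jacoco.core.analysis.Analyzer;
import org.jacoco.core.analysis.CoverageBuilder;
import org.jacoco.core.analysis.IClassCoverage;
import org.jacoco.core.analysis.ICounter;
import org.jacoco.core.tools.ExecFileLoader;

import java.io.File;
import java.io.IOException;

public class CoverageReport {
    public static void main(String[] args) throws IOException {
        // Load the execution data exported earlier
        ExecFileLoader loader = new ExecFileLoader();
        loader.load(new File("coverage.exec"));

        // Analyze the compiled classes against the execution data
        CoverageBuilder builder = new CoverageBuilder();
        Analyzer analyzer = new Analyzer(loader.getExecutionDataStore(), builder);
        analyzer.analyzeAll(new File("target/classes"));

        // Print per-class line coverage
        for (IClassCoverage cc : builder.getClasses()) {
            ICounter lines = cc.getLineCounter();
            System.out.printf("%s: %d/%d lines covered%n",
                    cc.getName(), lines.getCoveredCount(), lines.getTotalCount());
        }
    }
}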

In real usage, however, we care more about the modified code: during testing we focus on the scope of what was added and changed in this round of development. JaCoCo’s native features cannot satisfy this, because its native statistics are full coverage.

Let’s call the changed part the increment. We therefore did secondary development on the jacoco source code to support incremental coverage statistics for daily testing needs. Compared with the full-coverage view above, the statistical scope of the increment is clear and much smaller:

The general architecture logic is as follows:
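Conceptually, the incremental statistic boils down to intersecting the changed lines found by the diff analysis with jacoco’s per-line coverage. The following simplified sketch shows that idea only; it is not the actual modified jacoco source.

import org.jacoco.core.analysis.IClassCoverage;
import org.jacoco.core.analysis.ICounter;

import java.util.Set;

public class IncrementalCoverage {

    // Given the changed line numbers of a class (from the diff analysis) and its
    // jacoco line coverage, compute the incremental line coverage ratio.
    static double incrementalLineCoverage(IClassCoverage cc, Set<Integer> changedLines) {
        int executable = 0;
        int covered = 0;
        for (int line : changedLines) {
            int status = cc.getLine(line).getStatus();
            if (status == ICounter.EMPTY) {
                continue;                  // not an executable line (blank, comment, ...)
            }
            executable++;
            if (status != ICounter.NOT_COVERED) {
                covered++;                 // partly or fully covered
            }
        }
        return executable == 0 ? 1.0 : (double) covered / executable;
    }
}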

4. Future prospects

At present, everything above applies to analysis and statistics within a single application. Software architecture is becoming more and more complex, and with the popularity of microservices, the interactions between application services keep growing. Therefore, cross-application call links are also a focus of our attention.

If you modify a method or an interface, that interface may be called by N applications; once the interface has a problem, the impact is considerable. Or the interface itself may be fine, but upstream and downstream are incompatible and calls break, which also hurts product quality. So that is what we will focus on next.

Furthermore, a large proportion of our daily testing is interface testing, including automation, and there are many interface automation use cases. If the upper-level entry interfaces (HTTP, Dubbo, etc.) affected by a modification can be found through the call link, then the use cases that must be executed for this round of changes can be recommended through the association between interfaces and use cases, improving use case precision and giving a clearer test scope.

Also, if a changed interface has no associated use case, or coverage is still below standard after the use cases are executed, we can check for missed cases and add new ones to cover it.

For cross-application call chain analysis, based on our current research and discussions with the company’s middleware team, we decided to do secondary development on top of SkyWalking: obtain the call relationships between applications through its instrumentation-based monitoring, and finally stitch them together with the single-application links.

  • Advantages: the solution is relatively mature, there are proven cases in the industry, and the implementation difficulty is acceptable.
  • Disadvantages: the links are also collected by instrumentation-based monitoring, so a link only exists once traffic has actually passed through it. There is therefore a lag: if newly written code has not been exercised yet, its link will not be available.

5. I’m not omnipotent

Here are a few questions:

  • 1. If I have 100% code coverage, do I have complete test coverage and quality assurance?

    • Answer: No. If coverage is low, quality is certainly not guaranteed; but high coverage is only one dimension of assurance. It only tells us that the code was executed; what about the correctness of the code’s logic? Precision testing cannot judge that; you still have to write the assertions yourself. Furthermore, the covered code was written according to the business logic as the developer understood it. What if some requirement logic was missed entirely? Coverage says nothing about that.
  • 2. Do I have to ensure 100% coverage for all methods every time?

    • A: No. It is not easy to decide directly what value method coverage should reach. Some code logic, such as certain exception handlers, is extremely hard to trigger and can hardly be reached in daily testing, so that code cannot be covered and coverage cannot reach 100%.
  • 3. Following on from question 2, since 100% cannot be reached, should I set a threshold, say 80% or 90%, and call it done once that threshold is reached?

    • A: No. For some methods the logic is core logic whose branches all need to be covered; missing them carries the risk of undetected bugs, and they can theoretically be covered by tests, so such methods do need to reach 100% coverage.
  • 4. How to measure the coverage rate?

    • A: On the one hand, you can set a minimum threshold: even if some code paths cannot be exercised, they should not account for a large proportion, and a minimum level of coverage still needs to be guaranteed.

    On the other hand, testers need to make judgments based on their own business and build the ability and habit of code review; the platform is only a tool to assist testing. Finally, we can record the coverage of past test rounds and track the coverage trend of each service after it passes testing, then set thresholds or monitoring alarms based on that historical data: if coverage falls below the historical norm, an alarm is raised or the release is blocked.

6. Reference documents

Attached below are some references for this article, as well as documents that helped me while learning about precision testing. They involve more than what is listed, so search for more yourself if needed.

  • AST:

juejin.cn/post/684490… testerhome.com/topics/2381…

  • ASM:

Here I recommend Meituan’s article “Exploration of Bytecode Enhancement Technology”. The explanation is very detailed and thorough, and the core approach of our project is largely consistent with what the article describes.

tech.meituan.com/2019/09/05/…

zhuanlan.zhihu.com/p/94498015 www.jianshu.com/p/905be2a9a… www.jianshu.com/p/26e99d39b… jueee.github.io/2020/08/20/2… cloud.tencent.com/developer/a… www.jianshu.com/p/88be1658f…

  • Jacoco related

segmentfault.com/a/119000002… testerhome.com/topics/2063… testerhome.com/topics/1692… testerhome.com/topics/2212… blog.csdn.net/tushuping/a… Secondary development: Incremental code coverage tools

  • Solution and architecture design:
    • Youzan: precision testing practice
    • Content: code coverage principle and APP coverage acquisition practice
    • Dada Group: lightweight coverage architecture building practice
    • Kujiale: coverage platform development practice