Code scanning | the code quality control tool

Jinch Pan — Director of CODING Products

Head of the Tencent Cloud R&D platform, with ten years of experience in building R&D efficiency

Product owner for CODING scanning

A young man was smoking at the entrance of his office building. A passer-by said, "Do you know this stuff endangers your health? Didn't you notice the Warning on the cigarette pack?" The young man said, "That's okay. I'm a programmer." The passer-by asked, "So what?" The programmer replied, "We never care about Warnings, only Errors."

Having started with a laugh: this article is a “primer” for those who rarely use code scanning tools or know little about them. On the one hand, code scanning has real technical barriers. It involves lexical and syntax analysis, compile-time injection, pattern recognition, security, and more, so the subject can be hard to understand. On the other hand, the public still holds many misconceptions about code scanning products and the field, which greatly affects the experience of using them. Some people even equate lint/style checks with scanning, which is ironic.

Since the open trial, CODING code scanning has served more than 5,000 teams, helping development teams find a large number of potential code defects, security vulnerabilities, and non-compliant code in time. In this article I take some common scenarios as examples to explain the value and usage of code scanning in an accessible way, so that readers can understand it in depth and get started quickly, and so that code scanning products can deliver their full value in helping enterprises build DevSecOps.

What is the value of code scanning

Leaving aside the clichéd concepts of shifting quality left or building quality in, from a practical perspective code scanning is often the second step in a team’s transition to DevOps (the first is continuous integration/pipelines). One reason is that a pipeline that only runs compilation, packaging, and deployment is rather thin. The other is that code scanning has the lowest adoption cost compared with unit testing, API automation, and E2E automation. Taking Jenkins as an example, operations staff only need to install the SonarQube plugin in the Jenkins cluster and add one command to the Jenkinsfile to integrate code scanning, without any involvement from developers.

In addition, developers with some sense of code culture install plugins in their local IDE for local checks, so that the IDE flags syntax or style problems and can even fix them automatically.

The things that are easiest to get tend to be the most overlooked. The IDE’s automatic inspection/formatting and the pipeline’s silent execution tend to dilute the perception and value of code scanning: “I have done the style check, so my code check task is complete”; “the release cycle is too tight, I will look at the scan issues later”; “the problems found by code scanning are not a big deal”. In fact, major software and Internet companies spend millions of dollars every year on licenses for various scanning software (SonarQube, Coverity, Checkmarx, etc.), and the vendors behind them are all industry leaders valued at over a billion (Sonar received 45 million USD in financing in 2016, and Coverity was acquired for $375 million in 2014). Such a huge market value and such a low profile: why does this happen? To figure this out, let’s first look at what code scanning can help us find.

0. Programming syntax problems

Item 0 is listed because I don’t think this is even a code scanning problem. Many IDEs and plugins integrate syntax checking to help developers check, flag, and even automatically fix syntax problems during development. This solves some code quality problems, but it is the job of the syntax parser and has little to do with code scanning.

1. Code specification issues

Many readers may see this with an “oh, this again” look. This is by far the most common impression people have of code scanning: checking for comments, whether indentation uses spaces or tabs, whether braces go on the same line or the next, and so on. Checks like these easily cause controversy within a team, and because they don’t prevent the functional logic from working correctly (only a Warning, not an Error), many attempts at code scanning end there.

However, are code specifications really unimportant?

Without an explicit warning, would you notice inconsistent return values in a dynamic language, and the trouble they might cause later?
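A minimal sketch of this kind of inconsistency, assuming Python as the example language and with hypothetical function names: the first function returns a number on one path and silently falls through to None on the other, a pattern that linters such as pylint flag as inconsistent-return-statements.

```python
def find_discount(price, is_member):
    """Problematic: one branch returns a float, the other falls through to None."""
    if is_member:
        return price * 0.9
    # No explicit return here, so non-members get None.


def find_discount_fixed(price, is_member):
    """Every path returns a number."""
    if is_member:
        return price * 0.9
    return price
```

Calling find_discount(100.0, False) yields None, so any later arithmetic on the result fails far away from where the mistake was made.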

Without a specific warning, would you realize that an input parameter is a mutable object, which might introduce problems?
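One common form of this problem in Python is a mutable default argument. The sketch below uses hypothetical function names and is only an illustration of the pattern (pylint reports it as dangerous-default-value), not a full account of every mutable-parameter risk.

```python
def add_tag(tag, tags=[]):
    """Problematic: the default list is created once and shared by every call."""
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    """Use None as a sentinel and build a fresh list on each call."""
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

With add_tag, tags from earlier calls leak into later ones, because every call that omits the second argument appends to the same list object.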

If the nesting of for, if, and try keeps growing, how will anyone read this code?

After later changes to the model fields, will you remember that the same code needs to be changed in several places?

Specification-type scanning is the most effective way to deal with “poisoned code”. For collaborative projects too, if you want to avoid the scenario of “When I wrote this code, only God and I knew what it meant; a month later, only God knows”, following a uniform code specification is essential.
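To make the nesting concern concrete, here is a rough sketch of how a scanning rule might measure it, assuming Python source and the standard ast module. The function name and the sample are illustrative only; real tools use their own metrics and thresholds.

```python
import ast

# Block statements that add one level of nesting in this sketch.
NESTING_NODES = (ast.For, ast.While, ast.If, ast.Try, ast.With)


def max_nesting_depth(source):
    """Return the deepest level of nested for/while/if/try/with blocks.

    Simplification: an elif is stored as an If inside the parent's else
    branch, so long elif chains also count as nesting here.
    """

    def depth(node, current):
        if isinstance(node, NESTING_NODES):
            current += 1
        deepest = current
        for child in ast.iter_child_nodes(node):
            deepest = max(deepest, depth(child, current))
        return deepest

    return depth(ast.parse(source), 0)


# The sample is only parsed, never executed, so its names need not exist.
SAMPLE = """
for order in orders:
    if order.paid:
        try:
            for item in order.items:
                ship(item)
        except Exception:
            pass
"""
```

For SAMPLE the function reports a depth of 4 (for, if, try, for); a rule could then warn whenever the depth exceeds a limit the team has agreed on.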

2. Functional defects

Many people take their chances: “This is probably the only version of this code I’ll ever need to maintain, so it’ll be fine.” So let a code scan help you confirm: does your code really run correctly?

Are you sure your tests can catch all of these null pointer problems?

How hard is it to detect an array out-of-bounds problem through manual code review?

Not to mention memory leaks: without tools to help locate them, tracking memory down by hand takes real effort.

From this point of view, code scanning plays a role similar to testing. It is an effective means of ensuring that the application functions correctly, and it can uncover deeper technical problems more efficiently.
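As a small illustration of how easily an out-of-bounds access slips past review, consider the off-by-one below. It is a sketch in Python with hypothetical function names; in C or C++ the same mistake would read past the end of the array instead of raising an exception.

```python
def pairwise_diffs(values):
    """Problematic: the last iteration reads values[i + 1] past the end."""
    return [values[i + 1] - values[i] for i in range(len(values))]


def pairwise_diffs_fixed(values):
    """Stop one element early so values[i + 1] is always a valid index."""
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]
```

The two versions differ only by a "- 1", which is exactly the kind of detail a human reviewer skims over and a tool checks mechanically.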

3. Security defects

Some readers may also think, “My feature is very simple; it passed testing after a few clicks, and there is nothing else to worry about.” Besides satisfying users functionally, an application also needs to guard against malicious actors. Marriott was fined heavily for leaking user data. How sure can we be that we are not the next target?

Source: InfoQ Wanjia

An important entry point for dumping a database is SQL injection, and such problems can easily be found with code scanning tools.

Remote command execution is also a common way to compromise a target machine. Many widely used open source components have been found to have such vulnerabilities. Are you sure you are more security-conscious than Apache?

There are also many other attacks such as CSRF, XSS, XXE, deserialization, and so on. If every front-line programmer had to understand all of these well and carefully avoid them, the cost of control would skyrocket. Code scanning can quickly identify and locate risks, protecting digital assets at the lowest cost. Static application security testing (SAST), that is, static code analysis, is also one of the most basic and lowest-threshold checks in DevSecOps.
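A minimal sketch of the pattern a scanner looks for, using Python's built-in sqlite3 module with an in-memory database. The table, data, and function names are made up for illustration.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    """Vulnerable: user input is concatenated into the SQL text."""
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    """Parameterized: the value is passed separately from the SQL text."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the input x' OR '1'='1, lookup_unsafe returns every row in the table, while lookup_safe treats the whole string as a name and returns nothing.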

4. Public relations risk

“Stop joking, how can talking about code quality lead to a PR problem?” Don’t laugh; let’s look at a news story: “Vivo’s pop-up camera: rogue software detector or IQ tester?”

Source: Play

To put it simply, when an Android application retrieves camera parameters, the function it calls may cause the pop-up camera to rise, but onlookers do not look into the technical implementation details. “Turning on the camera to secretly photograph the user” was a public relations crisis in every sense, and it caused a storm that affected all parties. Around the same time, Tencent also put together an internal sensitive-API scanning scheme, using code scanning tools to scan for sensitive interfaces in projects and remind developers to check and confirm them, so as to prevent greater risks.
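A rough idea of what such a sensitive-API scan could look like is sketched below. It is not Tencent's internal scheme: the watch list, the function name, and the plain regex matching are all assumptions for illustration, and a real scanner would work on parsed code rather than raw text.

```python
import re

# Hypothetical watch list; a real one would come from the team's privacy review.
SENSITIVE_APIS = ["Camera.open", "getDeviceId", "getLastKnownLocation"]

# Match any listed name followed by an opening parenthesis, i.e. a call site.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(api) for api in SENSITIVE_APIS) + r")\s*\("
)


def scan_sensitive_apis(sources):
    """Return (file, line_number, api) for each call on the watch list.

    `sources` maps a file name to its source text.
    """
    findings = []
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in _PATTERN.finditer(line):
                findings.append((path, lineno, match.group(1)))
    return findings
```

Each finding is a prompt for the developer to confirm that the call is intended and disclosed, rather than an automatic verdict that the code is wrong.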

How should code scanning be used

You have probably come to appreciate the value code scanning can bring to your team: a low-barrier, non-intrusive way to ensure code quality and security. So you download SonarQube, SpotBugs, Checkstyle, and so on, and with simple configuration run them locally or in a Jenkins pipeline. But if code scanning is mostly a local, offline tool, why does CODING provide code scanning online?

Local scanning, with rules synchronized with the remote

Even with local scanning, we do not want local rules to differ from remote rules, so that code passes the local scan, is submitted, and is then rejected. The most reasonable way to solve this is IaC, where the scanning scheme and filter conditions are stored as local configuration files.

However, not every configuration item can be managed in a tool’s local rule files, for example filter conditions, comparison branches, and other items strongly tied to the usage scenario. There are two ways to address this need:

Users complete a unified configuration (including tool rules, filter conditions, and comparison branches) on the platform, which generates a configuration ID once the configuration is complete. Local scanning then runs against the remote configuration ID instead of a local configuration file.

codedog_client localscan --config 001

Localize the platform configuration: the scanning platform defines a complete rule format. This configuration is followed not only in local scanning but also on the platform, which can parse the configuration file to generate a visual display, thus achieving a unified IaC configuration.

Down to the individual: turning problems into responsibilities

Local scanning can find problems, but it is hard to tell who introduced a problem and when, which leaves room for disputes and buck-passing. Based on the commit history, the platform can trace back to when the code was changed and find the person responsible, so that problems can be tracked from the responsible person’s view, or even turned into dedicated bugs for follow-up. It is only fair that whoever pollutes cleans up.

In addition, the platform can automatically close issues that have been fixed, based on the results of the next scan, saving manual work.
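How the platform matches issues between scans is not described here, so the following is only a sketch of one plausible approach: give each issue a fingerprint and compare the set of open issues with the latest scan. The Issue fields and the reconcile function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Fingerprint of a finding (assumed fields, hashable because frozen)."""
    rule: str
    path: str
    snippet: str  # offending code text, more stable than a line number


def reconcile(open_issues, latest_scan):
    """Split issues into (still_open, newly_found, auto_closed) sets."""
    still_open = open_issues & latest_scan
    newly_found = latest_scan - open_issues
    auto_closed = open_issues - latest_scan  # no longer reported, so fixed
    return still_open, newly_found, auto_closed
```

Using the code snippet rather than the line number in the fingerprint is a design choice in this sketch: it keeps an issue from being closed and reopened just because unrelated edits shifted it down the file.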

Code base quality tracking

Archiving issues on the platform also gives a clear picture of a repository’s code quality trend, such as the point in time when newly introduced issues caused overall quality to deteriorate, or the point when clearing historical debt improved quality. Visualized quality trends can also help team managers decide whether they need to sound the alarm about their team’s code quality.

Quality gates: let bygones be bygones

With local scanning alone, there will still be “carefree” developers who push directly to the remote without fixing problems. In that case, you can use the quality gate feature provided by the platform to intercept them. A quality gate defines the number of issues the repository currently allows; when the number of issues exceeds it, the commit or merge request is blocked.

Typically, the first scan of a legacy project turns up hundreds of legacy issues, and the team is unlikely to set aside time to fix them all at once, which leads to “giving up before getting started”. For this scenario, our advice is to set the MR quality gate on the number of new issues, ensuring that new or changed code does not introduce new code quality problems. Control the increment while gradually cleaning up the backlog (whichever file a business requirement touches, fix the code quality problems in that file along the way), and code quality will slowly get back on track.
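The gate on new issues can be pictured with the sketch below. It assumes each issue is represented as a (rule, file, snippet) tuple and that scan results are available for both the target branch and the merge request; the function name and threshold parameter are hypothetical, not the platform's actual API.

```python
def quality_gate(baseline_issues, merge_request_issues, max_new_issues=0):
    """Pass the gate only if the MR adds at most `max_new_issues` issues.

    Legacy issues already present on the target branch are ignored, so an
    old project with hundreds of findings is not blocked on day one.
    Returns (passed, sorted list of new issues).
    """
    new_issues = sorted(set(merge_request_issues) - set(baseline_issues))
    return len(new_issues) <= max_new_issues, new_issues
```

An MR that fixes an old issue and adds none passes, while an MR that adds a single new finding is blocked under the default threshold of zero.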

Conclusion

To some extent, we recognize that the larger the team, the greater the need for code scanning tools to help it improve standards and efficiency when facing issues of compliance and complexity. For SMBs and individual developers, code scanning also remains the cheapest quality improvement tool to adopt. I hope the cases and scenarios above help readers quickly locate and solve the problems in their projects, pay attention to every line of code in each iteration, and carry on a code culture of excellence.
