To date, Qiniuyun has served more than 700,000 customers with an increasingly clear and rich product matrix. Focusing on rich media scenarios, Qiniuyun has launched products such as object storage, fusion CDN acceleration, container cloud, a big data platform, a deep learning platform, and an intelligent log analysis platform, and provides one-stop intelligent video cloud solutions. How to guarantee the quality of these products is a problem that the Qiniuyun Engineering Efficiency Department has long worked on and explored. Below, I briefly introduce our concrete practices and some of our thinking in this area, in the hope that it will be useful to you.
The overall testing strategy is mainly implemented in four directions:
• Ensure base code quality
• Build business test coverage
• Increase quality monitoring
• Popularize and improve process specifications
Ensure base code quality
First, product quality cannot simply be tested into a product. It depends on every stage of the software life cycle, and the earlier quality assurance activities start, the more effective they are. So we pay close attention to the quality of the base code before testing begins. In terms of process control, we focus on reviewing three deliverables: the requirements review, the architecture review, and our test design review, to ensure that the product meets the whole team's expectations for planning, implementation, and acceptance criteria.
We are also promoting and exploring how testing techniques can better safeguard the underlying code automatically. For example, we periodically run static scans over all code repositories and feed the findings back to developers. We promote code review best practices by collecting language-level programming conventions. At the unit testing level, we also focus on strengthening coverage of the basic services.
At Qiniuyun, unit test coverage of core business libraries, such as the core erasure-code storage system and the self-developed HTTP cache service, reaches 80%. We also automatically compute unit test coverage for each GitHub PR, so that the coverage details of each submission are clearly visible and give clear guidance to developers and code reviewers.
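As a rough illustration of a per-PR coverage check, the sketch below computes statement coverage from the text of a Go cover profile (the file written by `go test -coverprofile`) and compares it with the 80% figure mentioned above. It is not our actual tooling: the function name `coveragePercent`, the sample file paths, and the assumption that each code block appears only once in the profile are all illustrative.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// coveragePercent computes statement coverage from a Go cover profile.
// Each data line has the form:
//   file.go:startLine.startCol,endLine.endCol numStatements hitCount
// Simplifying assumption: every block is listed once (no merging of duplicates).
func coveragePercent(profile string) (float64, error) {
	var total, covered int
	sc := bufio.NewScanner(strings.NewReader(profile))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "mode:") {
			continue // skip blank lines and the "mode: set|count|atomic" header
		}
		fields := strings.Fields(line)
		if len(fields) != 3 {
			return 0, fmt.Errorf("malformed profile line: %q", line)
		}
		stmts, err := strconv.Atoi(fields[1])
		if err != nil {
			return 0, fmt.Errorf("bad statement count in %q: %w", line, err)
		}
		hits, err := strconv.Atoi(fields[2])
		if err != nil {
			return 0, fmt.Errorf("bad hit count in %q: %w", line, err)
		}
		total += stmts
		if hits > 0 {
			covered += stmts
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	if total == 0 {
		return 0, nil
	}
	return 100 * float64(covered) / float64(total), nil
}

func main() {
	// Hypothetical profile for illustration; real paths and counts will differ.
	profile := `mode: set
example.com/store/ec.go:10.2,12.3 4 1
example.com/store/ec.go:14.2,16.3 4 1
example.com/store/ec.go:18.2,20.3 2 0
`
	pct, err := coveragePercent(profile)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	const threshold = 80.0 // the coverage level cited for core libraries
	fmt.Printf("coverage: %.1f%% (threshold %.0f%%, pass=%t)\n", pct, threshold, pct >= threshold)
}
```

A check like this could run in CI on each PR and post the result as a comment; how the result is reported is left out of the sketch.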

Build business test coverage
With the quality of the base code assured, we also need to look at product quality from the business perspective and do a good job of quality inspection and acceptance.
Let's start with the iteration workflow. When R&D submits a requirement for testing, we first check the entry criteria for this stage, such as whether unit tests meet the standard and whether the code has been reviewed. If the criteria are not met, the submission is sent back; if they are met, it moves into the to-be-tested stage. Then, with a full understanding of the requirements, we study and analyze the details of the technical implementation in depth. On that basis we design or supplement specific test scenarios and then carry out the final test execution. Once R&D receives our acceptance conclusion, it reviews the results for any obvious omissions. This cross-checking mechanism, coupled with white-box testing that goes deep into the code details, ensures strict quality control throughout delivery.
Second, the testing strategy itself follows a layered approach. It covers acceptance of the interface behavior of individual services, integration testing across multiple services, and final scenario acceptance at the system level.
Of course, in addition to regular testing methods, there are a few other aspects of cloud service testing that we focus on.
Acceptance testing in routine concurrent scenarios
Cloud services are generally distributed, highly available architectures. When we test an interface, one successful request does not mean the interface is correct. Many problems only surface under load in concurrent scenarios, so this point needs special attention in routine acceptance testing. Over time we have built up considerable experience with the Go concurrency model and related testing frameworks, which lets us do this easily in everyday iterations.
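To make the point concrete, here is a minimal sketch of a concurrent acceptance check in Go. It drives a handler from many goroutines at once and then verifies both that no request failed and that the server-side state is consistent. The handler `newCounterHandler`, the helper `hammer`, and the request path are hypothetical stand-ins, not part of any Qiniuyun service.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// newCounterHandler returns a hypothetical handler that counts requests.
// The atomic increment is what keeps the shared counter correct under load.
func newCounterHandler(hits *int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(hits, 1)
		w.WriteHeader(http.StatusOK)
	})
}

// hammer calls the handler from `workers` goroutines, `perWorker` times each,
// and returns how many responses were not 200 OK.
func hammer(h http.Handler, workers, perWorker int) int64 {
	var failures int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				rec := httptest.NewRecorder()
				req := httptest.NewRequest(http.MethodPut, "/objects/demo", nil)
				h.ServeHTTP(rec, req)
				if rec.Code != http.StatusOK {
					atomic.AddInt64(&failures, 1)
				}
			}
		}()
	}
	wg.Wait() // all workers are done before we read the results
	return atomic.LoadInt64(&failures)
}

func main() {
	var hits int64
	failures := hammer(newCounterHandler(&hits), 50, 100)
	fmt.Printf("failures=%d, hits=%d (expected %d)\n", failures, atomic.LoadInt64(&hits), 50*100)
}
```

Running a check like this under Go's race detector (`go test -race` or `go run -race`) helps surface data races that a single sequential request would never trigger.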
High availability testing
While normal acceptance testing verifies that the system does the right thing under normal conditions, high availability testing asks whether the system still works when the environment the service depends on does not behave as expected. In practice, with multiple data centers and a massive number of machines, the probability of infrastructure problems is very high. Common faults such as disk damage, network failure, and machine downtime can occur at any time. Each fault may lead to data loss, cascading system failure (an avalanche), or other major disasters, causing incalculable losses to the business. So high availability testing is especially important for cloud services.
A very typical example is how we validated the deletion and reclamation function of Qiniuyun's core storage engine. In the test environment, we continuously simulated all kinds of file upload, download, and random deletion requests, while also injecting random fault scenarios into the whole storage system, such as services hanging and disks being damaged. The test ran continuously under these conditions for more than a month, and acceptance was granted only once we had confirmed that none of the expected data was lost or damaged. This shows how strict the standard is.
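The shape of such a test can be sketched in a few dozen lines of Go. The toy below is only an illustration under strong assumptions: the "storage system" is an in-memory store with three replicas, faults are limited to wiping one replica at a time, and all names (`cluster`, `failOne`, `repairAll`, `soak`) are hypothetical. What it does show is the pattern: random uploads, deletes, and faults, checked after every step against a model of what the data should be.

```go
package main

import (
	"fmt"
	"math/rand"
)

const numKeys = 20

// replica is one in-memory copy of the data; up=false simulates a crashed
// service or a damaged disk.
type replica struct {
	up   bool
	data map[string]string
}

// cluster is a toy replicated store, for illustration only.
type cluster struct{ replicas []*replica }

func newCluster(n int) *cluster {
	c := &cluster{}
	for i := 0; i < n; i++ {
		c.replicas = append(c.replicas, &replica{up: true, data: map[string]string{}})
	}
	return c
}

func (c *cluster) put(k, v string) {
	for _, r := range c.replicas {
		if r.up {
			r.data[k] = v
		}
	}
}

func (c *cluster) del(k string) {
	for _, r := range c.replicas {
		if r.up {
			delete(r.data, k)
		}
	}
}

// get reads from the first healthy replica.
func (c *cluster) get(k string) (string, bool) {
	for _, r := range c.replicas {
		if r.up {
			v, ok := r.data[k]
			return v, ok
		}
	}
	return "", false
}

// failOne wipes one random healthy replica, but always leaves one alive.
func (c *cluster) failOne(rng *rand.Rand) {
	var up []*replica
	for _, r := range c.replicas {
		if r.up {
			up = append(up, r)
		}
	}
	if len(up) < 2 {
		return
	}
	victim := up[rng.Intn(len(up))]
	victim.up, victim.data = false, map[string]string{}
}

// repairAll rebuilds every failed replica from a healthy one.
func (c *cluster) repairAll() {
	var src *replica
	for _, r := range c.replicas {
		if r.up {
			src = r
			break
		}
	}
	if src == nil {
		return
	}
	for _, r := range c.replicas {
		if !r.up {
			for k, v := range src.data {
				r.data[k] = v
			}
			r.up = true
		}
	}
}

// soak runs random operations and faults, verifying all keys after each step.
func soak(seed int64, steps int) error {
	rng := rand.New(rand.NewSource(seed))
	c, model := newCluster(3), map[string]string{}
	for step := 0; step < steps; step++ {
		key := fmt.Sprintf("obj-%d", rng.Intn(numKeys))
		switch rng.Intn(4) {
		case 0: // upload
			val := fmt.Sprintf("v%d", step)
			c.put(key, val)
			model[key] = val
		case 1: // delete
			c.del(key)
			delete(model, key)
		case 2: // inject a fault
			c.failOne(rng)
		case 3: // recover
			c.repairAll()
		}
		for i := 0; i < numKeys; i++ {
			k := fmt.Sprintf("obj-%d", i)
			want, wantOK := model[k]
			got, gotOK := c.get(k)
			if want != got || wantOK != gotOK {
				return fmt.Errorf("step %d: key %s: want (%q,%t), got (%q,%t)", step, k, want, wantOK, got, gotOK)
			}
		}
	}
	return nil
}

func main() {
	if err := soak(1, 10000); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("no expected data lost or damaged")
}
```

A real long-running test differs in scale and in the variety of faults, but the acceptance condition is the same idea: after every fault, everything the model says should exist is still readable and intact, and everything deleted stays deleted.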
Maximize test coverage
Cloud services serve a large number of users, and any serious error is intolerable. But the test matrix is effectively infinite, and with limited testing manpower we cannot throw ourselves blindly into unbounded testing. So we have been exploring how to improve test coverage. At present there are two main directions. The first is precision testing: we have developed a system test coverage statistics system for Go, which reflects the test coverage level accurately at the source code level and supports our daily iterations. The second is traffic replay: each iteration is checked by replaying a copy of real production traffic to ensure that there are no regressions. This approach has been applied to the routine quality assurance of the Qiniuyun CDN cache system, with remarkable results.
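The traffic replay idea can be sketched as follows: the same recorded requests are sent to the current build and to the candidate build, and any difference in the responses is reported as a possible regression. This is a simplified illustration, not the system used for the CDN cache. The types `recordedRequest` and `mismatch`, the two example handlers, and the choice to compare only status code and body are all assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// recordedRequest is a hypothetical, minimal capture of one production request.
type recordedRequest struct {
	Method string
	Path   string
}

// mismatch describes one request whose responses differ between builds.
type mismatch struct {
	Req              recordedRequest
	OldCode, NewCode int
	OldBody, NewBody string
}

// replay sends one recorded request to a handler and returns status and body.
func replay(h http.Handler, r recordedRequest) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(r.Method, r.Path, nil))
	return rec.Code, rec.Body.String()
}

// diffReplay replays the traffic against both builds and collects differences.
func diffReplay(baseline, candidate http.Handler, traffic []recordedRequest) []mismatch {
	var out []mismatch
	for _, r := range traffic {
		oldCode, oldBody := replay(baseline, r)
		newCode, newBody := replay(candidate, r)
		if oldCode != newCode || oldBody != newBody {
			out = append(out, mismatch{r, oldCode, newCode, oldBody, newBody})
		}
	}
	return out
}

// baselineHandler stands in for the currently released build.
func baselineHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "cached:%s", r.URL.Path)
	})
}

// candidateHandler stands in for the new build, with a deliberate regression.
func candidateHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/b.jpg" {
			http.NotFound(w, r) // the injected regression
			return
		}
		fmt.Fprintf(w, "cached:%s", r.URL.Path)
	})
}

func main() {
	traffic := []recordedRequest{{"GET", "/a.jpg"}, {"GET", "/b.jpg"}, {"GET", "/c.mp4"}}
	for _, m := range diffReplay(baselineHandler(), candidateHandler(), traffic) {
		fmt.Printf("regression on %s %s: status %d -> %d\n", m.Req.Method, m.Req.Path, m.OldCode, m.NewCode)
	}
}
```

In practice, responses often contain fields that legitimately differ between runs, such as timestamps or request IDs, so a real comparison would need to normalize or ignore those before diffing.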
Quality monitoring system
Earlier we described how we ensure quality from the code quality and business validation perspectives. In practice, however, some scenarios and problems are not well covered by those approaches. Handle leaks and memory leaks, for example, take time to build up, and are not very noticeable from a purely business perspective. The quality monitoring system solves this problem for us, so we have made business quality monitoring results part of regular iteration acceptance.
In day-to-day work, we not only accept the external behavior of a service, but also check whether the business call chain is healthy, whether the service shows problems at runtime, whether business performance indicators have degraded, and so on. With these methods, as many problems as possible are detected during the testing phase, reducing the chance that problems escape to production. If business acceptance looks at quality from the user's perspective, then business quality monitoring measures it from the side of the whole system. Each has its own strengths, and both are indispensable in a quality assurance system.
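One simple way to catch slow-building problems such as leaks is to sample runtime indicators during a long test run and look for a sustained upward trend. The sketch below samples the goroutine count and live heap size using the Go `runtime` package and flags growth when the average of the last quarter of samples is well above the average of the first quarter. The names `sample`, `takeSample`, and `growing`, and the 1.5x threshold, are illustrative assumptions rather than our monitoring system's actual rules.

```go
package main

import (
	"fmt"
	"runtime"
)

// sample is one snapshot of runtime health indicators.
type sample struct {
	Goroutines int
	HeapBytes  uint64
}

// takeSample reads the current goroutine count and allocated heap bytes.
func takeSample() sample {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return sample{Goroutines: runtime.NumGoroutine(), HeapBytes: m.HeapAlloc}
}

func mean(xs []float64) float64 {
	var sum float64
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// growing reports whether the mean of the last quarter of the values exceeds
// the mean of the first quarter by more than the given factor.
// With fewer than four values there is not enough data, so it returns false.
func growing(values []float64, factor float64) bool {
	q := len(values) / 4
	if q == 0 {
		return false
	}
	return mean(values[len(values)-q:]) > mean(values[:q])*factor
}

func main() {
	// Simulate a leaky workload: each round starts goroutines that never exit
	// until the channel is closed at the end.
	block := make(chan struct{})
	var counts []float64
	for round := 0; round < 40; round++ {
		for i := 0; i < 5; i++ {
			go func() { <-block }()
		}
		counts = append(counts, float64(takeSample().Goroutines))
	}
	fmt.Println("goroutine leak suspected:", growing(counts, 1.5))
	close(block) // release the simulated leak
}
```

The same trend check can be applied to heap size or to open file handle counts collected by other means; the threshold and sampling interval would need tuning for each service.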
Popularizing and improving process specifications
Technology always has limitations at any given stage, and it is intolerable for a problem in any link to affect product quality and the final service delivered to customers. Therefore, in addition to conventional technical safeguards, we also promote and popularize process specifications to ensure effective quality control over the whole process and every link in it, such as iteration specifications, release and go-live operation specifications, and incident handling procedures.
People say
The Great Talk column is dedicated to discovering the thinking of technical people, including technical practices, practical technical content, technical insights, growth tips, and anything else worth discovering. We hope to gather the best technical people and bring out original, sharp, and contemporary voices.