When we start learning Golang, the first step is usually to write a Hello World program of about five lines. The second step is usually a simple HTTP server, rarely more than 100 lines. Then you basically jump straight to projects with thousands or even tens of thousands of lines of code, with very few people telling you how to organize your code or how to write tests. It feels like someone hands you a paddle and a canoe and says: go ahead, cross the Pacific.

In fact, material covering the middle ground does exist, but it is scattered and takes effort to find, and the sources often hold conflicting views, which makes it hard to know what to focus on. Trying to find the answer in open source projects is even more confusing, because there is almost no consistent approach.

When we first started writing Golang projects, we struggled with these two basic questions. After a lot of trial and error, we finally arrived at our own SmartX answers to both of them. In our way of organizing code, each functional layer is well isolated, which makes maintenance and testing easier. In addition, we have summarized some testing practices that give us better testing efficiency and quality.

Three ways of organizing code that we did not adopt

Before we get to the way we ended up organizing our code, let’s talk about what we tried and why they didn’t work.

Method 1: single package

Many Golang tutorials are organized this way, and some even consist of a single source file. Admittedly, for projects that don’t have much code, or that don’t require teamwork, this is efficient and doesn’t cause many problems.

However, once the codebase grows or the project involves collaboration, this organization exposes a problem: there is no way to hide a struct’s internal implementation. Unlike C++ or Java, the minimum granularity of code visibility in Golang is the package. For example, in the following C++ code, we cannot call the private function PrivateFunc from outside Foo:

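A minimal C++ sketch of the kind of code described, using the Foo and PrivateFunc names from the text:

class Foo {
 public:
  // Members of Foo may call the private function.
  void PublicFunc() { PrivateFunc(); }

 private:
  void PrivateFunc() {}
};

int main() {
  Foo foo;
  foo.PublicFunc();
  // foo.PrivateFunc();  // compile error: PrivateFunc is private within Foo
  return 0;
}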

Because C++ has class-level visibility, the internal implementation of a class can be hidden from code outside the class even when that code lives in the same source file, which achieves encapsulation. Now back to Golang. In the following example, we can call the foo struct’s internal function internalFunc as long as the caller and foo are in the same package:

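A minimal Golang sketch of the same situation, assuming a foo struct with an unexported internalFunc and a caller in the same package:

package main

// foo's internalFunc is unexported, but that only hides it from other
// packages, not from other code in the same package.
type foo struct{}

func (f *foo) internalFunc() {}

func main() {
	// The caller lives in the same package as foo, so nothing stops it
	// from reaching into the "internal" implementation.
	f := &foo{}
	f.internalFunc()
}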

This means that internal implementations cannot be hidden from other code within the same package. Why Golang dropped class-level visibility is beyond the scope of this article, but it at least means that we need to split code into packages to achieve effective encapsulation and avoid internal implementation details being depended on unexpectedly.

Method 2: Divide by MVC

MVC is a classic way of dividing code by function, and many frameworks are organized around it, such as Ruby on Rails, the archetypal web framework. In the Golang world, the popular Beego framework also adopts this approach. Once we decided to split the code into packages, the first idea was to divide it according to MVC.

In the case of Beego’s Todo example, the contents of the controllers package and the models package are roughly as follows:

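The original listing is Beego-specific; below is a simplified, framework-agnostic sketch of the same controllers/models split (the Task model, TaskController, and the myapp module path are illustrative):

// models/task.go
package models

// Task is the model shared by the whole application.
type Task struct {
	ID    int64
	Title string
	Done  bool
}

var tasks = map[int64]*Task{}

// GetTask is the public interface the controller layer uses to reach tasks.
func GetTask(id int64) (*Task, bool) {
	t, ok := tasks[id]
	return t, ok
}

// controllers/task.go
package controllers

import "myapp/models" // illustrative module path

// TaskController translates requests into calls on the models package.
type TaskController struct{}

func (c *TaskController) Get(id int64) (*models.Task, bool) {
	return models.GetTask(id)
}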


After splitting the code into packages this way, the controller layer can only access the public interface provided by the model layer, which initially achieves our goal of hiding the implementation. Of course, nothing is hidden inside each package, but at least isolation between the layers has been achieved.

There is, however, one very awkward thing about this approach. When we need to reference TaskController from outside the controllers package, the code takes the form controllers.TaskController. Golang’s official Effective Go considers package names that lead to this kind of repetition at call sites to be poor naming. The root of the problem is another distinctive feature of Golang: constants, variables, functions, structs, and interfaces imported from other packages must be referenced with the package name as a prefix. Golang does allow dot imports to remove the prefix, but that practice is unconventional and not recommended. If Golang had a selective import mechanism similar to Python’s from controllers import TaskController or Java’s import controllers.TaskController, the situation would improve a lot.

Method 3: By module

MVC divides code horizontally by functional layer; the opposite is to divide it vertically by module, which is also a common approach. If we continue with the Todo example above, then task-related controllers and models would be placed under a tasks package:

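A sketch of the same code divided by module instead, with everything task-related living in one tasks package (names are illustrative):

// tasks/task.go
package tasks

// Task is the model, now living next to its controller.
type Task struct {
	ID    int64
	Title string
	Done  bool
}

// tasks/controller.go
package tasks

// Controller is referenced from other packages as tasks.Controller
// rather than controllers.TaskController.
type Controller struct{}

func (c *Controller) Get(id int64) (*Task, bool) {
	// Storage lookup elided in this sketch.
	return nil, false
}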


When these definitions are referenced from outside the tasks package, the original controllers.TaskController becomes tasks.Controller, which looks much better. But the original models.Task also becomes tasks.Task.

In addition, another important reason for not organizing code this way is that our projects follow a microservices approach, so a project usually contains only one module. In that case, dividing by module takes you almost right back to a single package.

Better organization: by dependency

After discarding the three unsatisfactory methods above, we turned to search engines for a better answer. It was not hard to find a technical article that introduced a very good way of organizing code. Coincidentally, its author had gone through the same confusion and attempts as we had before arriving at his conclusions. For readers who cannot access the original link, we briefly introduce it here with some code adapted from the original article.

First, the root package needs to define the domain of the entire project, independent of any other packages in the project:

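For example, a minimal sketch of such a root package, assuming the project is called myapp and its domain revolves around users:

// myapp.go (root package)
package myapp

// User is a domain entity; it knows nothing about how it is stored or served.
type User struct {
	ID   int
	Name string
}

// UserService defines the domain-level operations on users. Implementations
// live in subpackages named after the external dependency they rely on.
type UserService interface {
	User(id int) (*User, error)
	CreateUser(u *User) error
}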

Next, implementation code is grouped into packages according to its external dependencies. For example, if a UserService relies on PostgreSQL as its storage backend, the implementation code can go into a postgres subpackage:

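A sketch of what that subpackage could look like, assuming database/sql and the illustrative myapp root package above:

// postgres/user_service.go
package postgres

import (
	"database/sql"

	"myapp" // illustrative import path of the root package
)

// UserService implements myapp.UserService on top of PostgreSQL.
type UserService struct {
	DB *sql.DB
}

func (s *UserService) User(id int) (*myapp.User, error) {
	var u myapp.User
	err := s.DB.QueryRow(`SELECT id, name FROM users WHERE id = $1`, id).
		Scan(&u.ID, &u.Name)
	if err != nil {
		return nil, err
	}
	return &u, nil
}

func (s *UserService) CreateUser(u *myapp.User) error {
	return s.DB.QueryRow(`INSERT INTO users (name) VALUES ($1) RETURNING id`, u.Name).
		Scan(&u.ID)
}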

From an MVC perspective, the UserService here sits at the model level. On top of it is the view-controller layer that exposes the HTTP API. As you can imagine, the view-controller layer depends on and makes use of UserService, so we won’t expand that code here. The key point is that packages are no longer named models or controllers, but are named after external dependencies instead. When UserService is referenced from outside the package, it takes the form postgres.UserService, which is concise and easy to understand.

In addition, it should be noted that this organization is not just a simple renaming: it also abstracts the domain, and that layer of abstraction provides flexibility. Imagine, for example, that you need to migrate to MongoDB as the data store: the new MongoDB-based UserService implementation is transparent to the view-controller layer above it. No matter which implementation UserService is based on, as long as it satisfies the UserService interface, it is seamlessly interchangeable for its consumers. Furthermore, this layer of abstraction also makes testing much easier, which we will expand on later in the section on testing.

This approach solves the respective problems of the three earlier methods, and it is a very good idea. In the early days, some of our projects adopted this organization directly. However, as more and more business logic was added to the service implementations, we found it necessary to make some improvements.

Improvements: Separation of business logic

As functionality grew, more and more business logic accumulated in our service implementation code: user authentication, permission validation, populating default field values, data validity checks, calls to external services, mail sending, and so on. If we then want to add an in-memory cache in front of PostgreSQL, we face a dilemma:

  • If we wrap the UserService in a caching implementation such as a UserCache, then whenever the data is already in the cache it will bypass important business logic in the underlying UserService, such as user authentication and permission validation, and return the data directly
  • Otherwise, we have to write the caching logic into the existing UserService, which reduces maintainability and testability

A brief analysis revealed that the problem was that we were mixing external dependencies (databases, mail services, etc.) with business logic. For the most part, business logic and external dependencies change independently of each other. For example, when we add a new permission check to an API, it usually doesn’t matter what database we use; adding a cache layer in front of the database usually does not affect the data validity checks. If the two were kept separate, the dilemma above would not arise.

Continuing with the example above, we can separate the database operations into a UserRepository interface, and have the postgres package implement the UserRepository interface instead:

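A sketch under the same illustrative assumptions: UserRepository joins the root package, and the postgres package now contains nothing but data access:

// myapp.go (root package)
package myapp

// UserRepository abstracts pure data access and carries no business logic.
type UserRepository interface {
	FindUser(id int) (*User, error)
	SaveUser(u *User) error
}

// postgres/repository.go
package postgres

import (
	"database/sql"

	"myapp" // illustrative import path
)

// Repository implements myapp.UserRepository; it only moves data in and
// out of PostgreSQL.
type Repository struct {
	DB *sql.DB
}

func (r *Repository) FindUser(id int) (*myapp.User, error) {
	var u myapp.User
	err := r.DB.QueryRow(`SELECT id, name FROM users WHERE id = $1`, id).
		Scan(&u.ID, &u.Name)
	if err != nil {
		return nil, err
	}
	return &u, nil
}

func (r *Repository) SaveUser(u *myapp.User) error {
	return r.DB.QueryRow(`INSERT INTO users (name) VALUES ($1) RETURNING id`, u.Name).
		Scan(&u.ID)
}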

Then we put UserService into another package, containing only pure business logic:

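A sketch, assuming the business logic lives in a server package (matching the directory structure shown later) and using a simple validity check as the example business rule:

// server/service.go
package server

import (
	"errors"

	"myapp" // illustrative import path
)

// UserService holds pure business logic and implements myapp.UserService.
// It reaches storage only through the myapp.UserRepository interface.
type UserService struct {
	Repo myapp.UserRepository
}

// CreateUser applies business rules before delegating persistence to the
// repository; here the rule is a simple data validity check.
func (s *UserService) CreateUser(u *myapp.User) error {
	if u.Name == "" {
		return errors.New("user name must not be empty")
	}
	return s.Repo.SaveUser(u)
}

// User contains no storage details at all; it simply delegates to the repository.
func (s *UserService) User(id int) (*myapp.User, error) {
	return s.Repo.FindUser(id)
}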

In other words, the original two-layer structure of model plus view-controller is further split into the three-layer structure shown in the figure below:


The API protocol layer is at the top; the figure shows RESTful API and gRPC as common examples. In this layer we parse the request, for example extracting parameters from the URL and decoding the data in the request body, then call the appropriate service interface, encode the data or error it returns, and finally send the response back over the network to the client.

The blue part in the middle is the business logic layer. This is where user authentication, permission verification, data validity checks, and calls to external interfaces (such as data access and mail sending) are usually implemented. After the abstraction and separation described above, it no longer contains concrete implementations of the external interfaces; it is only a consumer of those interfaces.

The blue part at the bottom is the implementation layer for the external interfaces. This layer holds the concrete dependencies on external services, such as databases, caches, mail services, and payment services. In our example there is only one UserRepository interface, but a real project can abstract more external interfaces as needed, as shown in gray at the bottom right of the figure.

If we want to add a cache layer for UserRepository, neither UserService nor the other UserRepository implementations need to change; we just add another UserRepository implementation that handles caching. This is the UserCache part of the diagram. As you can see, this extra layer of abstraction further isolates business logic from external dependencies, making the code more flexible and easier to maintain. Most of our Golang projects at SmartX have adopted this way of organizing code, and it has worked out well. A typical directory structure looks like this:

.
├── cmd/
│   └── myapp/
│       └── main.go
├── server/
│   ├── repository/
│   │   ├── mongo/
│   │   │   ├── repository.go
│   │   │   └── repository_test.go
│   │   ├── postgres/
│   │   │   ├── repository.go
│   │   │   └── repository_test.go
│   │   └── repositorytest/
│   │       └── tester.go
│   ├── service.go
│   └── service_test.go
├── transport/
│   ├── grpc/
│   │   ├── transport.go
│   │   └── transport_test.go
│   └── restful/
│       ├── transport.go
│       └── transport_test.go
└── service.go

In the directory structure above, we also see some test-related files and even packages. Next, let’s talk about the organization of testing.

What to test and how to test it

Before discussing how to test, let’s make a quick distinction between unit testing and integration testing.

Unit testing vs integration testing

Unit tests are usually written by the developer of the code based on an understanding of its internals, so they are white-box tests. Typically, we give the function under test a series of inputs and mocked external objects, then check how it interacts with those objects and whether its final output matches expectations. Because external dependencies are usually mocked, unit tests tend not to require configuring databases, mail servers, and the like, and they are generally easier to write and run. They can be executed at any time during development, and most code bugs can be found through unit testing.

Integration tests judge whether several modules of the system behave as expected once they are combined. Integration testing is black-box testing: it is not based on knowledge of implementation details and only exercises the externally visible public interfaces. In addition, integration tests typically do not mock anything but include all the real external dependencies. In other words, to perform integration testing you usually need to configure external dependencies such as databases and mail servers. Interacting with these dependencies is typically several orders of magnitude slower than executing a few lines of code, so integration tests are usually much more expensive to run than unit tests.

End-to-end testing, also known as E2E testing, is a special kind of integration testing. It connects all the parts of the program together and then, standing outside the whole program, tests whether its inputs and outputs meet expectations from the user’s perspective.

Our testing strategy is to use as much unit testing as possible to achieve adequate coverage, and then add integration testing and end-to-end testing to further ensure that the modules are coupled correctly. This maximizes testing efficiency without compromising testing quality.

Unit testing the API protocol layer

The main external dependencies of the API protocol layer are the domain interfaces and the specific transport protocol implementation. Domain interfaces can easily be mocked with a tool like gomock. As for the transport protocol, if it is plain HTTP, it can easily be tested using the httptest package from the Golang standard library.

In this layer’s tests, we focus on whether it handles different request inputs correctly, such as invalid URLs or incorrectly encoded bodies, and whether it translates the data or errors returned by the domain interface into the expected responses, including the HTTP response code and response body. The processing logic for the different situations can be covered by varying the inputs and the mock behavior.
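As a sketch of what such a test can look like, the following uses httptest from the standard library against a hypothetical NewHandler constructor in the restful transport package; for brevity a hand-written stub stands in for a gomock-generated mock, and the myapp types are the illustrative ones from earlier:

// transport/restful/transport_test.go
package restful

import (
	"net/http"
	"net/http/httptest"
	"testing"

	"myapp" // illustrative import path
)

// stubUserService plays the role a gomock-generated mock would normally play.
type stubUserService struct {
	user *myapp.User
	err  error
}

func (s *stubUserService) User(id int) (*myapp.User, error) { return s.user, s.err }
func (s *stubUserService) CreateUser(u *myapp.User) error   { return nil }

func TestGetUser_OK(t *testing.T) {
	// NewHandler is assumed to wire the RESTful routes around a myapp.UserService.
	h := NewHandler(&stubUserService{user: &myapp.User{ID: 42, Name: "alice"}})

	req := httptest.NewRequest(http.MethodGet, "/users/42", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)

	// The protocol layer should translate the service's response into a 200.
	if rec.Code != http.StatusOK {
		t.Fatalf("expected %d, got %d", http.StatusOK, rec.Code)
	}
}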

Unit testing the business logic layer

The main external dependencies of the business logic layer are the external interfaces it defines, such as UserRepository. We can also generate mocks for these interfaces using gomock.

In this layer’s tests, we mainly check whether internal logic such as user authentication, permission verification, and data validity checking is handled correctly, as well as how the layer interacts with the external interfaces and how it behaves in response to what those interfaces return. By varying the function inputs and the mock behavior of the interfaces, we can cover every piece of internal logic.
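A sketch along the same lines, again with a hand-written stub where a gomock-generated mock would normally be used, reusing the illustrative server.UserService from above:

// server/service_test.go
package server

import (
	"testing"

	"myapp" // illustrative import path
)

// stubUserRepository records interactions so the test can verify them.
type stubUserRepository struct {
	saved []*myapp.User
}

func (r *stubUserRepository) FindUser(id int) (*myapp.User, error) { return nil, nil }
func (r *stubUserRepository) SaveUser(u *myapp.User) error {
	r.saved = append(r.saved, u)
	return nil
}

func TestCreateUser_RejectsEmptyName(t *testing.T) {
	repo := &stubUserRepository{}
	svc := &UserService{Repo: repo}

	// The validity check should fail before the repository is ever touched.
	if err := svc.CreateUser(&myapp.User{Name: ""}); err == nil {
		t.Fatal("expected an error for an empty user name")
	}
	if len(repo.saved) != 0 {
		t.Fatalf("expected no repository calls, got %d", len(repo.saved))
	}
}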

Integration testing external dependencies

For external dependencies, we integrate them and test them as a whole. The main reason for not unit testing here is that mocking out the external dependency removes most of the point of the test. For example, if you mock out the SQL interface and unit test the PostgreSQL implementation of UserRepository, neither the validity of the SQL statements nor their actual execution gets tested. With that part excluded, the remaining UserRepository code under test is only a small portion, which does not meet the goal of testing it.

In this layer of integration testing, we focus on whether each interface behaves correctly, which is really a kind of functional testing. For example, the User function should return the existing user when given its ID, and an appropriate error when the given ID does not exist. Whatever the concrete implementation, all implementations of the same interface should behave consistently from the caller’s point of view, so they can share their integration test cases. This corresponds to the repositorytest package in the directory structure above, which contains the functional test cases shared by all UserRepository implementations:

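A sketch of what such a shared tester might look like, reusing the illustrative myapp types from earlier:

// server/repository/repositorytest/tester.go
package repositorytest

import (
	"testing"

	"myapp" // illustrative import path
)

// Test_UserRepository_User exercises behavior that every UserRepository
// implementation must share. Because it takes an extra parameter, go test
// does not run it directly; implementation packages call it explicitly.
func Test_UserRepository_User(t *testing.T, repo myapp.UserRepository) {
	u := &myapp.User{Name: "alice"}
	if err := repo.SaveUser(u); err != nil {
		t.Fatalf("SaveUser failed: %v", err)
	}

	got, err := repo.FindUser(u.ID)
	if err != nil {
		t.Fatalf("FindUser failed: %v", err)
	}
	if got == nil || got.Name != u.Name {
		t.Fatalf("FindUser returned %+v, want name %q", got, u.Name)
	}
}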

Note that the signature of the Test_UserRepository_User function does not match the signature Golang expects of a test function (it takes extra parameters), so it will not be picked up and executed as a test on its own. If you run go test against the repositorytest package, it will report “no test files”. The real tests live in each UserRepository implementation package:

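For example, the PostgreSQL test might look roughly like this (the TEST_POSTGRES_DSN environment variable and the lib/pq driver are assumptions of this sketch):

// server/repository/postgres/repository_test.go
package postgres

import (
	"database/sql"
	"os"
	"testing"

	_ "github.com/lib/pq" // PostgreSQL driver, assumed as a dependency

	"myapp/server/repository/repositorytest" // illustrative import path
)

func TestUserRepository(t *testing.T) {
	// The DSN of a disposable test database comes from the environment;
	// skip the integration test when it is not configured.
	dsn := os.Getenv("TEST_POSTGRES_DSN")
	if dsn == "" {
		t.Skip("TEST_POSTGRES_DSN not set; skipping integration test")
	}

	db, err := sql.Open("postgres", dsn)
	if err != nil {
		t.Fatal(err)
	}
	defer db.Close()

	// Run the shared functional tests against the real implementation.
	repositorytest.Test_UserRepository_User(t, &Repository{DB: db})
}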

E2E testing the entire program

The three kinds of tests above each target an isolated layer; none of them tests the program’s functionality as a whole. That is what E2E testing is for. As mentioned earlier, E2E testing exercises a program’s external interface from the user’s perspective. For a typical back-end program, the external interface is the API exposed over the network, and the user is the API caller. Therefore, E2E tests mainly test the functionality of the API.

The specific method is generally to run the server program and then test against it with an API client. For RESTful APIs there are many options for the client, including RESTful API development and debugging tools such as Insomnia, scripting languages such as Python, and of course E2E tests can also be written in Golang. For a gRPC API, you can generate a client in a suitable language from the .proto file and then write some test cases against it.
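A minimal sketch of such an E2E test written in Golang, assuming the address of an already-running server is supplied through an environment variable:

// e2e/user_test.go
package e2e

import (
	"net/http"
	"os"
	"testing"
)

// TestGetUserEndpoint calls a running server from the outside, the way a
// real API caller would.
func TestGetUserEndpoint(t *testing.T) {
	base := os.Getenv("E2E_SERVER_URL") // e.g. http://localhost:8080
	if base == "" {
		t.Skip("E2E_SERVER_URL not set; skipping E2E test")
	}

	resp, err := http.Get(base + "/users/1")
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		t.Fatalf("expected 200 OK, got %d", resp.StatusCode)
	}
}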

It is worth noting that trying to cover every corner case in E2E tests is not a bad thing in itself, but it is often laborious and time-consuming, and sometimes simply impossible. Since corner cases are more easily covered by unit tests, our requirement for E2E tests is to cover the main functional paths rather than everything. In addition, every time a bug is encountered, a test case covering that bug should be added, so that the same problem will be caught in the future.

Conclusion

Ask people what the fastest-growing programming languages of recent years are, and Golang will probably be on the list. For a language launched only in 2009, Golang has certainly grown very quickly. Today, with the backing of Google and the community, Golang has become a primary language for cloud infrastructure, and mainstream open source projects such as Docker and Kubernetes are written in it.

At SmartX, C++ and Python have always been our primary languages. C++ is the development language for our distributed storage ZBS. Using C++ gives us complete control over memory usage and allows us to perform some performance optimizations more thoroughly. Python is the development language for other parts of our hyper-converged product, including our virtualization platform Elf. When designing the new architecture last year, we identified some functional modules to add and some old ones that were not ideal and needed to be replaced. After comparing and trying some potential solutions, we finally decided to use Golang to develop these new modules.

When we started using Golang to write back-end projects, we were impressed by how simple and efficient it made back-end development, but confused about how to organize the project code. We tried the organization styles recommended by other languages and frameworks, and searched for and implemented approaches suggested by others, but without satisfactory results. In addition, there was no ready answer to the question of how to practice efficient, high-coverage automated testing in Golang. Drawing on the ideas of others, and through practice and adjustment across several of our projects, we finally formed our own SmartX approach to Golang code organization and testing. With this article, we hope to share our practices, experience, and conclusions on these two issues, and we hope it is helpful to readers who are new to Golang.

About the author

Ye Feng (@fengye87), SmartX