At ThoughtWorks, I have built software projects from scratch, setting up the basic code skeleton and continuous integration infrastructure, work that agile development often calls "iteration 0." Yet once a project has been running for a while, I always find shortcomings: either the test categories were not sorted out properly, or the basic coding framework was not well thought through.
In addition, I have worked on many existing projects, both inside and outside the company, and was rarely satisfied with their coding practices. On one project, for example, I had to consult three colleagues before I could get the application running locally. On another, I found that the naming of the Java classes corresponding to front-end requests was inconsistent: some were suffixed with Request and some with Command.
Moreover, after so many years in this line of work, I am increasingly convinced of the importance of fundamentals and systematic learning. It is true that modern technical frameworks make it possible to implement business features quickly, but when the software goes wrong, finding a solution sometimes requires pulling together knowledge from every layer of the stack.
Based on the above, I wanted to put together a public project template that covers as much of the day-to-day development needs as possible, reduces rework for developers, and captures some best practices. For back-end development I chose Spring Boot, which is widely used in the industry, and distilled a set of common, foundational practices around it. Combining my own experience with excellent practices from other projects, I wrote this article in the hope that it will benefit other developers.
This article takes a simple e-commerce order system as an example; the source code is available at:
Github.com/e-commerce-…
The main technology stack includes Spring Boot, Gradle, MySQL, JUnit 5, Rest Assured, Docker, and so on.
Step 1: Start by writing the README
A good README gives an overview of the project, helps newcomers get started quickly, and reduces communication costs. It should also be concise and clear; the recommended contents include:
- Project description: one or two sentences describing the business function of the project;
- Technology selection: the project's technology stack, including language, frameworks, middleware, etc.;
- Local build: the tool commands used during local development;
- Domain model: the core domain concepts, such as Order and Product in the example e-commerce system;
- Testing strategy: how automated tests are classified, which tests must be written and which need not be;
- Technical architecture: a technical architecture diagram;
- Deployment architecture: a deployment architecture diagram;
- External dependencies: the external systems the project depends on to run; for example, the order system depends on the membership system;
- Environment information: how to access each environment, database connections, etc.;
- Coding practices: the team's uniform coding practices, such as exception handling principles and paging encapsulation;
- FAQ: answers to frequently asked questions during development.
Note that the information in the README will change as the project evolves (for example, when a new technology stack is introduced or a new domain model is added), so it needs to be updated continuously. We all know that one pain point of software documentation is that it fails to keep pace with the project; when it comes to the README, developers should not skimp on a few keystrokes' worth of time.
In addition to keeping the README up to date, important architectural decisions can be recorded in the code base in the form of sample code, which new developers can read directly to learn the project's common practices and architectural choices quickly. See the ThoughtWorks Technology Radar.
One-click local build
To avoid the embarrassing "consult three colleagues before you can build locally" situation mentioned above, to save "lazy" programmers from manual work, and to give every developer a consistent experience, we want everything to be achievable with a single command. Here are the commands I have settled on for different scenarios:
- idea.sh: generates the IntelliJ project files and automatically opens IntelliJ;
- run.sh: starts the project locally, automatically starting the local database and listening on debug port 5005;
- local-build.sh: runs the local build; code may only be committed after the local build succeeds.
These three commands cover the needs of daily development. With them in place, the workflow for a newcomer is roughly as follows:
- Pull the code;
- Run idea.sh, which automatically opens IntelliJ;
- Write code, including business code and automated tests;
- Run run.sh and, if necessary, debug or test manually (this step is optional);
- Run local-build.sh to complete the local build;
- Pull the code again, make sure local-build.sh still succeeds, and commit the code.
In fact, the contents of these scripts are quite simple. The run.sh file, for example, is just:
#!/usr/bin/env bash
./gradlew clean bootRun
Nevertheless, these explicit commands reduce newcomers' anxiety, because all they need to know is that running these three scripts is enough to start developing. One small detail: local-build.sh could have been named simply build.sh, but then pressing Tab on the command line would auto-complete to the build directory rather than the script, hence the name local-build.sh. The detail is small, but it reflects the principle of giving developers a minimalist development experience; I call these seemingly trivial things "humanistic care" for programmers.
Directory structure
The directory structure advocated by Maven is now the de facto industry standard, and Gradle uses Maven's layout by default, which is sufficient for most projects. Besides Java code, a project contains other types of files, such as Gradle plugin configuration, tool scripts, and deployment configuration. Whatever the case, the guiding principle for the directory structure is to keep it simple and organized: do not add extra folders arbitrarily, and refactor it in a timely manner.
In the example project there are only two folders at the top level: a src folder for Java source code and project configuration, and a gradle folder for all Gradle configuration. In addition, for developers' convenience, the three common scripts mentioned above are placed directly in the root directory:
├── src             // Java source code and project configuration
├── gradle          // Gradle configuration
├── idea.sh         // Generate IntelliJ project
├── local-build.sh  // Local build
└── run.sh          // Run locally
Within the gradle folder, we deliberately place each Gradle plugin's script together with its configuration; Checkstyle, for example:
├── gradle
│   ├── checkstyle
│   │   ├── checkstyle.gradle
│   │   └── checkstyle.xml
In fact, the Checkstyle plugin looks for checkstyle.xml in the config directory at the project root by default, but that adds a redundant folder and scatters plugin-related files across different places, violating the general principle of cohesion.
Package by business
Early Java projects usually organized packages by technical layer, with a controller package, a service package, an infrastructure package, and so on sitting at the same level as the domain packages. This approach is no longer advocated by the industry; packages should primarily be organized by business. For example, the order sample project has two important domain objects, Order and Product (called aggregate roots in DDD), around which all the business revolves, so we create an order package and a product package and put the related sub-packages under each. The order package looks like this:
├── order
│   ├── OrderApplicationService.java
│   ├── OrderController.java
│   ├── OrderNotFoundException.java
│   ├── OrderRepository.java
│   ├── OrderService.java
│   └── model
│       ├── Order.java
│       ├── OrderFactory.java
│       ├── OrderId.java
│       └── ...
As you can see, OrderController and OrderRepository are placed directly under the order package, with no separate sub-packages for them. For the Order domain model, since it consists of multiple objects, they are grouped into a model sub-package based on the principle of cohesion. Even that is not mandatory: if the business is simple, we can put all the classes directly under the business package, as the product package does:
└── product
    ├── Product.java
    ├── ProductApplicationService.java
    ├── ProductController.java
    ├── ProductId.java
    └── ProductRepository.java
In day-to-day coding we always implement code around a business use case. With technical packaging we have to switch back and forth among scattered packages, which raises the cost of navigating the code; the changes in a commit are also scattered, so when browsing the commit history it is not obvious which business function a commit implements. With business packaging we only need to modify code under a single, unified package, which reduces navigation cost. Another benefit is that if we ever need to migrate a piece of business to another project (for example, when a separate microservice is split out), we can simply move the whole business package.
Of course, packaging by business does not mean all code must live under a business package. The logic is: package by business first, then create separate packages for code that does not belong to any particular business, such as util classes and common configuration. For example, we can still create a common package with sub-packages for common Spring configuration, the exception handling framework, logging, and so on:
└── common
    ├── configuration
    ├── exception
    └── logging
Automated test classification
In the current model of microservices and front-end/back-end separation, a back-end project provides pure business APIs with no UI logic, so it no longer contains heavyweight end-to-end tests such as WebDriver-based ones. At the same time, as a standalone unit that provides business functionality to the outside world, a back-end project should be tested at the API level.
There is also framework code, whether purely technical code such as Controllers or code dictated by an architectural style (such as ApplicationService in DDD practice), that on the one hand contains no business logic and on the other hand is a thin layer of abstraction (that is, relatively simple to implement), so covering it with unit tests adds little value. My view is that separate unit tests for such code are unnecessary. Furthermore, some important component code, such as Repository access or distributed locking, cannot be meaningfully verified by unit tests, while folding it into API tests would blur the classification, so we create a dedicated test type for it: component tests.
Based on the above, we can classify automated tests:
- Unit tests: for the core domain model, including domain objects (such as the Order class), Factory classes, domain service classes, and so on (a sketch follows this list);
- Component tests: for classes that are not suited to unit tests but must still be tested, such as Repository classes; some projects call these integration tests;
- API tests: simulate a client to test each API endpoint; these require starting the application.
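As an illustration of the first category, a unit test for the Order domain model might look like the following sketch. The OrderFactory.create() method, the OrderItem constructor, and totalPrice() are names assumed here for illustration only; the sample project's actual API may differ.
import org.junit.jupiter.api.Test;

import java.math.BigDecimal;
import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

// Unit-test sketch for the core domain model; no Spring context is needed.
class OrderTest {

    @Test
    void should_calculate_total_price_from_order_items() {
        // hypothetical factory method and item structure
        Order order = OrderFactory.create(Arrays.asList(
                new OrderItem("product-1", 2, BigDecimal.valueOf(20)),
                new OrderItem("product-2", 1, BigDecimal.valueOf(10))));

        // 2 * 20 + 1 * 10 = 50
        assertEquals(BigDecimal.valueOf(50), order.totalPrice());
    }
}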
Gradle only provides the src/test/java directory for tests by default. To keep the three types of tests separate (and their responsibilities clear), we can use Gradle's SourceSets to split the test code:
sourceSets {
componentTest {
compileClasspath += sourceSets.main.output + sourceSets.test.output
runtimeClasspath += sourceSets.main.output + sourceSets.test.output
}
apiTest {
compileClasspath += sourceSets.main.output + sourceSets.test.output
runtimeClasspath += sourceSets.main.output + sourceSets.test.output
}
}
At this point, the three types of tests live in the following directories:
- Unit tests: src/test/java
- Component tests: src/componentTest/java
- API tests: src/apiTest/java
Note that the API tests here focus on testing business functionality. Some projects also have contract tests, security tests, and so on; although those also go through the API technically, they are separate concerns and are best treated separately.
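As an illustration, an API test against the order endpoint might look like the following sketch using Rest Assured. The endpoint path, the expected error body, and the way the random port is wired up are assumptions for illustration; the sample project handles these details through its own test base classes.
import io.restassured.RestAssured;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.web.server.LocalServerPort;

import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

// API-test sketch: starts the whole application on a random port and calls it as a client would.
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class OrderApiTest {

    @LocalServerPort
    private int port;

    @BeforeEach
    void setUp() {
        RestAssured.port = port;
    }

    @Test
    void should_return_404_with_error_detail_when_order_not_found() {
        given()
                .when()
                .get("/order/{id}", "non-existing-id")
                .then()
                .statusCode(404)
                .body("error.code", equalTo("ORDER_NOT_FOUND"));
    }
}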
It is worth mentioning that, since component tests and API tests need to start the application and therefore need a local database, we use Gradle's docker-compose plugin (or the jib plugin), which automatically starts the required Docker containers (such as MySQL) before running the tests:
apply plugin: 'docker-compose'
dockerCompose {
useComposeFiles = ['docker/mysql/docker-compose.yml']
}
bootRun.dependsOn composeUp
componentTest.dependsOn composeUp
apiTest.dependsOn composeUp
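With the database container in place, a component test for the Repository layer can stay quite small. The following is only a sketch: OrderFixture, byId(), and the bare @SpringBootTest annotation are assumptions for illustration, and the sample project may route this through its own test base class instead.
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import static org.junit.jupiter.api.Assertions.assertTrue;

// Component-test sketch: exercises the real Repository against the MySQL container started by composeUp.
@SpringBootTest
class OrderRepositoryTest {

    @Autowired
    private OrderRepository orderRepository;

    @Test
    void should_save_and_find_order() {
        Order order = OrderFixture.anOrder(); // hypothetical test data builder

        orderRepository.save(order);

        assertTrue(orderRepository.byId(order.getId()).isPresent());
    }
}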
For more details on test classification configuration, such as JaCoCo test coverage configuration, refer to the sample project code in this article. Readers unfamiliar with Gradle can refer to my Gradle Learning series.
Log processing
Beyond the basic configuration, two points are worth considering for logging:
- Add a request identifier to logs to facilitate tracing. A single request may produce several log entries while it is processed; if every entry carries a common request ID, tracing becomes much easier. Here we can use the MDC (Mapped Diagnostic Context) feature that Logback provides natively and create a RequestIdMdcFilter:
protected void doFilterInternal(HttpServletRequest request,
HttpServletResponse response,
FilterChain filterChain)
throws ServletException, IOException {
    // the request id in the header may come from a gateway, e.g. Nginx
    String headerRequestId = request.getHeader(HEADER_X_REQUEST_ID);
    MDC.put(REQUEST_ID, isNullOrEmpty(headerRequestId) ? newUuid() : headerRequestId);
    try {
        filterChain.doFilter(request, response);
    } finally {
        clearMdc();
    }
}
- Centralize log management. In a multi-node deployment, logs are scattered across the nodes, so a tool such as the ELK stack can be used to ship them to ElasticSearch in a unified way. The sample project uses a RedisAppender to export logs to Logstash:
<appender name="REDIS" class="com.cwbase.logback.RedisAppender">
<tags>ecommerce-order-backend-${ACTIVE_PROFILE}</tags>
<host>elk.yourdomain.com</host>
<port>6379</port>
<password>whatever</password>
<key>ecommerce-ordder-log</key>
<mdc>true</mdc>
<type>redis</type>
</appender>
Of course, there are many other unified logging solutions, such as Splunk and Graylog.
Exception handling
When designing a framework for exception handling, consider the following:
- Provide a uniform exception format to clients;
- Exception information should contain enough context, preferably as structured data that clients can parse easily;
- Different types of exceptions should carry unique identifiers so that clients can identify them precisely.
Exception handling usually takes one of two forms. One is hierarchical: each specific exception has its own exception class, and all of them ultimately inherit from a common parent. The other is unitary: the whole program has only one exception class, with a field to distinguish the different exception scenarios. The advantage of the hierarchical style is that each exception's meaning is explicit; the drawback is that a poorly designed hierarchy can flood the program with exception classes. The unitary style has the advantage of simplicity, but it is less expressive.
The sample project in this article uses hierarchical exceptions, all of which inherit from an AppException:
public abstract class AppException extends RuntimeException {
private final ErrorCode code;
private final Map<String, Object> data = newHashMap();
}
Here the ErrorCode enum contains the exception's unique identifier, the HTTP status code, and the error message; the data field holds the context information of each specific exception.
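For reference, the ErrorCode enum might look like the following sketch; the constants and fields here are inferred from the description above and from the JSON example later in this section, not copied from the sample project.
// Sketch of an ErrorCode enum: a unique identifier, an HTTP status, and a default message per error.
public enum ErrorCode {
    ORDER_NOT_FOUND(404, "No order found"),
    PRODUCT_NOT_FOUND(404, "No product found"),
    VALIDATION_ERROR(400, "Request validation failed");

    private final int status;     // HTTP status code to return
    private final String message; // default human-readable message

    ErrorCode(int status, String message) {
        this.status = status;
        this.message = message;
    }

    public int getStatus() {
        return status;
    }

    public String getMessage() {
        return message;
    }
}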
In the example system, an exception is thrown when the order is not found:
public class OrderNotFoundException extends AppException {
    public OrderNotFoundException(OrderId orderId) {
        super(ErrorCode.ORDER_NOT_FOUND, ImmutableMap.of("orderId", orderId.toString()));
    }
}
When returning an exception to the client, use an ErrorDetail class to unify the exception format:
public final class ErrorDetail {
private final ErrorCode code;
private final int status;
private final String message;
private final String path;
private final Instant timestamp;
private final Map<String, Object> data = newHashMap();
}
The final data returned to the client is:
{
  requestId: "d008ef46bb4f4cf19c9081ad50df33bd",
  error: {
    code: "ORDER_NOT_FOUND",
    status: 404,
    message: "No order found",
    path: "/order",
    timestamp: 1555031270087,
    data: {
      orderId: "123456789"
    }
  }
}
As you can see, ORDER_NOT_FOUND corresponds to a fixed structure inside data. In other words, once the client sees ORDER_NOT_FOUND, it can rely on the orderId field being present in data, enabling precise, structured parsing.
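Somewhere central, AppException has to be translated into this structure. A minimal sketch of such a global handler is shown below, assuming that AppException exposes getCode() and getData() and that ErrorDetail has a matching constructor (neither is shown above); the outer requestId wrapper is also left out here.
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

import javax.servlet.http.HttpServletRequest;
import java.time.Instant;

// Sketch of a global handler converting AppException into the ErrorDetail format above.
@RestControllerAdvice
public class GlobalExceptionHandler {

    @ExceptionHandler(AppException.class)
    public ResponseEntity<ErrorDetail> handleAppException(AppException exception,
                                                          HttpServletRequest request) {
        ErrorDetail detail = new ErrorDetail(
                exception.getCode(),              // e.g. ORDER_NOT_FOUND
                exception.getCode().getStatus(),  // HTTP status defined on the ErrorCode
                exception.getCode().getMessage(),
                request.getRequestURI(),
                Instant.now(),
                exception.getData());

        return ResponseEntity.status(exception.getCode().getStatus()).body(detail);
    }
}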
Background tasks and distributed locks
Besides handling client requests immediately, a system usually has periodic routine tasks, such as regularly sending emails to users or running data reports, and sometimes requests are handled asynchronously by design. For this we need infrastructure for background tasks. Spring natively provides the TaskExecutor and TaskScheduler mechanisms, and in distributed scenarios we also need distributed locks to resolve concurrency conflicts, so we introduce ShedLock, a lightweight distributed lock framework.
The Spring task configuration is as follows:
@Configuration
@EnableAsync
@EnableScheduling
public class SchedulingConfiguration implements SchedulingConfigurer {
    @Override
    public void configureTasks(ScheduledTaskRegistrar taskRegistrar) {
        taskRegistrar.setScheduler(newScheduledThreadPool(10));
    }

    @Bean(destroyMethod = "shutdown")
    @Primary
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(5);
        executor.setQueueCapacity(10);
        executor.setTaskDecorator(new LogbackMdcTaskDecorator());
        executor.initialize();
        return executor;
    }
}
Then configure ShedLock:
@Configuration
@EnableSchedulerLock(defaultLockAtMostFor = "PT30S")
public class DistributedLockConfiguration {
    @Bean
    public LockProvider lockProvider(DataSource dataSource) {
        return new JdbcTemplateLockProvider(dataSource);
    }

    @Bean
    public DistributedLockExecutor distributedLockExecutor(LockProvider lockProvider) {
        return new DistributedLockExecutor(lockProvider);
    }
}
Implement background task processing:
@Scheduled(cron = "0 0/1 * * * ?")
@SchedulerLock(name = "scheduledTask", lockAtMostFor = THIRTY_MIN, lockAtLeastFor = ONE_MIN)
public void run() {
logger.info("Run scheduled task.");
}
To support calling a distributed lock directly from code, create a DistributedLockExecutor based on ShedLock's LockProvider:
public class DistributedLockExecutor {
    private final LockProvider lockProvider;

    public DistributedLockExecutor(LockProvider lockProvider) {
        this.lockProvider = lockProvider;
    }

    public <T> T executeWithLock(Supplier<T> supplier, LockConfiguration configuration) {
        Optional<SimpleLock> lock = lockProvider.lock(configuration);
        if (!lock.isPresent()) {
            throw new LockAlreadyOccupiedException(configuration.getName());
        }
        try {
            return supplier.get();
        } finally {
            lock.get().unlock();
        }
    }
}
When needed, call it directly in code:
public String doBusiness() {
return distributedLockExecutor.executeWithLock(() -> "Hello World.",
new LockConfiguration("key", Instant.now().plusSeconds(60)));
}
The sample project uses a JDBC-based distributed lock, but virtually any mechanism that offers atomic operations can serve as one; ShedLock also provides implementations based on Redis, ZooKeeper, and Hazelcast.
Uniform code style
Beyond unifying the code format with Checkstyle, there are common coding practices that the whole development team needs to align on, including but not limited to the following:
- The client request data classes use the same suffix, such as Command
- Data returned to the client uses the same suffix, such as Representation
- Unify the process framework for request processing, such as the traditional 3-tier architecture or DDD tactical pattern
- Provide consistent exception returns (see the “Exception Handling” section)
- Provide a uniform paging structure class (a minimal sketch follows this list)
- Clear test categories and a unified test base class (see the “Automation Test Classification” section)
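For the paging structure, something along the lines of the following sketch is usually enough; the class and field names here are illustrative, not the sample project's own.
import java.util.List;

// Sketch of a uniform paging wrapper returned by all paged endpoints.
public final class PagedResource<T> {
    private final int totalPages;      // total number of pages
    private final long totalElements;  // total number of records
    private final int page;            // current page number, starting from 1
    private final List<T> content;     // records of the current page

    public PagedResource(int totalPages, long totalElements, int page, List<T> content) {
        this.totalPages = totalPages;
        this.totalElements = totalElements;
        this.page = page;
        this.content = content;
    }

    public int getTotalPages() {
        return totalPages;
    }

    public long getTotalElements() {
        return totalElements;
    }

    public int getPage() {
        return page;
    }

    public List<T> getContent() {
        return content;
    }
}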
Static code inspection
Static code checks mainly include the following Gradle plug-ins. For details, see the sample code in this article:
- Checkstyle: Used to check code formats and standardize coding styles
- Spotbugs: The successor to Findbugs
- Dependency Check: Java class library security check provided by OWASP
- Sonar: continuous tracking and improvement of code quality
Health check
The health check is mainly used in the following scenarios:
- We want an initial check to see if the program is working properly
- Some load balancing software determines node reachability through a health check URL
For this, you can implement a simple API endpoint that is exempt from access control and publicly reachable. If it returns HTTP 200, the program is considered healthy at first glance. We can also include extra information in the response, such as the commit revision, build time, and deployment time.
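A sketch of such an endpoint implemented as a plain controller is shown below; the property names used to inject the build metadata are assumptions, since the sample project fills these values in through its own build.
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a publicly accessible health/about endpoint exposing build metadata.
@RestController
public class AboutController {

    @Value("${app.build-number:unknown}")
    private String buildNumber;

    @Value("${app.git-revision:unknown}")
    private String gitRevision;

    @GetMapping("/about")
    public Map<String, String> about() {
        Map<String, String> about = new LinkedHashMap<>();
        about.put("buildNumber", buildNumber);
        about.put("gitRevision", gitRevision);
        return about;
    }
}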
To start the sample project for this article:
./run.sh
Then access the health check API at http://localhost:8080/about; the result looks like this:
{
  requestId: "698c8d29add54e24a3d435e2c749ea00",
  buildNumber: "unknown",
  buildTime: "unknown",
  deployTime: "2019-04-11T13:05:46.901+08:00[Asia/Shanghai]",
  gitRevision: "unknown",
  gitBranch: "unknown",
  environment: "[local]"
}
In the sample project the endpoint above is implemented with a simple Controller; Spring Boot's Actuator framework can in fact provide similar functionality.
API documentation
The hard part of software documentation is not writing it but maintaining it. How many times have I followed a project document only to get the wrong answer, asked a colleague, and been told, "Oh, that's outdated"? The Swagger used in the sample project reduces the cost of maintaining API documentation to some extent, because Swagger automatically detects method parameters, return objects, and URLs in the code and generates the API documentation in real time.
Configure Swagger:
@Configuration
@EnableSwagger2
@Profile(value = {"local", "dev"})
public class SwaggerConfiguration {
    @Bean
    public Docket api() {
        return new Docket(SWAGGER_2)
                .select()
                .apis(basePackage("com.ecommerce.order"))
                .paths(any())
                .build();
    }
}
Start the project locally and visit http://localhost:8080/swagger-ui.html to see the generated documentation.
Database migration
In the traditional development model, the database is maintained by a dedicated operations team or DBA: to change the database you file a request with the DBA, describe the migration, and the DBA carries out the change. With continuous delivery and the DevOps movement, this work has gradually moved forward into the development process. That does not mean DBAs are no longer needed, but that the work can be done jointly by developers and operations. In addition, in a microservices scenario the database is contained within a single service's boundary, so based on the principle of cohesion (yes, this is the third time this article mentions cohesion, which shows how important it is in software development), changes to the database are best maintained in the code base alongside the project code.
The sample project adopts Flyway as its database migration tool. After adding the Flyway dependency, create migration scripts in the src/main/resources/db/migration directory:
resources/
├── db
│   └── migration
│       ├── V1__init.sql
│       └── V2__create_product_table.sql
Migration scripts must follow a specific naming convention so that they are executed in order. In addition, once a migration file has taken effect, do not modify it: Flyway checks each file's checksum, and if it does not match, the migration fails.
Multi-environment build
During development, software needs to be deployed to multiple environments and go through several rounds of verification before finally going live. The software may run in different configurations at different stages: for example, third-party systems may be stubbed out during local development, continuous integration may run tests against an in-memory database, and so on (a profile-based stubbing sketch follows the list below). For this, the sample project recommends the following environments:
- local: used by developers for local development;
- ci: used for continuous integration;
- dev: used by front-end developers for integration and debugging;
- qa: used by testers;
- uat: a production-like environment for functional acceptance (sometimes called a staging environment);
- prod: the production environment.
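As mentioned above, third-party systems can be stubbed out in some environments. A sketch of doing this with Spring profiles follows; MembershipClient and both implementations are hypothetical names used purely for illustration.
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

// Sketch of swapping an external integration per environment via Spring profiles.
public interface MembershipClient {
    boolean isMember(String userId);
}

@Component
@Profile({"local", "ci"})
class StubMembershipClient implements MembershipClient {
    @Override
    public boolean isMember(String userId) {
        // local development and CI never call the real membership system
        return true;
    }
}

@Component
@Profile({"dev", "qa", "uat", "prod"})
class HttpMembershipClient implements MembershipClient {
    private final RestTemplate restTemplate = new RestTemplate();

    @Override
    public boolean isMember(String userId) {
        // the URL is a placeholder for the real membership system endpoint
        Boolean active = restTemplate.getForObject(
                "http://membership-system/members/{id}/active", Boolean.class, userId);
        return Boolean.TRUE.equals(active);
    }
}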
CORS
In a system with separated front end and back end, the front end is deployed independently and sometimes under a different domain name from the back end, so cross-origin requests must be handled. The traditional approach is JSONP, but that is something of a "hack"; the common practice today is the CORS mechanism. In a Spring Boot project, CORS is enabled with the following configuration:
@Configuration
public class CorsConfiguration {
    @Bean
    public WebMvcConfigurer corsConfigurer() {
        return new WebMvcConfigurer() {
            @Override
            public void addCorsMappings(CorsRegistry registry) {
                registry.addMapping("/**");
            }
        };
    }
}
For projects using Spring Security, you also need to ensure that CORS processing happens before the Spring Security filters, and Spring Security provides configuration for exactly that:
@EnableWebSecurity
public class WebSecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http
            // by default uses a Bean by the name of corsConfigurationSource
            .cors().and()
            ...
    }

    @Bean
    CorsConfigurationSource corsConfigurationSource() {
        CorsConfiguration configuration = new CorsConfiguration();
        configuration.setAllowedOrigins(Arrays.asList("https://example.com"));
        configuration.setAllowedMethods(Arrays.asList("GET", "POST"));
        UrlBasedCorsConfigurationSource source = new UrlBasedCorsConfigurationSource();
        source.registerCorsConfiguration("/**", configuration);
        return source;
    }
}
Commonly used third-party libraries
Here are some commonly used third-party libraries that developers can introduce into their projects as needed:
- Guava: general-purpose utility library from Google
- Apache Commons: general-purpose utility libraries from Apache
- Mockito: mocking, mainly for unit tests
- DBUnit: manages database test data in tests
- Rest Assured: REST API testing
- Jackson 2: JSON serialization and deserialization
- JJWT: JWT token handling
- Lombok: generates common Java code automatically, such as equals() methods
- Feign: declarative REST client
- Tika: accurate file type detection
- iText: PDF generation, etc.
- ZXing: QR code generation
- XStream: a lighter-weight XML processing library than JAXB
Conclusion
Using a sample project, this article has walked through the many facets of setting up a back-end project at the very beginning. Most of these practices have been applied in my own projects. After reading, you may feel that much of it is very basic and simple. Indeed, none of it is difficult, but systematically putting the infrastructure of a back-end project in place is not something every development team has actually done, and that is the purpose of this article. Finally, note that the practices described here are only a reference: some aspects may not be fully thought through, and there are alternatives to the technical tools used in the sample project.
For more insights from Teng Yun, follow the ThoughtWorks Insights WeChat account.