This post was originally published on InfoQ under the title "Talking Architecture: From 0 to 200 Servers in Five Years, the Wild Growth of a Startup's Architecture," via the InfoQ public account.
Founded in 2013, Beitchat is a platform for the parent-facing work of kindergartens in China. Through Internet products and customized solutions, it helps kindergartens with pain points in parent work such as showcasing activities, sending notifications, and communication, promoting harmony between home and school. Beitchat is the only brand in its space jointly invested in by Vtron (the first A-share preschool-education listing), Tsinghua Enlightenment, and NetEase. Within just a few years the user base reached tens of millions, and DAU grew by multiples every year. Facing such rapid growth, the original technical architecture struggled to support increasingly complex business scenarios, and system availability and stability put great pressure on Beitchat's technical team. How to choose an architecture appropriate to current needs, and how to keep it evolving smoothly, is therefore worth thinking through carefully.
Beitchat's architecture evolution has passed through three important historical stages.
Birth – Technical architecture selection V1.0
- Monolithic architecture: simple and clearly layered;
- Fast development, meeting the demands of rapid product iteration;
- No complex technology, so learning and operations costs stayed low; no dedicated operations staff was needed, which saved money.
Growth stage – Technical architecture refactoring V2.0
- Distributed deployment architecture, making the systems easy to scale;
- System-level split: business functions were carved into independent subsystems, and the DB was split along the same lines;
- Preliminary servitization, using Hessian for RPC between systems;
- DBs were physically isolated so that a single database failure could not cascade across services; primary/secondary replication with read/write splitting was also introduced;
- An MQ message queue was introduced to make messages and tasks asynchronous, speeding up interface responses and improving user experience. Some message-push tasks also became asynchronous, replacing the earlier MySQL-polling mechanism, which reduced push latency and increased push speed;
- SLB was used for Nginx load balancing. In the V1.0 period Nginx was a single-point deployment: if that one server failed, many business systems were affected. With SLB balancing traffic across multiple Nginx instances, we achieved high availability and removed the single point of failure.
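The Hessian-style RPC mentioned above comes down to calling a shared Java interface through a client-side proxy, so callers depend on an interface rather than another system's code. Below is a minimal stdlib sketch of that proxy idea; the `GreetingService` interface and the in-memory "transport" are illustrative, not Beitchat's actual code (a real Hessian client would serialize the call and POST it to a service URL).

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class RpcSketch {
    // Shared interface: both systems depend on this, not on each other's code.
    interface GreetingService {
        String greet(String name);
    }

    // Server-side implementation living in another subsystem.
    static class GreetingServiceImpl implements GreetingService {
        public String greet(String name) { return "Hello, " + name; }
    }

    // Client-side proxy: Hessian builds one of these from the interface plus
    // a URL; here the "transport" just dispatches to a local target.
    static GreetingService clientProxy(GreetingService remoteTarget) {
        InvocationHandler handler = (Object p, Method m, Object[] args) -> {
            // A real Hessian proxy would serialize the call over HTTP;
            // we invoke the target directly to keep the sketch runnable.
            return m.invoke(remoteTarget, args);
        };
        return (GreetingService) Proxy.newProxyInstance(
            GreetingService.class.getClassLoader(),
            new Class<?>[] { GreetingService.class },
            handler);
    }

    public static void main(String[] args) {
        GreetingService svc = clientProxy(new GreetingServiceImpl());
        System.out.println(svc.greet("Beitchat")); // prints Hello, Beitchat
    }
}
```

The key design point is that swapping the in-memory handler for an HTTP transport changes nothing on the calling side.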
Explosive growth – microservices architecture V3.0
- Dubbo is a mature, high-performance distributed framework used by many companies; it has withstood performance testing in all respects and is relatively stable;
- It integrates seamlessly with the Spring framework, on which our architecture is built, so adopting Dubbo was non-intrusive and convenient;
- It provides service registration, discovery, routing, load balancing, service degradation, weight adjustment, and other capabilities;
- The code is open source, so we can customize it, extend its functionality, and develop on top of it ourselves as needed.
- Service-centered: everything is a service, and each service encapsulates a single business function, ensuring functional integrity and a single responsibility;
- Loose coupling: services are functionally independent and independently deployable, depending on one another only through their interfaces;
- High scalability: resources are distributed and teams work in parallel, scaling is virtually unlimited, and code reuse is higher.
- Independent functional logic is split into microservices, each deployed and maintained independently;
- All system functions are implemented by calling microservices; systems may not access the DB directly;
- The Dubbo long-connection protocol is used for small-payload, high-concurrency calls, while the Hessian protocol is used for large payloads such as files, pictures, and videos;
- Each microservice maintains a separate DB.
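The Dubbo/Hessian protocol split described above is expressed in a provider's configuration by registering both protocols and assigning one per service. The fragment below is a hypothetical sketch; the application name, interface names, ports, and registry address are illustrative, not Beitchat's actual configuration.

```xml
<!-- Illustrative Dubbo provider configuration -->
<dubbo:application name="class-feed-service" />
<dubbo:registry address="zookeeper://127.0.0.1:2181" />

<!-- Long-lived dubbo connections for small, high-concurrency calls -->
<dubbo:protocol name="dubbo" port="20880" />
<!-- HTTP-based hessian for large payloads (files, pictures, videos) -->
<dubbo:protocol name="hessian" port="8080" />

<dubbo:service interface="com.example.feed.FeedService"
               ref="feedService" protocol="dubbo" />
<dubbo:service interface="com.example.media.MediaService"
               ref="mediaService" protocol="hessian" />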
- The class-feed ("class dynamics") microservice is transparent to business callers, who simply call its interface without worrying about implementation details;
- Code reusability: class-feed business logic is isolated in an independent microservice component, so business systems no longer scatter class-feed code or copy it around;
- DRDS was used to implement database and table sharding, removing the data bottleneck and limited processing capacity of a single database. With a single database, the large data volume and high concurrency often caused performance problems and very slow interface responses. After sharding, class-feed interface performance improved severalfold, the overall user experience is good, and there are no performance problems during high-concurrency periods.
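DRDS does this routing transparently, but the underlying idea of sharding databases and tables can be sketched as a deterministic mapping from a sharding key (say, a class or feed ID) to a physical database and table. The shard counts and naming scheme below are made up for illustration.

```java
public class ShardRouter {
    private final int dbCount;
    private final int tablesPerDb;

    public ShardRouter(int dbCount, int tablesPerDb) {
        this.dbCount = dbCount;
        this.tablesPerDb = tablesPerDb;
    }

    /** Physical database index for a sharding key such as a class ID. */
    public int dbIndex(long shardKey) {
        long totalShards = (long) dbCount * tablesPerDb;
        return (int) (Math.floorMod(shardKey, totalShards) / tablesPerDb);
    }

    /** Physical table index inside that database. */
    public int tableIndex(long shardKey) {
        return (int) Math.floorMod(shardKey, (long) tablesPerDb);
    }

    /** Fully qualified physical location, e.g. feed_db_03.feed_0001 */
    public String locate(String logicalTable, long shardKey) {
        return String.format("%s_db_%02d.%s_%04d",
            logicalTable, dbIndex(shardKey), logicalTable, tableIndex(shardKey));
    }

    public static void main(String[] args) {
        // 4 databases x 8 tables each = 32 physical shards
        ShardRouter router = new ShardRouter(4, 8);
        System.out.println(router.locate("feed", 12345L));
    }
}
```

Because the mapping is a pure function of the key, every service instance routes a given row to the same shard without coordination.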
- Code reusability: previously almost every business system contained scattered, copy-pasted user logic; after splitting out the user-passport microservice, business systems simply call its interface;
- User data consistency: code that read and modified user data used to be scattered across business systems, which often produced dirty user data and made it hard to trace which system had changed it; with different developers maintaining different systems, keeping user data consistent was a real challenge. After the user-passport microservice was split out, all user-related functionality is provided by it, guaranteeing consistent interfaces for modifying and reading user data.
- User data decoupling: business systems used to join user tables directly to fetch user data, which made splitting difficult. With the microservice split, the user database is designed and deployed independently, making capacity expansion and performance optimization easier.
- For automated deployment, each project's configuration includes the project name, administrator, project members, SVN/Git address and account, the shell script for service startup, custom scripts, per-environment JVM configuration, web container configuration, and so on.
- Once a project is configured, a release request can be submitted; after approval, deployment takes one click.
- Grayscale releases are supported: specific servers can be selected for a canary rollout, keeping releases safe and stable;
- Logs generated during deployment are collected in real time, so problems during a rollout can be monitored visually.
- For release failures we have an exception-handling mechanism: across multiple servers we can either stop on first failure, meaning that if one server fails to deploy the remaining servers stop, or continue deploying regardless of failures.
- Fast rollback: on a failed release we support quickly rolling back to the last stable version.
- Development environment: used by R&D for development and debugging;
- Test environment: after all functions are developed and self-tested in the development environment, they are deployed here for QA acceptance;
- Pre-release environment: after acceptance in the test environment, functions are previewed here before going to production. It shares the same database, cache, and MQ message queues as production and is used to check a microservice for bugs before it goes live, without affecting production users, ultimately ensuring the production release succeeds;
- Production environment: the live, user-facing environment.
- The Disconf distributed configuration management platform provides unified configuration publishing. All configuration is stored in the platform, and users publish and update it there in a unified way. A configuration change requires no repackaging and no microservice restart; it is made directly in the management console. All configuration information is encrypted to prevent leaking sensitive data such as accounts and passwords.
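Disconf pushes configuration updates to running clients. The essence of "no restart needed" is that clients hold a configuration snapshot that can be swapped atomically at runtime; the stdlib sketch below shows that idea (the `HotConfig` class and the `push.batchSize` key are illustrative, not Disconf's API).

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicReference;

public class HotConfig {
    // Current configuration snapshot; readers never block on updates.
    private final AtomicReference<Properties> current =
        new AtomicReference<>(new Properties());

    public String get(String key, String fallback) {
        return current.get().getProperty(key, fallback);
    }

    /** Called when the config platform pushes a new version. */
    public void apply(Properties next) {
        current.set(next); // atomic swap: no repackage, no restart
    }

    public static void main(String[] args) {
        HotConfig cfg = new HotConfig();

        Properties v1 = new Properties();
        v1.setProperty("push.batchSize", "100");
        cfg.apply(v1);
        System.out.println(cfg.get("push.batchSize", "50")); // 100

        Properties v2 = new Properties();
        v2.setProperty("push.batchSize", "200");
        cfg.apply(v2); // takes effect immediately for subsequent reads
        System.out.println(cfg.get("push.batchSize", "50")); // 200
    }
}
```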
- The Elastic-Job distributed task scheduling platform provides a scheduled-task registry, task sharding, elastic scale-out, failover, task stop/resume, and disabling of task servers, making scheduled tasks easy to manage in a distributed architecture.
- A fully distributed deployment architecture; systems and microservice components are very easy to scale;
- Service-centered, with a comprehensive set of microservice components;
- Systems, microservice components, caches, MQ message queues, and DBs all avoid single points of failure and are all deployed for HA (high availability).
Future – Beitchat architecture evolution V4.0
- Docker container deployment. Docker is lightweight, fast to deploy, isolates applications, and runs across platforms, so microservices pair naturally with it. Although we have a microservices architecture today, we have not yet achieved fast, elastic scaling; combining microservices with Docker containers would let us add servers quickly and automatically at business peaks and reclaim them automatically off-peak. Next we will containerize our microservice components with Docker.
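Packaging a microservice component as a container image could look like the minimal Dockerfile below; the base image tag, jar name, and port are illustrative assumptions, not Beitchat's actual build.

```dockerfile
# Illustrative packaging of one microservice component
FROM openjdk:8-jre-alpine
WORKDIR /app
COPY target/class-feed-service.jar app.jar
# Example Dubbo service port
EXPOSE 20880
ENTRYPOINT ["java", "-jar", "app.jar"]
```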
- A unified API gateway. Our current core API layer is only a unified proxy; it lacks gateway functions such as identity authentication, protection against packet replay and data tampering, service authentication, and traffic and concurrency control. An API gateway will enable front-end/back-end separation; convenient monitoring, alerting, and analysis; and strict permission and rate management, keeping the API secure and stable. Next we will implement unified API gateway control;
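One of the gateway functions mentioned, concurrency control, can be sketched with a semaphore guarding each API: calls beyond the limit are rejected immediately instead of queuing up and overwhelming the backend. The limits and return values below are illustrative, not a specific gateway's API.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class ConcurrencyLimiter {
    private final Semaphore permits;

    public ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Run the call if a permit is free; otherwise reject fast
     * (a real gateway would answer with HTTP 429).
     */
    public String handle(Supplier<String> call) {
        if (!permits.tryAcquire()) {
            return "REJECTED";
        }
        try {
            return call.get();
        } finally {
            permits.release(); // free the slot for the next caller
        }
    }

    public static void main(String[] args) {
        ConcurrencyLimiter limiter = new ConcurrencyLimiter(2);
        System.out.println(limiter.handle(() -> "OK")); // under the limit
    }
}
```

Failing fast like this protects downstream microservices: excess load is shed at the edge rather than propagated inward.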
- Cross-IDC deployment. Our systems are currently deployed in a single data center, which has no redundancy or disaster recovery. We will first move to multiple rooms in the same region, gaining the ability to survive a single-room failure, and then to cross-region IDC deployment, achieving geographic redundancy, high availability, and nearest-point access for users.