Rome was not built in a day, and neither is an architecture. It is the constant cycle of requirements, refactoring, and release that makes an architecture elegant or ugly.

    I have worked in software development for 8 years. I have taken part in more than 10 projects and led the architecture design of several Internet projects, mainly e-commerce or e-commerce-related. At first I had no idea where to start; now the road is familiar. I have read some books on architecture, but the thinking, implementation process, and methods they describe differ greatly from my own architecture design process (I wonder whether the projects I have worked on are simply too low-level, haha). So I want to organize and summarize my own experience, partly to let it settle into something lasting, and partly to exchange ideas and learn together with all of you. Only by knowing your limitations can you make greater progress.

    1. What is architectural design

    “Architectural design is the product of people’s subjective mapping of elements within a structure and their relationships. Architectural design is a set of related abstract patterns that guide the design of various aspects of large software systems.” (From Baidu Baike.) The reference definition is accurate, complete, and formal, yet it still leaves the nature of architectural design hard to grasp. Put plainly, architectural design is like solving a word problem in a primary school exam, except that the problem is more complex, the design process is longer, and the workload of solving it is larger.

    2. Project quality indicators: functionality, performance and scalability

    The ultimate goal of software development is to implement abstract business logic in code, and the result can be measured along three dimensions: functionality, performance, and scalability.

    Functionality: meeting the functional goals is the basic requirement of an application. If the specified business logic cannot be implemented, the application loses its reason to exist, so delivering the product requirements is the application's primary goal.

    Performance: performance requirements sit on top of basic functionality, but few product managers or users can state them in advance, so the architect needs enough experience to identify and solve performance problems, or at least to prepare for future performance improvements. The main measures of performance are the response time of a single request, the number of concurrent requests a single instance can handle, and the maximum concurrency of the service.

    Scalability: Internet applications today are developed through rapid response and iteration. Requirements arrive, the team responds quickly and goes online as soon as possible, and the market shows whether the product is any good. This requires an architecture that can absorb new requirements and requirement changes well.

    3. Main process of architecture design

    After more than 3 years of architecture work across several projects, my own architecture design process is: problem domain determination, data modeling, module division, key process description, technology selection, code implementation, and acceptance testing.

    3.1. Determine the problem domain

    Remember getting a graded exam paper back in primary school and being upset at a red cross: “Ah, I misread the question again.” A wrong direction does more harm than a wrong method. Without the right direction, the project heads the opposite way and ends up far from the target. The requirement description from the product manager or the user is our problem domain. A description from a product manager tends to be comprehensive and full of detail, while one from a user is short and relatively vague. The requirements document of an e-commerce project, for example, can be very large, and when we receive a document of dozens or hundreds of pages (some programmers say their product manager gives them only one sentence, “Let’s copy the functions of the XXX website”, haha), we often do not know where to start. So we need to find the key problems in the complex problem domain, for example:

    A. Users can purchase XXX products on our website.

    Starting from item A, the following questions follow:

    B. User access: user login and registration

    C. Commodity source: commodity management (create, delete, update, query), etc

    D. Transaction process: order management, etc

    Starting from item D, the transaction, we can extend further:

    E. Payment by users: payment

    F. Merchant delivery: shipping address and delivery process

    Expanding the problem domain this way lets you walk through the whole process from end to end. In practice we do not enumerate every problem domain; once the key problems are determined, we can start data modeling.

    Above we analyzed the functional problem domain; the performance and scalability problem domains also need to be identified. Performance problem domains should be identified for the critical paths, such as item browsing, order creation, and order payment. For these critical paths you can state targets such as: a single instance supports 1,000 item page views per second and 100 order submissions per second.
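
    A performance target of this kind can be checked with a small measurement. The sketch below is only an illustration: browse_item is a hypothetical stand-in for a real request handler, the target constant is taken from the example figure above, and the loop is sequential, so it says nothing about concurrent load. A real check would use a load-testing tool against a deployed instance.

```python
import time

# Target taken from the example figure in the text (item page views per second).
TARGET_BROWSE_PER_SECOND = 1000


def browse_item():
    """Hypothetical stand-in for the real item-browsing handler."""
    return {"item_no": "IT0001", "name": "demo item"}


def measure_throughput(handler, requests):
    """Call handler `requests` times in sequence and return calls per second."""
    start = time.perf_counter()
    for _ in range(requests):
        handler()
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very coarse timers.
    return requests / elapsed if elapsed > 0 else float("inf")


rate = measure_throughput(browse_item, 10_000)
meets_target = rate >= TARGET_BROWSE_PER_SECOND
print(f"{rate:.0f} requests/second, meets target: {meets_target}")
```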

    Scalability is the hardest to get right, because everyone has different experience and different expectations about the future requirements of the same project. How to balance current functionality against future change, and performance against scalability, is therefore the crux of an architect's design. In my experience, the approach is to focus extension on the key problems, give priority to the performance of those key problems, and determine a minimum feature set. The advantages of a minimum feature set are that it can be implemented quickly, the accuracy of the requirements can be verified quickly, and each development cycle completes the smallest and most critical requirements. The design should also follow some established ideas and principles, such as the object-oriented design principles (1. single responsibility principle; 2. open-closed principle; 3. Liskov substitution principle; 4. dependency inversion principle; 5. interface segregation principle) and the three normal forms of database design. You can also consult friends or talk with experienced product and operations staff to get a rough idea of where extension will be needed. For e-commerce projects, likely extension requirements include flash sales, pre-orders, and group buying.

    3.2. Data Modeling

    Once the problem domain is determined, you can start answering the questions, beginning with the data model. Most applications use a relational database, so we first create data tables for the problem domain. Some projects store or persist data in NoSQL instead, in which case this is the step where we determine the entity classes of the problem domain. Each noun in the problem domain described in 3.1 becomes a data table (or entity object): user, item, order, payment record, shipping address, and shipping order.

    I prefer to build the database model in PowerDesigner: the table structure is easy to see and to modify, and the tool can generate most of the DDL SQL. Create a table structure for each noun (entity) found above, then read through the product requirements document item by item to check whether the current table structure satisfies each requirement. If it does not, add columns or new tables, and keep enriching the structure until it satisfies all of them. Existing tables will also be adjusted during modeling. Each added requirement changes the data model, so the first version is bound to be incomplete and has to be adjusted continually until all requirements are met.

    Data modeling tips:

    A. Give every table four standard columns: auto-increment ID (id), creation time (createTime), update time (updateTime), and version number (version).

    Auto-increment ID: the primary key. Queries and updates by ID are fast.

    createTime and updateTime: record the creation time and the last update time, which helps when tracking down critical faults.

    version: incremented on every edit (version++). It serves as a very convenient optimistic lock and can greatly improve database performance.
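
    The version column can be used as an optimistic lock roughly as follows. This is a minimal sketch, assuming SQLite through Python's built-in sqlite3 module purely for illustration; the item table and its stock column are hypothetical. An update succeeds only if the row still carries the version the caller read earlier.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table carrying the four standard columns described above.
    conn.execute(
        """
        CREATE TABLE item (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            name       TEXT    NOT NULL,
            stock      INTEGER NOT NULL,
            createTime TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
            updateTime TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
            version    INTEGER NOT NULL DEFAULT 0
        )
        """
    )


def update_stock(conn, item_id, new_stock, expected_version):
    """Write new_stock only if nobody changed the row since it was read."""
    cursor = conn.execute(
        "UPDATE item "
        "SET stock = ?, updateTime = CURRENT_TIMESTAMP, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_stock, item_id, expected_version),
    )
    conn.commit()
    # Zero affected rows means the version check failed: the caller should
    # re-read the row and retry or report a conflict.
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO item (name, stock) VALUES (?, ?)", ("demo item", 10))
conn.commit()

first_writer = update_stock(conn, 1, 9, expected_version=0)  # version matches
stale_writer = update_stock(conn, 1, 8, expected_version=0)  # version is now 1
```

    No row is locked while the user is editing; a conflict is detected only at write time, which is why this approach is cheap when conflicts are rare.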

    B. Do not use foreign key constraints. This runs somewhat against database design norms, but it comes from real experience: the data integrity that foreign key constraints bring is worth far less than the difficulty they add to implementing update logic. In terms of performance and scalability, the advantages of not using them outweigh the disadvantages. For Internet applications with high concurrency, heavy traffic, and large data volumes, the common ways to improve performance are speeding up database access, caching data, and splitting databases and tables (sharding) to support high concurrency. Foreign key constraints get in the way of all of these.

    C. Do not use the auto-increment id to associate tables. Not creating foreign key constraints does not mean there are no associations between tables, so tables still hold logical foreign keys, just without the constraint. When designing such a key, consider how the data will grow. The eventual data size is not known for certain, so estimate the size for the next 3-5 years from the growth rate. A better choice is to use globally unique IDs for associations between tables. There are many ways to generate globally unique IDs. One option is to have the database generate them (an Oracle sequence or a MySQL auto-increment ID). You can also compose them from the data type, time, region, and so on, or use a UUID algorithm. Any method works as long as the IDs never repeat. Which one to choose can take the product team's opinion into account, because users may see this field.
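
    The composition method and the UUID method can be sketched as follows. This is an illustration under stated assumptions: the class name, the two-character type and region codes, and the ID layout are all hypothetical, and the counter is only unique within one process, so a multi-instance deployment would need a shared sequence (for example one issued by the database).

```python
import itertools
import threading
import uuid
from datetime import datetime, timezone
from typing import Optional


class BusinessIdGenerator:
    """Builds IDs shaped like <type><region><UTC timestamp><sequence>."""

    def __init__(self, type_code: str, region_code: str) -> None:
        self._prefix = f"{type_code}{region_code}"
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_id(self, now: Optional[datetime] = None) -> str:
        moment = now or datetime.now(timezone.utc)
        with self._lock:  # keep the sequence safe across threads
            sequence = next(self._counter)
        # The sequence never repeats within this process, so IDs stay unique
        # even when many are generated within the same second.
        return f"{self._prefix}{moment:%Y%m%d%H%M%S}{sequence:08d}"


def random_id() -> str:
    """UUID-based alternative: unique without coordination, but opaque to users."""
    return uuid.uuid4().hex


order_ids = BusinessIdGenerator("OR", "01")  # hypothetical codes: order, region 01
print(order_ids.next_id())
print(random_id())
```

    The composed form is longer to build but readable, which matters if the value is shown to users as an order number; the UUID form is simpler but carries no meaning.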

    D. Is there a standard for how many tables there should be, or how many columns a table should have? When I first started designing, I often wondered whether I had too many tables or too few, whether too many would hurt performance, and whether adjusting the count was just gilding the lily. The initial design is the one most consistent with design principles and ideas, because it has not yet been affected by non-design factors such as deadlines, team division, and implementation difficulty. So we should do our best to stick to the initial design and resist the influence of those other factors.

    E. Are redundant fields necessary? My approach is not to use them and to keep the original design. If traffic really becomes heavy and query performance too low, you can separate the read service from the business system (note that this is not database read/write splitting). Once reads are separated from the business writes, the read side can be designed for convenient, fast queries, with data synchronized into it through some data extraction method. Display data tolerates a fairly high delay. Only the business system needs strict data consistency, and its reads are targeted and rarely need much associated data. Therefore the initial design should keep redundant fields, which exist only for query convenience and performance, to a minimum.
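
    One way to picture the separated read service is a job that periodically extracts rows from the normalized business tables and writes query-friendly records elsewhere. The sketch below uses in-memory objects only; the Order and Item shapes and the build_order_views function are hypothetical, and a real system would read from the business database and write to a separate read store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_no: str
    name: str
    price_cents: int


@dataclass(frozen=True)
class Order:
    order_no: str
    user_id: int
    item_no: str  # logical association by business ID, no redundant name or price
    quantity: int


def build_order_views(orders, items):
    """Extract step: join normalized rows into flat records for display queries."""
    items_by_no = {item.item_no: item for item in items}
    views = {}
    for order in orders:
        item = items_by_no[order.item_no]
        views[order.order_no] = {
            "order_no": order.order_no,
            "user_id": order.user_id,
            "item_name": item.name,  # the redundancy lives only on the read side
            "total_cents": item.price_cents * order.quantity,
        }
    return views


items = [Item("IT0001", "demo item", 1999)]
orders = [Order("OR0001", 42, "IT0001", 3)]
order_views = build_order_views(orders, items)
```

    The business tables stay free of redundant fields, and the flattened copy can lag slightly behind because it is used only for display.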

    The following is the design of some e-commerce application tables (related to commodity and order business) for reference only:

    3.3. Module division

    If the data model mainly serves the functional goals, module division is more about performance and scalability. Internet applications commonly use one of the following module division and deployment structures. The division and deployment discussed here cover only the server side of an Internet application, not the browser or app client, and only a few common structures are introduced.

    Single-instance structure:

       

    Advantages: Simple structure, easy to develop and deploy

    Disadvantages: Potential performance bottlenecks, poor scalability, and high system coupling

    Cluster structure:

       

    Note: the DB layer as a whole may be a single DB, or it may use read/write splitting, database and table sharding, or database clustering

    Advantages: Great performance (theoretically unlimited)

    Disadvantages: Load balancing is required, performance bottlenecks can still exist, and scalability is poor

    Distributed structure:

        

    A distributed structure splits different services into separate instances, groups related functions into the same instance, and has the systems communicate with each other over network protocols.

    Advantages: strong scalability, high cohesion, low coupling

    Disadvantages: complex structure, difficult transaction control, large development workload

    Hybrid structure:

        

    A hybrid structure builds on the distributed structure, with each module deployed as a cluster, so it magnifies both the advantages and the disadvantages of the distributed structure

    Advantages: strong scalability, high cohesion, low coupling, high robustness

    Disadvantages: complex structure, difficult transaction control, large development workload

    Most Internet application structures can be described by the four structures above. This is of course a simplified description: some applications add a DB cache to speed up database access, use static pages and a CDN to speed up page loads, or use a NoSQL database to store and retrieve large volumes of data, but none of this changes the structures described here. The choice of system structure can be considered from the following aspects:

    1. Whether the system can support growth over the next 2-3 years. Building a hybrid-structure system takes far more work than building a single-instance one. For a start-up, getting online is the most pressing need, so when speed is what counts, elegance has to be given up. If the company already has some scale and the application is, or will become, its core business, a highly scalable hybrid structure is the better choice for keeping up with a fast-growing business.

    2. Human factors. A hybrid structure means more work than a single-instance structure and demands a higher overall technical level from the team, so we should act within our means.

    3. Time factor. At many Internet companies the schedule is not set by technical estimation but by the “market”, so when things are urgent, do not dwell on performance and scalability: go online first and improve afterwards.

    4. Architecture is not static. Growing traffic and changing requirements change the architecture of the system. Quick response and constant iteration are how Internet development works. The initial architecture should therefore aim for high cohesion and low coupling as far as possible, so that its shortcomings can be fixed continually and the system structure improved step by step.

    Module description for an e-commerce system with a distributed structure:

       

    Display layer:

    Front end: the customer-facing display layer. Depends on the user, commodity, transaction, and payment services

    Back office: the display layer for operations staff, where they manage all kinds of data. Depends on the user, commodity, transaction, and payment services

    Service layer: provides RPC services to the display layer

    User services: registration, login, user management, etc

    Commodity services: commodity browsing, inventory display, commodity management, inventory management and so on

    Transaction services: shopping cart service, order calculation, order submission, order list, etc

    Payment service: generating payment links, redirecting after a successful payment, processing successful payments, etc

    Basic components:

    DB: data store

    Redis: used to implement the user session mechanism, which makes future cluster deployment easier. It also implements the shopping cart: carts are persisted on the server side, so users can manage the same cart across browsers.

    Third-party payment component: Used to interact with third-party payment services
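
    The server-side shopping cart mentioned under Redis can be sketched as below. To keep the example runnable without a Redis server, it uses a plain in-memory class as a stand-in for a hash-style key-value store; InMemoryHashStore, CartService, and the cart:<user id> key layout are all hypothetical names, not a real Redis client API.

```python
class InMemoryHashStore:
    """In-memory stand-in for a shared hash-style key-value store."""

    def __init__(self):
        self._hashes = {}

    def increment(self, key, field, amount):
        fields = self._hashes.setdefault(key, {})
        fields[field] = fields.get(field, 0) + amount
        return fields[field]

    def delete_field(self, key, field):
        self._hashes.get(key, {}).pop(field, None)

    def get_all(self, key):
        return dict(self._hashes.get(key, {}))


class CartService:
    """Keeps each cart under the user ID, not in the browser."""

    def __init__(self, store):
        self._store = store

    @staticmethod
    def _key(user_id):
        return f"cart:{user_id}"

    def add(self, user_id, item_no, quantity=1):
        return self._store.increment(self._key(user_id), item_no, quantity)

    def remove(self, user_id, item_no):
        self._store.delete_field(self._key(user_id), item_no)

    def items(self, user_id):
        return self._store.get_all(self._key(user_id))


# Two service instances sharing one store, as two servers in a cluster would.
shared_store = InMemoryHashStore()
instance_a = CartService(shared_store)
instance_b = CartService(shared_store)
instance_a.add(42, "IT0001", 2)
instance_b.add(42, "IT0001", 1)
print(instance_a.items(42))
```

    Because the cart state lives in the shared store rather than in any one application instance or browser, any instance can serve the next request from the same user.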

    3.4. Description of key processes

    Key process descriptions are needed to check whether the system architecture meets the requirements and to guide development. A key process description uses a flowchart to show how a key problem is solved. Its readers are the rest of the team and the architect, so the format does not matter as long as others can understand it.

    Some lessons on writing them:

    1. Cover the process from beginning to end. The flow should run from the user entering the application to the user leaving it. The transaction process, for example, runs from the user browsing commodities to the user paying successfully.

    2. Highlight the key points in the flowchart. The transaction process mentioned above, for example, should highlight the decision points related to the transaction, and there is no need to describe user registration or password retrieval.

    3. Keep it brief and avoid too much detail. For example, the transaction process needs to update the inventory, but the chart does not need to describe the inventory check that precedes the update. That is part of the internal implementation of order submission.

    The following is an example:

       

    3.5. Technology selection

    1. Unless there is a real need, use common technologies and frameworks. They have many users, so you will run into fewer non-business problems. In one project I took part in, a module was handled by a colleague who was familiar with Python. Shortly after the system launched, that colleague had to leave for personal reasons, nobody else could take over, and the module had to be developed again in another language.

    2. Familiar beats powerful. Try to use technologies the team already knows or that someone on the team can mentor others in. Five years ago I built a workflow system for requirements and bug tracking for a client and made a hasty decision to use JBPM. Nobody on the team had studied it, so I spent a lot of time on the framework and still did not use it well, and the project stumbled along as a result.

    3.6. Code implementation

    Code is meant to be read by humans first and machines second, so a good code structure is the best medicine for a project to survive longer.

    1. Code layering: follow the single responsibility principle. MVC is a basic pattern for Internet applications. By function it divides into the V (view) display layer, the C (controller) control layer, and the M (model) business layer. The business layering in Java is as follows:

       

    From the top down:

    Controller: the page control layer, used for converting page input and output parameters and for page navigation

    Vo: anemic entity objects used to pass data between the page and business layers

    Bo: the business implementation layer

    Dao: the data access layer, which handles interaction with the database

    Po: anemic entity objects used to pass data between the DAO and the database. Their fields correspond to the columns of the database tables
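
    The layering can be sketched in a few lines. The article's context is Java, but to keep the examples in one language this is a Python illustration of the same Controller, Vo, Bo, Dao, Po split; the Item classes, the price formatting, and the in-memory rows that stand in for a database are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ItemPo:
    """Po: mirrors the columns of the (hypothetical) item table."""
    id: int
    name: str
    price_cents: int


@dataclass
class ItemVo:
    """Vo: what the page layer receives, already shaped for display."""
    name: str
    price: str


class ItemDao:
    """Dao: the only layer that touches storage (a dict stands in for the DB)."""

    def __init__(self):
        self._rows = {1: ItemPo(1, "demo item", 1999)}

    def find_by_id(self, item_id):
        return self._rows.get(item_id)


class ItemBo:
    """Bo: business logic, converts Po to Vo."""

    def __init__(self, dao):
        self._dao = dao

    def get_item(self, item_id):
        po = self._dao.find_by_id(item_id)
        if po is None:
            raise LookupError(f"item {item_id} not found")
        price = f"{po.price_cents // 100}.{po.price_cents % 100:02d}"
        return ItemVo(name=po.name, price=price)


class ItemController:
    """Controller: converts request parameters in and response data out."""

    def __init__(self, bo):
        self._bo = bo

    def show(self, params):
        item_id = int(params["id"])  # input parameter conversion
        try:
            vo = self._bo.get_item(item_id)
        except LookupError:
            return {"status": 404, "body": None}
        return {"status": 200, "body": {"name": vo.name, "price": vo.price}}


controller = ItemController(ItemBo(ItemDao()))
print(controller.show({"id": "1"}))
```

    Each class has one reason to change: a new table column touches the Po and Dao, a new pricing rule touches the Bo, and a new page parameter touches the Controller.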

    2. Naming conventions

    Classes, methods, variables, parameters, and so on should be named in one unified style throughout the code: either English or Chinese pinyin, and either camel case or underscores, following common industry style as far as possible. Classes, variables, and parameters use nouns; methods use verbs.

    3. Comment style

    Adopt a uniform comment style, generally with line comments inside methods and block comments elsewhere.

    3.7. Acceptance test

    Actively cooperate with the test team in testing the project. Testers are there to safeguard a healthy launch, not to find fault with the project.

    4. Summary

    1. There are many solutions to the same problem, and what we do is just one of them. Therefore, we should accept others’ doubts and suggestions, so as to improve the system.

    2. There is no silver bullet. No single method solves every problem, so the appropriate method has to be chosen according to the requirements, the team, the time available, and so on.

    Postscript: I constantly doubt myself, so by the end of the article I am not sure whether what I have written is valuable, or how valuable. It is a summary of my own experience. If someone actually takes the time to read it and gets something out of it, that is the greatest encouragement for me to have written this article.

    When reposting, please credit: Sun Haojie’s blog » Internet project architecture experience sharing