Preface

In an interview some time ago, the interviewer saw that I had worked on Python crawlers and back-end code and asked me a back-end question: what do you think a back end is?

A trick question. My mind went blank at the time, and I answered: logic processing, plus creating, reading, updating, and deleting data (CRUD)…

Redis
Elasticsearch
DNS

In this article, I’ll try to summarize what a front-end developer needs to know as an introduction to back-end architecture.

Whatever your motivation, there’s something in the system that you want to know or learn:

  • How do storage and services fit together?
  • When (or why) do I need this?
  • What’s the path to full stack?
  • Mainstream framework selection for each technology

Contents of this article:

  1. Web / Application Servers
  2. Load Balancer
  3. Domain name resolution system, DNS
  4. HTTPS/SSL certificate
  5. Database
  6. Blob/file storage
  7. Content Delivery Network (CDN)
  8. Caching Service
  9. Message queue

1. Web / Application Servers

  • Web servers: serve content to the Web over the HTTP protocol.
  • Application servers: host and expose business logic and processes.

1.1 Server-side language

  • For example: Node.js, Python, PHP, Java, C#, or Ruby.
  • Each language has its own “Web framework” (such as Spring for Java, Rails for Ruby, ASP.NET MVC for C#, or Express for Node.js).
  • These frameworks enable developers to write less code to handle data requests.

1.2 Back-end language selection

In fact, every back-end language has its own strengths and its own supporters. Which language is best for getting started with the back end has always been an open question. To give you a rough idea of each language, here is a list of the most frequently mentioned features, common criticisms, and well-known websites built with each:

PHP:

  • It has a large user base and is one of the most widely used back-end languages.
  • Easy to learn, but criticized for some dated design decisions.
  • Example sites: Facebook, WordPress, Sina Weibo.

Java:

  • A long-established language. Steady job demand at home and abroad, and widely used.
  • Development is relatively slow, and it is less suitable for beginners.
  • Example sites: LinkedIn, Amazon, and Taobao.

Ruby:

  • Fast to develop with; many bootcamps at home and abroad teach the back end in this language.
  • Whether it is suitable for beginners is debated.
  • Example sites: Airbnb, Twitter.

Python:

  • The syntax is easy to learn, and it is widely used for data analysis and data exploration.
  • Pure Python code runs relatively slowly.
  • Example sites: Instagram, Reddit, and Zhihu.

JavaScript (Node.js):

  • JavaScript can be used on both the front end and the back end, and it performs well under high concurrency.
  • Not suitable for CPU-intensive applications.
  • A popular choice for start-ups.
  • Example sites: Yahoo, Walmart.

Go:

  • Backed by Google, with a very complete standard library and performance close to the C family of languages.
  • Learning resources are still relatively scarce (thanks to Bilibili, which has some great ones).
  • Example sites: Google, YouTube, Bilibili, Toutiao, Tencent Cloud.

1.2 Web Server

A dedicated web server (such as Apache HTTP Server or Nginx) is typically used for these reasons:

  • It can handle some requests quickly, such as redirects or a 404 status page, without going through back-end code.
  • Serving static content stored on the web server’s file system (such as images, CSS, and JS) is faster than serving it through back-end code.
  • Some server-side languages (e.g. PHP) have no built-in production-grade web server, so they have to be started through a dedicated web server process.

At this point, the question arises: What is the difference between Apache, Nginx, Tomcat, and Node.js?

Reference: Apache, node.js, nginx, tomcat

They are the same kind of thing, and yet not quite the same thing.

The Web server

  • Tomcat works only with Java, and Node.js only with JavaScript.
  • Apache works well with other languages (mostly PHP), but needs different modules to do so.
  • Nginx forwards requests to other ports, so both Apache and Nginx can be used with a variety of programming languages.
  • Nginx and Apache are pure web servers. On their own they cannot execute dynamic language files such as PHP or JS files.
  • Tomcat and Node.js can run these languages and provide application services. Acting as a web server is more of an added feature for them.

1.3 Web Server Form (Carrier)

The Web server computer on which these tools and back-end projects are installed can itself take the following forms:

  • A physical machine
  • A virtual private server, usually called a VPS (for example from Huawei Cloud or Alibaba Cloud)

A VPS is actually one physical server divided into several parts, each of which is sold and used as a separate server. Each VPS behaves like a relatively independent machine that can run multiple web applications (websites, software, and so on), with each user owning part of the host’s resources.

  • Hosted virtual machine instances (for example AWS EC2, Google Compute Engine)
  • Platform as a Service (PaaS) hosting from cloud service providers (e.g. Heroku, AWS Elastic Beanstalk)

Roughly speaking, a VPS is based on software-layer virtualization, that is, virtualization at the operating system level. A VM is based on hardware-layer virtualization, with the VM host built using a product such as VMware Server.

1.4 Docker, virtual machine and physical machine

What’s the difference between a Docker container and a virtual machine?

To keep it simple, let’s use an analogy:

1. The physical machine looks like this:

2. The VM looks like this:

3. Docker looks like this:

2. Load Balancer

Load balancing is a key component of highly available network infrastructure. We usually deploy more than one application server and use a load balancer to distribute user requests across them, which improves the performance and reliability of websites, applications, databases, or other services.

Load balancer models typically fall into two categories: Layer 4 (transport) and Layer 7 (application).

Layer 4 (transport layer):

  • Operates on data in network and transport layer protocols (IP, TCP, UDP).
  • Does not understand HTTP, so it also suits other TCP applications, such as ERP systems built on a client/server (C/S) architecture.

Layer 7 (application layer):

  • Distributes requests based on data in application-layer protocols such as HTTP.
  • Understands HTTP, so it is mainly used for websites and internal information platforms built on a browser/server (B/S) architecture.

Load balancers are classified into hardware load balancers and software load balancers.

  • Hardware load balancers: dedicated appliances, such as F5 load balancers.
  • Software load balancers: such as LVS (which works at layer 4), Nginx, and HAProxy (which are commonly used at layer 7).

Both types of load balancers receive requests and distribute them to specific servers based on configured algorithms. Some industry standard algorithms are:

  • Round robin (RR)
  • Weighted round robin (WRR)
  • Least connections
  • Least response time
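
To make the first three algorithms concrete, here is a minimal Python sketch. The `Server` class, the server names, and the weights are all made up for illustration, and the weighted variant is the naive one (it simply repeats each server `weight` times per rotation), not how any particular load balancer implements it.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1               # used by weighted round robin
    active_connections: int = 0   # used by least connections


def round_robin(servers):
    """Yield servers in a fixed rotating order."""
    return itertools.cycle(servers)


def weighted_round_robin(servers):
    """Yield each server `weight` times per rotation (naive variant)."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return itertools.cycle(expanded)


def least_connections(servers):
    """Pick the server currently handling the fewest connections."""
    return min(servers, key=lambda s: s.active_connections)


# Hypothetical pool: app-1 is three times as powerful as app-2.
servers = [Server("app-1", weight=3), Server("app-2", weight=1)]
```

With this pool, round robin alternates between the two servers, weighted round robin sends three requests to app-1 for every one sent to app-2, and least connections picks whichever server is least busy at that moment.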

Using a load balancer in a Web application has two main benefits:

  • It helps maintain consistent response times by ensuring that no single web server is overwhelmed by all the requests, which would make each request slow to process.
  • It maintains high availability. If a server crashes, subsequent client requests can still succeed because they are routed to a healthy server, and the user does not notice any problem.

3. Domain name resolution system, DNS

When a user enters a URL in the address bar, the browser takes the domain part of the URL (for example, www.google.com) and queries DNS. DNS resolves the domain to the server’s IP address (for example, 172.217.23.4) and returns it to the browser. Once the browser has an IP address, it can send the actual request for the web page.
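
The two steps (take the domain out of the URL, then ask DNS for an address) can be sketched with Python’s standard library. The function names `extract_host` and `resolve` are my own, and `resolve` simply asks the operating system’s resolver, which is a simplification of what a browser really does.

```python
import socket
from urllib.parse import urlparse


def extract_host(url: str) -> str:
    """Return the domain part of a URL, the part that is sent to DNS."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host


def resolve(host: str) -> list[str]:
    """Ask the system resolver for the IP addresses behind a host name."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first field is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve(extract_host("https://www.google.com/"))` needs network access, and the addresses it returns vary by location and over time.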

  • If your Web application uses a load balancer, configure the domain name to point to the domain name or IP address of the load balancer.
  • If you are not using a load balancer, you can point the domain name directly to the domain name/IP address of the application server.

Most Internet domain name registration services (e.g. GoDaddy, Wanwang, etc.) provide a DNS administrative console. These allow you to configure domain names (and subdomains) to point to the location of the application.

If you wish, you can also move your DNS hosting to a cloud provider such as Alibaba Cloud or Tencent Cloud and manage it from there. This has the benefit of keeping all the application environment configuration in one place and making it easier to automate.

4. HTTPS / SSL certificate

If you’re building a Web application (or static website), you need to provide services over HTTPS to ensure secure communication between the user and the server. Using HTTPS now also has SEO benefits, so there’s no reason not to use it.

This means that SSL certificates need to be installed on the back end. Specifically, they need to be installed on whichever server is the first point of contact for client requests. This usually means load balancers and CDN servers, but if you’re not using a load balancer, it could also be an application server.

  • You can use Let’s Encrypt to generate certificates for free.
  • If you are using cloud infrastructure, you can use a managed service, for example AWS Certificate Manager. This allows you to create and automatically renew SSL certificates and distribute them to application servers, load balancers, and CDN servers.
  • Only certificates issued by certificate authorities that browsers trust (in practice, the medium to large ones) are recognized. Otherwise the site is shown as insecure and the certificate has to be trusted manually.

Currently, SSL certificates are classified into three types based on the validation level:

  • Domain Validation SSL certificate, DV SSL for short
  • Organization Validation (enterprise) SSL certificate, OV SSL for short
  • Extended Validation SSL certificate, EV SSL for short

They each have a different level of validation and suit different types and sizes of sites.

5. Database

Almost all Web applications need to keep data somewhere. In most cases, somewhere is some form of database. The main job of a database is to reliably store data in permanent storage and allow retrieval of data by query. It can also enforce some rule constraints around the data structures it stores.

5.1 Types of Databases

Three database models were popular in the early days: hierarchical, network, and relational databases.

On today’s Internet, the two most commonly used database models are relational (SQL) databases and non-relational (NoSQL) databases.

  • Relational databases (e.g. MySQL, Postgres, SQL Server, Oracle, SQLite) have been around for more than 40 years and remain the backbone of most web applications.
  • In the last decade or so, NoSQL databases (such as MongoDB, Cassandra, CouchDB, DynamoDB) have become increasingly common in Web applications, mainly because of their scalability advantages and flexibility in data structures.

5.2 Database Deployment

You can host a database on one server, but it is more common in production scenarios to host it on some form of cluster of two or more servers. This ensures that the database is highly available and reduces the risk of data loss, for example, if a server’s storage becomes corrupted.

In recent years, a handful of cloud-hosted “serverless databases” have become available. These are databases that can be called via an API, but you don’t need to set up a server to host them. The cloud provider does this for you behind the scenes and also handles things like automatic backups. Examples include DynamoDB (NoSQL), Firebase Realtime Database (NoSQL), and Aurora Serverless (relational).

5.3 Basic Database Solution

Source: Architecture Design “Database From active/Standby to Active High Availability Solution”

Whether the underlying store is a relational database or a NoSQL database, and whether it is MySQL, Redis, or MongoDB, the architecture design options are essentially the same.

There are three basic architectures for database servers:

  • One master and one standby architecture (master/standby)
  • One master and one slave architecture (master-slave)
  • Dual-master architecture (master/master)

1. One master and one standby architecture (master and standby)

The active/standby architecture is the simplest type of two-node deployment. Almost all database systems in the market have the active/standby function.

The idea is also remarkably simple:

  • The database is deployed on two machines. One machine (call it A) provides the day-to-day data read and write services and is called the “host”.
  • The other machine (call it B) does not provide online services, but synchronizes data from the “host” in real time, and is called the “standby machine”.
  • Once the “host” fails, it is manually taken offline and the “standby” is promoted to “host” so that service can continue.

The pros and cons of this architecture are clear. The pros are that it requires very little development, supports a wide range of databases, is easy to deploy and maintain, and introduces no additional system complexity or bottlenecks.

The drawback is that when the “host” fails, manual intervention is needed. That is hard on the operations staff, and the response may not be timely. Another disadvantage is that the active/standby architecture wastes resources: a “standby” machine with the same configuration as the “host” has to be kept on standby for a long time without serving any online traffic.

To solve this resource waste problem, we had to think of a way to use the “standby machine” as well: a master-slave architecture.

2. One master and one slave architecture (master-slave)

The master-slave architecture is roughly the same as the master/standby architecture described above. The difference is that the “standby machine” in a master/standby setup usually does no work and mainly serves as a backup. In a master-slave setup the “standby machine” becomes a “slave machine” that also provides services, working alongside the “host” at all times.

  1. The “slave” in the master-slave architecture also provides services at all times, but it only provides “read” services, not “write” services.
  2. The “host” synchronizes online data to the “slave” in real time so that the “slave” can serve reads normally.
  3. Compared with master/standby, this architecture saves resources, because the “slave” is also serving traffic and is not wasted. And when the “host” fails, the “slave” can at least keep serving “read” operations until someone intervenes. Since most businesses read far more than they write, stability goes up a level.
  4. The disadvantage is that the architecture is a little more complex. Both “host” and “slave” serve reads, so the business system in front of them needs some strategy to decide where to route each read. There is also the problem of delay: synchronizing data from the “host” to the “slave” inevitably takes some time, which may affect services that need very fresh data.
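
The routing strategy mentioned in point 4 can be sketched like this. It is a minimal illustration, not how any specific database driver works: the `ReadWriteRouter` class, the node names, and the `require_fresh` flag are all hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the host and spread reads across slaves (hypothetical helper)."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, host, slaves):
        self.host = host
        # Rotate through the slaves; fall back to the host if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql: str, require_fresh: bool = False) -> str:
        """Return the name of the node that should run this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS or require_fresh or self._slaves is None:
            return self.host
        return next(self._slaves)


router = ReadWriteRouter("db-host", ["db-slave-1", "db-slave-2"])
```

The `require_fresh` flag is one simple way to deal with replication delay: reads that cannot tolerate stale data are sent to the host even though they are reads.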

3. Dual-master architecture (master/master)

A dual-master architecture is one in which two machines are each other’s host and slave. Both machines provide full read and write services, so there is no need to switch roles. The client picks one at random when it makes a call, and when one goes down, the other continues to serve.

  • One of the complexities of the dual-master architecture is that both hosts accept writes, so the latest data has to be synchronized to the other side in real time, which requires bidirectional replication between the two hosts.
  • However, bidirectional replication inevitably brings some data delay, and in extreme cases even data loss.
  • In practice, some business data has very high consistency requirements and cannot tolerate delay or loss. Such services, for example financial services, are not suitable for the dual-master mode.
  • But most scenarios in Internet businesses do not have such strict requirements, so this mode is used quite often for general cases.

As for database cluster schemes, I don’t understand them well enough yet, so I won’t cover them here…

6. Blob/file storage

While databases are usually used to store dynamic data (for example, data generated by end users or API clients), some categories of data (unstructured data) are not changed by users or are file-based and not suitable for database storage, such as:

  • Front-end website resources, such as images, JavaScript, CSS, fonts, audio, and video files.
  • Files uploaded by users through forms.

Cloud service providers do not store these files in databases. Instead they provide dedicated storage services, such as AWS Simple Storage Service (S3), Azure Blob Storage, Google Cloud Storage, and Alibaba Cloud OSS.

The benefit of this is that the cloud provider can store the files securely and make redundant copies of them to minimize the risk of data loss.

6.1 About Blob Storage:

Blob storage is used for:

  • Serving images or documents directly to the browser.
  • Storing files for distributed access.
  • Streaming video and audio.
  • Writing to log files.
  • Storing data for backup and restore, disaster recovery, and archiving.
  • Storing data for analysis by on-premises or Azure-hosted services.

7. Content Delivery Network (CDN)

The blob/file storage service allows clients to access files through HTTP endpoints. For example, the HTML for your web application can simply link to the URLs of image and CSS files stored in AWS S3. Traditional network access:

But let’s say my user is in China and my S3 storage is in the western United States. The data has to travel thousands of miles, so my user will see delays.

What is a CDN? What are the advantages of using CDN?

  • A CDN is a service provided by a cloud provider that has “edge servers” distributed around the world.
  • These edge servers get copies of files from the “origin” (for example, the blob/file storage location). Your front-end web application then points to the CDN URL of a static asset, rather than to its blob store URL.
  • Instead of thousands of miles round trip, the distance between the client and the “edge” is now much less, and file retrieval is therefore faster.

Site access using CDN:

7.1 CDN workflow

The authoritative DNS server selects the optimal node for each client, and caching at the edge reduces the load on the origin site.

8. Caching Service

While a CDN is a form of caching for static files, a Web application may need to temporarily cache dynamic data.

For example, suppose you have a database query that performs calculations on yesterday’s data, and the results are frequently accessed by thousands of users every day. It makes no sense to contact the database every time a user requests this data.

The solution to this is to use a caching service to store the results for a period of time after the first user request. Subsequent requests for that data are provided more quickly through caching.

A cache service is essentially a special type of database. The cache takes the form of a key-value store, where the key is a string that the application code uses to query the data (for example, dailysitestatS_2018-10-17) and the value is the actual data cached. Cached data is usually kept entirely in memory, which makes retrieving data from the cache very fast.
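
The idea of “check the cache first, fall back to the database, then store the result for a while” can be sketched in a few lines of Python. This is an in-memory stand-in for a real caching service, and everything in it is an assumption for illustration: the `TTLCache` class, the `get_daily_stats` helper, the key format, and the one-hour expiry.

```python
import time


class TTLCache:
    """Tiny in-memory key-value cache with per-entry expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave as a cache miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_daily_stats(cache, day, run_query):
    """Return stats for `day`, running the expensive query only on a cache miss."""
    key = f"daily_site_stats_{day}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = run_query(day)
    cache.set(key, value, ttl_seconds=3600)
    return value
```

The first request for a given day pays for the database query; requests within the following hour are answered from memory.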

Common caching services are Redis and Memcached. AWS offers managed versions of both through its ElastiCache service.

8.1 Redis and Memcached compared

Redis and Memcached are both major open source in-memory data stores. While they are both easy to use and provide high performance, there are important differences to consider when choosing an engine. While Memcached is designed for simplicity, Redis provides a wealth of functionality that allows it to be used in a wide variety of use cases.

Feature                                  Memcached   Redis
Sub-millisecond latency                  Yes         Yes
Developer ease of use                    Yes         Yes
Data partitioning                        Yes         Yes
Support for many programming languages   Yes         Yes
Advanced data structures                 -           Yes
Multithreaded architecture               Yes         -
Snapshots                                -           Yes
Replication                              -           Yes
Publish/subscribe                        -           Yes
Lua scripting                            -           Yes
Geospatial support                       -           Yes

Sub-millisecond latency:

Both Redis and Memcached support submillisecond response times. By storing data in memory, they can read data more quickly than disk-based databases.

Usability for developers:

Both Redis and Memcached are syntactically easy to use and require minimal code to integrate into your application.

Data partition:

Both Redis and Memcached allow you to distribute data between multiple nodes. This allows you to scale out to better handle more data as requirements grow.

Support for a wide range of programming languages:

Both Redis and Memcached have a number of open source clients for developers. Supported languages include Java, Python, PHP, C, C ++, C #, JavaScript, Node.js, Ruby, Go, and more.

High-level data structures:

In addition to strings, Redis also supports lists, sets, sorted sets, hashes, bit arrays, and more. Applications can use these more advanced data structures to support a variety of use cases. For example, you can easily implement a game leaderboard using a Redis sorted set, which keeps a list of players sorted by their rank.
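
To show what a sorted set buys you for a leaderboard, here is a plain-Python stand-in. It does not talk to Redis; the `Leaderboard` class is hypothetical and only mimics the behavior of adding a member with a score (comparable to the Redis `ZADD` command) and reading the highest-scored members (comparable to `ZREVRANGE`). Unlike Redis, it re-sorts on every read, and its handling of tied scores is my own choice.

```python
class Leaderboard:
    """In-memory imitation of a sorted set keyed by player name."""

    def __init__(self):
        self._scores = {}

    def add(self, player, score):
        """Insert or update a player's score (comparable to ZADD)."""
        self._scores[player] = score

    def top(self, n):
        """Return the n highest-scored (player, score) pairs, best first."""
        ranked = sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


board = Leaderboard()
board.add("alice", 120)
board.add("bob", 300)
board.add("carol", 210)
board.add("alice", 250)  # updating a score moves the player in the ranking
```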

Multi-threaded architecture:

Because Memcached is multithreaded, it can use multiple processing cores. This means that you can expand the computing capacity to handle more operations.

Snapshots:

With Redis, you can persist data to disk using a point-in-time snapshot that can be used for archiving or recovery.

Replication:

Redis allows you to create multiple replicas of a Redis primary. This allows you to scale database reads and build highly available clusters.

Publish/subscribe:

Redis supports Pub/Sub messaging with pattern matching, which you can use for high-performance chat rooms, live comment streams, social media feeds, and communication between servers.

Lua scripting:

Redis allows you to execute transactional Lua scripts. Scripts can help you improve performance and simplify your application.

Geospatial support:

Redis has specialized commands for processing real-time geospatial data on a large scale. You can perform operations such as finding the distance between two elements (for example, people or places) and finding all elements within a given distance of a point.

9. Message queue

Message queues suit batch jobs and asynchronous messaging between decoupled applications.

Sometimes your program needs to perform tasks that are not directly related to responding to user requests.

For example, suppose a user uploads a video that needs to be encoded and watermarked. This is a long-running task, so it makes no sense to make the user wait until it’s finished. A better approach is to do it asynchronously. Your web application code creates a job message in the queue and tells the user that they will receive an email (message) when the watermarked video is ready.

You then have a worker task that performs the following steps:

  1. Read a message from the queue.
  2. Start working on the video.
  3. When it’s done, save the encoded copy of the video.
  4. Send a notification email (message) to the user.
  5. Delete the message from the queue.
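
The five steps above can be sketched with Python’s standard `queue` module standing in for a real message queue. The job format, the `encode_and_watermark` placeholder, and the `storage` and `outbox` containers are all hypothetical. One difference to keep in mind: `queue.Queue` removes a message as soon as it is read, whereas a real MQ typically deletes it only after the worker confirms success, so `task_done()` is the closest local equivalent of step 5.

```python
import queue


def encode_and_watermark(video_id):
    """Placeholder for the long-running work (hypothetical)."""
    return f"{video_id}.watermarked.mp4"


def run_worker(jobs, storage, outbox):
    """Process every job currently in the queue, following the five steps."""
    while True:
        try:
            job = jobs.get_nowait()                       # 1. read a message
        except queue.Empty:
            break
        encoded = encode_and_watermark(job["video_id"])   # 2. work on the video
        storage[job["video_id"]] = encoded                # 3. save the encoded copy
        outbox.append((job["email"], encoded))            # 4. notify the user
        jobs.task_done()                                  # 5. mark the message as handled


jobs = queue.Queue()
jobs.put({"video_id": "v1", "email": "user@example.com"})
storage, outbox = {}, []
run_worker(jobs, storage, outbox)
```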

There are two architectural components:

You can implement worker tasks in the following ways:

  • Schedule a cron job that triggers code installed on the application server to read from the queue on a specific schedule.
  • Use a FaaS platform to call the worker code whenever a message is added to the queue.

9.1 Introduction to Message Queue

Message queues are an asynchronous mode of inter-service communication suitable for serverless and microservice architectures. Messages are stored on the queue until they are processed and deleted. Each message is processed only once, by a single consumer. Message queues can be used to decouple heavyweight processing, to buffer or batch work, and to smooth out peak workloads.

Commonly used MQ components today include ActiveMQ, RabbitMQ, RocketMQ, ZeroMQ, and Kafka, which has become popular in recent years. In some scenarios Kafka also acts as an MQ, although it can do more than that. Each MQ has its own characteristics and advantages, but whichever one you choose, they all share some features inherent to message queues.

9.2 MQ Features

Feature: Description

Push or pull delivery: Pull means constantly querying the queue for new messages. Push means the system notifies the consumer when a message is available (also known as publish/subscribe messaging). You can also use long polling, which lets a pull wait a specified amount of time for new messages to arrive before it completes.

Scheduled or delayed delivery: Support for setting a specific delivery time for a message. If you need the same delay for all messages, you can set up a delay queue.

At-least-once delivery: Message queues can store multiple copies of messages for redundancy and high availability, and resend messages after a communication failure or error to ensure that they are delivered at least once.

Exactly-once delivery: Where duplicates are not allowed, FIFO (first in, first out) message queues ensure that each message is delivered exactly once (and only once) by automatically filtering out duplicates.

FIFO (first in, first out) queues: In these queues, the first entry to be processed is the earliest (or first) entry, sometimes called the “queue head”.

Message priority: Typically, you can assign a priority to a message to determine where it is added to the queue, ensuring that higher-priority messages are at the front of the queue and are processed first.

9.3 MQ Application Example

Source: MQ(Message queue) common application scenarios

Suppose our scenario is an e-commerce system based on a microservice architecture, divided into a user microservice, a product microservice, an order microservice, a promotion microservice, and so on.

Systems developed in the microservice style have many use cases for MQ. Here are some common examples.

1. Initialization after registration

After registration we may need to do a lot of initialization operations, such as:

  • Call mail server to send mail, call promotion service to give coupons, send user data to customer relationship system, etc.
  • In this case we have these operations listen on the MQ. When a user registers successfully, we notify the other services through the MQ, which keeps user registration itself fast.
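
As a rough sketch of this pattern, the registration code only publishes one event, and each follow-up operation subscribes to it. The `Broker` class is an in-process stand-in I made up for illustration; it calls the handlers synchronously, whereas a real MQ would deliver the message to the other services asynchronously. The topic name and the handlers are also hypothetical.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for an MQ topic (a real MQ delivers asynchronously)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._handlers[topic]:
            handler(message)


log = []
broker = Broker()
# Each downstream service registers its own listener.
broker.subscribe("user.registered", lambda user: log.append(f"mail:{user}"))
broker.subscribe("user.registered", lambda user: log.append(f"coupon:{user}"))
broker.subscribe("user.registered", lambda user: log.append(f"crm:{user}"))


def register(username):
    """Register a user, then announce it; registration knows nothing about listeners."""
    # ... save the user to the database here ...
    broker.publish("user.registered", username)
    return "ok"
```

The benefit of this shape is that adding a fourth follow-up operation means adding a subscriber, without touching the registration code.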

2. Publishing products in the admin back office

When publishing products in the admin back office:

  • Product data needs to be converted from database records into search engine data (based on Elasticsearch).
  • So we write the product to the database, then write a message to the MQ, and a listener on the MQ generates the corresponding Elasticsearch data.

3. Cancelling orders on payment timeout

If the customer does not pay within 24 hours after placing the order, the order needs to be cancelled.

  • In the past, we might have used a scheduled task to poll the database in a loop and then cancel the orders.
  • In fact, I would recommend a delayed MQ instead. It avoids a lot of useless database queries: you set the message so that consumers only receive it after 24 hours, which greatly reduces server load.
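
Here is a minimal sketch of the delayed-message idea using a heap ordered by due time. The `DelayQueue` class and the `cancel_if_unpaid` helper are hypothetical, time is passed in explicitly to keep the example deterministic, and a real MQ would handle the delay for you rather than requiring you to poll.

```python
import heapq
import itertools


class DelayQueue:
    """Messages become visible to consumers only after their delay has elapsed."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal due times

    def publish(self, message, now, delay_seconds):
        due_at = now + delay_seconds
        heapq.heappush(self._heap, (due_at, next(self._counter), message))

    def poll(self, now):
        """Return every message whose delay has elapsed at time `now`."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


def cancel_if_unpaid(orders, order_id):
    """Consumer logic: cancel the order only if it is still unpaid."""
    if orders[order_id] == "unpaid":
        orders[order_id] = "cancelled"


DAY = 24 * 3600
orders = {"o1": "unpaid", "o2": "paid"}
delayed = DelayQueue()
delayed.publish("o1", now=0, delay_seconds=DAY)
delayed.publish("o2", now=0, delay_seconds=DAY)
```

Each order is looked up once, when its message comes due, instead of being scanned repeatedly by a polling job.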

4. Notification on payment completion

  • After payment completes, the subsystems need to be notified promptly for the next steps (the purchase-sales-inventory system for delivery, the user service for points, SMS sending).
  • However, payment callbacks need to be fast, so we should update the database state directly and then send a message to the MQ, letting the MQ notify the subsystems to do the other non-real-time work. This keeps the core business efficient and timely.

Disclaimer

I came across the original article while browsing an overseas community and found it quite concise and clear.

Just for fun, I rewrote and summarized it following its outline. Please bear with any mistakes.

It’s a little rough, so please go easy on me…


You can also go to my GitHub blog and get the source files for all the posts:

Front end exit guide: github.com/roger-hiro/…