High-performance Architecture (Performance) for the Site

Website performance looks different from different perspectives, and each perspective has its own standards and its own means of optimization.

1. Website performance optimization from the user's perspective: optimize HTML and page styles, exploit the browser's concurrency and asynchrony (so users do not have to wait for every result), tune the browser cache strategy, and use CDN services and reverse proxies. These application- and architecture-level optimizations let the browser display the content the user cares about as soon as possible.

2. Website performance optimization from the developer's perspective:

  • Use caching to speed up data reads
  • Use clustering to improve throughput
  • Use asynchronous messaging to speed up request response and implement peak shaving
  • Use code optimization to improve application performance.

3. Website performance optimization from the perspective of operation and maintenance personnel

  • Build and optimize backbone networks
  • Use cost-effective custom servers
  • Optimize resource utilization by leveraging virtualization technologies

Performance test metrics include:

  • Response time: the time it takes the application to perform an operation, measured from sending a request to receiving the final response data.
  • Concurrency: the number of requests the system can handle at the same time, reflecting the system's load capacity.
  • Throughput: the number of requests the system processes per unit of time, e.g. requests per second or total requests per day.
  • Performance counters: metrics exposed by the server or operating system, such as load average, memory, and CPU usage; on Linux you can view them with the top command.
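As a rough illustration of the first three metrics, here is a minimal benchmarking sketch; the `benchmark` helper and the simulated 10 ms request are invented for this example:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def benchmark(request_fn, total_requests=100, concurrency=10):
    """Run request_fn total_requests times at the given concurrency and
    report average response time and throughput."""
    latencies = []

    def timed_call(_):
        start = time.perf_counter()
        request_fn()
        latencies.append(time.perf_counter() - start)

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed_call, range(total_requests)))
    wall_elapsed = time.perf_counter() - wall_start

    return {
        "avg_response_time": sum(latencies) / len(latencies),  # seconds
        "throughput": total_requests / wall_elapsed,           # requests/second
    }

# Example: benchmark a fake request that sleeps 10 ms.
stats = benchmark(lambda: time.sleep(0.01), total_requests=50, concurrency=5)
print(f"avg response time: {stats['avg_response_time']*1000:.1f} ms, "
      f"throughput: {stats['throughput']:.0f} req/s")
```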

Methods of performance testing

Performance testing: Applying constant pressure to the system to verify that the system can meet performance expectations within the acceptable range of resources.

Load test: keep increasing the pressure on the system until some resource becomes saturated; at that point the system's processing capacity peaks, and further load causes it to decline.

Stress test: To obtain the maximum stress tolerance of the system by continuing to apply pressure beyond the safe load until the system crashes or can no longer handle any requests.

Stability test: Under the specific hardware, software and network environment, the tested system is loaded with certain service pressure to make the system run for a long time, so as to test whether the system is stable.
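The four test types above differ mainly in how far the load is pushed. A minimal sketch of the load-test idea, ramping concurrency upward and looking for the level with the best throughput (`find_max_load_point` and the concurrency levels are illustrative assumptions):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def measure_throughput(request_fn, concurrency, requests_per_worker=20):
    """Throughput (req/s) at a fixed concurrency level."""
    total = concurrency * requests_per_worker
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(lambda _: request_fn(), range(total)))
    return total / (time.perf_counter() - start)

def find_max_load_point(request_fn, levels=(1, 2, 4, 8, 16)):
    """Ramp concurrency stepwise and return the level with the best
    throughput -- a stand-in for locating the maximum load point."""
    results = {c: measure_throughput(request_fn, c) for c in levels}
    best = max(results, key=results.get)
    return best, results

# Simulate a 5 ms request.
best, results = find_max_load_point(lambda: time.sleep(0.005))
print("best concurrency:", best)
```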



The purpose of performance test is to find the optimal running point, the maximum load point and the breakdown point of the system, so as to reasonably choose the server deployment mode.

Web front-end performance optimization

A. Browser access optimization

1. Reduce HTTP requests: merge CSS, JavaScript, and image files to cut the number of requests.
2. Use the browser cache: control the cache lifetime of static resources by setting the Cache-Control and Expires attributes in the HTTP header. When a static resource needs to be updated, deploy it under a new (versioned) file name rather than overwriting the old one, so the change takes effect immediately.
3. Enable compression: compress files on the server and decompress them in the browser to reduce the amount of data transmitted.
4. Put CSS at the top of the page and JS at the bottom: the browser does not render the page until all the CSS has been downloaded, but it executes JavaScript immediately after loading it, which blocks rendering.
5. Reduce Cookie transmission: cookies are stored in the browser and included in every request and response, yet they are meaningless for static file access.
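For point 2, a sketch of building the two cache headers; `static_cache_headers` is a hypothetical helper, and the 30-day lifetime is an arbitrary choice:

```python
from datetime import datetime, timedelta, timezone
from wsgiref.handlers import format_date_time  # RFC 1123 date formatting

def static_cache_headers(max_age_days=30):
    """Build response headers that let browsers cache a static
    resource for max_age_days (hypothetical helper)."""
    expires = datetime.now(timezone.utc) + timedelta(days=max_age_days)
    return {
        "Cache-Control": f"public, max-age={max_age_days * 86400}",
        "Expires": format_date_time(expires.timestamp()),
    }

headers = static_cache_headers(30)
print(headers["Cache-Control"])  # public, max-age=2592000
```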

B. Use CDN acceleration: CDN nodes are deployed inside ISP machine rooms, close to users, so static resources can be served from the nearest node.

C. Use a reverse proxy server in front of the web server and use its caching module to cache requests

Application server performance optimization

A. Use caching (the first law of web performance optimization: use caching first). Caching means putting data in the location nearest to the computation. The essence of a cache is an in-memory hash table; in web applications, cached data is stored as key-value pairs in that hash table.
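A minimal sketch of such an in-memory hash-table cache with per-entry expiry (`TTLCache` is illustrative, not a real library):

```python
import time

class TTLCache:
    """Minimal in-memory key-value cache with per-entry expiry,
    sketching the hash-table cache described above."""

    def __init__(self):
        self._table = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._table[key] = (value, time.time() + ttl_seconds)

    def get(self, key):
        entry = self._table.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:   # expired: drop the entry, report a miss
            del self._table[key]
            return None
        return value

cache = TTLCache()
cache.set("user:42", {"name": "alice"}, ttl_seconds=60)
print(cache.get("user:42"))  # {'name': 'alice'}
```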

Cache avalanche A cache avalanche occurs when a large number of cached entries expire within the same short period, so their requests all fall through to the database at once. Solution: set different expiry times for different records based on service characteristics.
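The per-record expiry fix can be as simple as adding random jitter to a base TTL (the base and jitter values below are arbitrary):

```python
import random

def jittered_ttl(base_seconds=3600, jitter_seconds=300):
    """Spread expiry times out so entries written together do not
    all expire together (avalanche mitigation)."""
    return base_seconds + random.randint(0, jitter_seconds)

# Five entries cached at the same moment now expire at different times.
ttls = [jittered_ttl() for _ in range(5)]
print(ttls)
```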

A freshly started cache system contains no data, and while the cache is being rebuilt on demand both system performance and database load suffer. It is therefore best to load hot data when the cache system starts, such as metadata like city name lists and category information.
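Such a warm-up can be a startup hook that preloads this metadata into the cache; the loader functions below are hypothetical stand-ins for real database queries:

```python
# Hypothetical loaders; in a real system these would query the database.
def load_city_names():
    return ["Beijing", "Shanghai", "Guangzhou"]

def load_categories():
    return ["books", "electronics"]

def warm_up(cache):
    """Preload hot metadata into the cache at startup."""
    cache["cities"] = load_city_names()
    cache["categories"] = load_categories()

cache = {}
warm_up(cache)  # run once before the system starts serving traffic
print(sorted(cache))  # ['categories', 'cities']
```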

Cache penetration problem Cache penetration refers to querying data that exists neither in the cache nor in the database, so every such request hits the database. A malicious attacker can exploit this to put pressure on the database and even crush it; even with UUID keys it is easy to construct a key that does not exist and attack it. Workaround: if the object queried from the database is empty, still put it in the cache, but with a short expiry time, such as 60 seconds.
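A sketch of that negative-caching fix: a sentinel value marks "confirmed absent" and is cached with a short TTL, so repeated lookups of nonexistent keys stop reaching the database. The helper names and TTL values are illustrative:

```python
import time

MISS = object()  # sentinel: "confirmed absent from the database"

def cache_set(cache, key, value, ttl):
    cache[key] = (value, time.time() + ttl)

def cache_get(cache, key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]
    return None

def get_user(cache, db, user_id):
    """Read-through lookup that also caches misses briefly."""
    hit = cache_get(cache, user_id)
    if hit is not None:
        return None if hit is MISS else hit
    row = db.get(user_id)                        # the database access
    if row is None:
        cache_set(cache, user_id, MISS, ttl=60)  # short-lived negative entry
    else:
        cache_set(cache, user_id, row, ttl=3600)
    return row

db = {"u1": {"name": "alice"}}
cache = {}
print(get_user(cache, db, "ghost"))  # None, and the miss is now cached
```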

Cache breakdown problem Cache breakdown refers to a single very hot key on which a large number of concurrent requests are concentrated. The moment that key expires, the sustained flood of concurrent requests pierces the cache and goes straight to the database. Solution: set a long life cycle for hot data, or never let it expire.
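Besides the long-TTL solution given above, a common complementary mitigation (not from the original text) is to let only one request rebuild an expired hot key while the others wait; a sketch with a mutex, names illustrative:

```python
import threading

_rebuild_lock = threading.Lock()

def get_hot(cache, key, rebuild_fn):
    """On a miss, only one thread recomputes the value; the rest
    wait and then reuse it instead of all hitting the database."""
    value = cache.get(key)
    if value is not None:
        return value
    with _rebuild_lock:
        value = cache.get(key)   # re-check: another thread may have rebuilt it
        if value is None:
            value = rebuild_fn() # the single expensive database read
            cache[key] = value
    return value

calls = []
def rebuild():
    calls.append(1)
    return "fresh"

cache = {}
threads = [threading.Thread(target=get_hot, args=(cache, "hot", rebuild))
           for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(len(calls))  # 1: the database was queried only once
```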

Concurrency contention in the cache When multiple clients write the same key concurrently, data that was sent first may arrive later, leaving an incorrect version of the data in the cache. Solution: CAS-style optimistic locking (e.g. the Redis transaction mechanism).
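A sketch of CAS-style optimistic locking using an explicit version number, as an in-memory stand-in for what Redis transactions (WATCH/MULTI/EXEC) provide; `VersionedCache` is invented for this example:

```python
import threading

class VersionedCache:
    """Optimistic, CAS-style writes: an update succeeds only if the
    caller saw the latest version of the key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, version)

    def get(self, key):
        return self._data.get(key, (None, 0))

    def compare_and_set(self, key, value, expected_version):
        with self._lock:
            _, current = self._data.get(key, (None, 0))
            if current != expected_version:
                return False               # someone wrote first: caller retries
            self._data[key] = (value, current + 1)
            return True

cache = VersionedCache()
_, v = cache.get("stock")
assert cache.compare_and_set("stock", 100, v)     # first write wins
assert not cache.compare_and_set("stock", 99, v)  # stale version is rejected
print(cache.get("stock"))  # (100, 1)
```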

Distributed cache frameworks There are generally two types of distributed cache architecture:

  • Distributed caches such as JBoss Cache, in which every update must be synchronized to all nodes
  • Distributed caches represented by Memcached, whose nodes do not communicate with each other (the mainstream approach)

B. Asynchronous operation (anything that can be done later should be done later) Through asynchronous processing, the transaction messages generated by a short burst of high concurrency are stored in a message queue, which shaves the peak off the concurrent load. The consumer of the message queue can be an application or the database itself.
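A minimal in-process sketch of the pattern, with `queue.Queue` standing in for a real message broker (the order messages are invented): the producer returns immediately after enqueueing, while the consumer drains at its own steady pace.

```python
import queue
import threading

# A bounded in-process queue standing in for a message broker.
orders = queue.Queue(maxsize=1000)
processed = []

def handle_order(msg):
    """Consumer-side business logic."""
    processed.append(msg)

def consumer():
    while True:
        msg = orders.get()
        if msg is None:       # sentinel: shut down
            break
        handle_order(msg)
        orders.task_done()

worker = threading.Thread(target=consumer)
worker.start()

# Producer: a burst of 100 requests returns as fast as it can enqueue.
for i in range(100):
    orders.put({"order_id": i})
orders.put(None)
worker.join()
print(len(processed))  # 100
```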

C. Use a load balancing server to distribute concurrent requests across multiple servers for processing, avoiding the slow responses of a single server under excessive load.
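The simplest distribution policy is round-robin; a sketch (the backend addresses are made up, and real load balancers add health checks and weighting):

```python
import itertools

class RoundRobinBalancer:
    """Hands requests to backends in turn -- the simplest
    load-balancing policy."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [lb.pick() for _ in range(6)]
print(picks)  # each backend chosen twice, in rotation
```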

D. Code optimization Code optimization usually involves the following aspects:

  • Multithreading: because of IO waits and multiple CPUs, multithreading lets the program make the most of CPU resources.
  • Reuse: Use singletons or object pools to reuse resources such as database connections, network communication connections, threads, and complex objects.
  • Data structures: A flexible combination of data structures to improve data read, write, and compute characteristics can greatly optimize program performance.
  • Garbage collection: Helps with program optimization and parameter tuning, as well as writing memory-safe code.
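The "reuse" point can be sketched as a small object pool (`ObjectPool` and the fake connection factory are illustrative):

```python
import queue

class ObjectPool:
    """Reuse expensive objects (e.g. database connections)
    instead of creating one per request."""

    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self):
        return self._pool.get()   # blocks when the pool is exhausted

    def release(self, obj):
        self._pool.put(obj)

created = []
def make_conn():
    conn = f"conn-{len(created)}"  # stand-in for an expensive connection
    created.append(conn)
    return conn

pool = ObjectPool(make_conn, size=2)
c = pool.acquire()
pool.release(c)
c2 = pool.acquire()
print(len(created))  # 2: no new connection was created for the second acquire
```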

Storage Performance Optimization

A. Storage media: mechanical disks and solid-state disks

B. Read/write algorithms: B+ trees and LSM trees

C. Access technologies: RAID and HDFS