Whenever I talk about caching, my mind suddenly clears. Framed as key-value pairs, it always feels as if everything is under my control, the kind of beauty that fills the mind at first sight.
Do you know the design principles and effects of caching in terms of website architecture?
omg
In the business world, it is often said that “cash is king”. In the world of the Internet, the mobile Internet and software technology in general, the equivalent is “caching is king”.
Why do you say that?
Imagine if any network request (HTTP, SOAP, RPC, etc.) could be answered from a cache at some point along its execution path: wouldn't it respond to the client that much sooner?
That is also why, in interviews at many large and mid-sized companies today, a whole series of questions about cache usage, principles, and high availability gets thrown at you, and why they are so hard to handle.
What is caching?
Cache: A copy of raw data stored on a computer for easy access – Wikipedia
A cache is a key technique for fast system response: a set of things kept around for future use. It sits in the gap between application development and system development, which is exactly where product managers tend to miss the point, and it is a non-functional constraint in technical architecture design.
I see. So what's the deal with the system cache, Za-za-hui? Take your time, read on.
What is multi-level caching architecture?
As the name suggests, it is a caching scheme made up of multiple layers, because caching means different things in different scenarios and uses different technology at each layer.
Caches can be classified by where they exist:
- Hardware cache (CPU, hard disk, etc.)
- Operating system cache
- Software cache
What is the system cache?
The operating system is the program that manages a computer's hardware and software resources. The speed of many hardware components is largely determined by their cache: the larger the cache capacity, the faster the corresponding hardware runs. So the system cache is what the operating system uses when it mediates between applications and hardware resources (memory, files, etc.).
Summary: operating system memory call involves the part of the cache, can be counted as system cache
Software operation is built on the operating system, and programs need to be loaded into memory when running. However, software operation is based on virtual memory mapping mechanism, rather than directly operating physical memory. Virtual memory stores related resources in the form of block tables (tables of memory blocks).
Note: Physical memory is composed of many small cell elements, the smallest units of memory management. Each cell has 8 small capacitors storing 8 bits, i.e., 1 byte. Similar to a disk block, isn't it? ^_^ Za-za-hui knows you.
To improve access speed, a small-capacity associative register, the block table (what hardware calls a TLB), is added to the address-mapping mechanism.
It stores the block numbers of the few most recently active pages. When a user accesses data, the logical page number is looked up in the block table to find the corresponding memory block number, which is combined with the in-page offset to form the physical address.
Conclusion: read data -> logical page number -> block number from the block table -> block number + in-page offset -> physical address.
If the logical page number is not in the block table, address mapping still works through the page table in memory; the block number found there is then filled into a free row of the block table. If the block table has no free row, one row is evicted according to a replacement algorithm and the new page/block pair is filled in.
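As a toy model of this lookup-and-eviction flow (page and block numbers here are invented, the eviction is a simple FIFO, and real TLBs do all of this in hardware):

```python
from collections import OrderedDict

PAGE_TABLE = {0: 10, 1: 11, 2: 12, 3: 13}  # logical page -> memory block (assumed)
TLB_CAPACITY = 2                            # the block table is deliberately tiny

tlb = OrderedDict()                         # the small "block table" (TLB)

def translate(logical_page, offset, page_size=4096):
    """Translate a logical address to a physical address via the block table."""
    if logical_page in tlb:                 # fast path: hit in the block table
        block = tlb[logical_page]
    else:                                   # miss: fall back to the page table
        block = PAGE_TABLE[logical_page]
        if len(tlb) >= TLB_CAPACITY:        # no free row: evict the oldest entry
            tlb.popitem(last=False)
        tlb[logical_page] = block           # fill in the new page/block pair
    return block * page_size + offset       # block number + in-page offset

print(translate(0, 100))   # 41060 (miss: page table consulted, row filled)
print(translate(0, 100))   # 41060 (hit: answered from the block table)
```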
I remember computers fetch from caches on a nearest-first principle. What is the priority order?
A request is served from the fastest memory that holds the data. The closer a memory is to the CPU, the faster it is and the higher its cost per byte, and therefore the smaller its capacity.
The layers are: registers (closest to the CPU, and fastest), CPU caches (themselves layered, e.g., L1 and L2), main memory (ordinary RAM), and local disk.
In day-to-day development, software caches are usually divided by where they sit in the system:
- Client cache
- Web caching
- Server side cache
Multi-level caching looks like a pyramid, descending from top to bottom, and acts like a funnel filtering request traffic. If the vast majority of requests are absorbed at the client and network layers, the pressure on back-end services drops dramatically.
Why use multi-level caching architecture?
Fundamentally, to serve the website with high performance and give users a better experience: more performance headroom at less cost.
Talk about user experience
The term user experience was first widely recognized in the mid-1990s, proposed and popularized by user experience designer Donald Norman.
Due to the progress of information technology in mobile and image processing, human computer interaction (HCI) technology has penetrated into almost all fields of human activities. This led to the expansion of the system’s evaluation metrics from usability to user experience.
In the development of human-computer interaction technology, user experience has received considerable attention, which is equal to the three traditional usability indicators (efficiency, effectiveness and basic subjective satisfaction), and even more important in some respects.
What is user experience?
The ISO 9241-210 standard defines user experience as “people’s cognitive impressions and responses to products, systems, or services that are being used or expected to be used”. Therefore, user experience is subjective and practical.
User experience: refers to all the feelings of users before, during and after using a product or system, including emotions, beliefs, preferences, cognitive impressions, physiological reactions, psychological reactions, behaviors, achievements and other aspects.
ISO standards also imply that usability can be used as an aspect of user experience, “usability criteria can be used to evaluate some aspects of user experience”. However, the ISO standard does not elaborate further on the specific relationship between user experience and system usability. Obviously, the two concepts overlap.
Maybe that's why product managers keep messing with us engineers. How is your product manager? Ever feel the urge to hit them?
Factors that affect user experience
Three factors affect the user experience:
- User status
- The system performance
- The environment
System performance is the most critical factor of software product to user experience. As the subject of feeling software performance is human, different people may have different subjective feelings about the same software and have different perspectives on software performance.
System performance is a non-functional feature that focuses not on a particular function, but on the timeliness demonstrated when that function is completed.
About the performance of the system
System performance indicators generally include response time, delay time, throughput, number of concurrent users, and resource utilization.
The response time
Response time is how long the system takes to respond to a user's request. It matches the user's subjective feel of software performance, and it records the full time the system spends processing the request.
Generally, the response time varies according to the service scenarios in different projects. For example, a request must be within 100ms or 200ms.
How long does it take for your home page to respond?
Because a system usually provides many functions, and the processing logic of different functions is also different, the response time of different functions is also different, and even the response time of the same function is different in the case of different input data.
Therefore, the response time usually refers to the average response time of all functions of the software system or the maximum response time of all functions.
Sometimes it is also necessary to discuss the average and maximum response times for each or group of functions.
When discussing software performance, we are more concerned with the “response time” of the software being developed.
For example, PHP's response time is the time from receiving a request from Nginx, through completing the business logic, to responding back to Nginx; what the user perceives is the time from sending the request to seeing the page.
The former is the response of the software itself; the latter is the response time of the user's request. Two different vantage points.
In this way, we can divide user perceived response time into presentation time and system response time,
- Presentation time: the time the client needs to render the page after receiving the data, i.e., page rendering and loading time.
- System response time: the clock starts when the client sends the request and stops when the server's response reaches the client.
System response time can be further broken down into network transmission time and application latency time.
- Network Transmission time: the time when data is transferred between the client and server
- Application latency: specifies the time required by the system to process request services
In the future, when it comes to optimization, you should start from the entire request link, targeting presentation, network transmission, and application processing time
throughput
Throughput refers to the number of requests processed per unit of time by the system.
The unit of time depends on the project's own response-time targets, but 1 second is most often used, i.e., the number of successfully processed requests per second.
How do you calculate your website's throughput? As a beginner I had to take a remedial class on this. Calculating throughput comes down to your time window and your traffic.
- Dividing by the time window: suppose the ad pages of a publishing system must handle 5 million (500W) visits in 30 minutes. The average QPS is 5,000,000 / (30 × 60) ≈ 2,778, call it 3,000 QPS to leave headroom.
- Suppose the home page of a classified-information site averages 80 million (8000W) PV per day. Counting a day as 40,000 active seconds (the night hours hardly count), the average QPS is 80,000,000 / 40,000 = 2,000.
Note: users do not use software around the clock; at night there are few or none. It also varies by business: think of live streaming, food delivery, and so on. But 12 hours is roughly the maximum a user spends in an app per day.
How long specific users stay online is also uncertain. Compute the average QPS from usage time and total traffic, and use peak traffic to compute the maximum QPS.
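The two back-of-the-envelope calculations above can be checked in code ("W" in the original is the Chinese unit wan, i.e., 10,000):

```python
def average_qps(total_requests, seconds):
    """Average throughput: total requests spread over the active time window."""
    return total_requests / seconds

# Ad pages: 5,000,000 (500W) visits in 30 minutes
ad_qps = average_qps(5_000_000, 30 * 60)
print(round(ad_qps))    # 2778 -> provision ~3000 QPS for headroom

# Home page: 80,000,000 (8000W) PV/day over 40,000 active seconds
home_qps = average_qps(80_000_000, 40_000)
print(round(home_qps))  # 2000
```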
For non-concurrent applications, throughput and response time have a strict inverse relationship. In fact, throughput is the reciprocal of response time.
A non-concurrent application is a stand-alone one, which is rare among Internet or mobile Internet products.
Number of concurrent users
The number of concurrent users refers to the number of concurrent users supported by the system. The larger the number is, the stronger the processing capability is.
Resource utilization
Resource utilization reflects the average resource usage over a period of time.
From browser -> network -> application server -> database, through the application of caching technology at all levels, will greatly improve the performance of the entire system.
For example, the closer the cache is to the client, the less time it takes to request content from the cache than from the source server, and the faster the rendering, the more sensitive the system appears. The reuse of cached data, which greatly reduces bandwidth usage for users, is also a disguised money-saver (if traffic is paid at all), while keeping bandwidth requests at a low level and easier to maintain.
Therefore caching reduces system response time, cutting both network transmission time and application latency, which in turn raises throughput and the number of concurrent users the system supports. Caching also minimizes the system's workload: it avoids repeatedly fetching the same data from the source, so the same piece of data, once created or provided by the cache, makes better use of the system's resources.
Therefore, caching is a common and effective means of system tuning, whether operating system or application system, caching strategy is everywhere. “Caching is king” is essentially the king of system performance, for users is the king of user experience.
Web architecture cache evolution
Start
The original site was probably a physical host in an IDC, or a rented cloud server, running only the application server and the database, which is how the LAMP stack (Linux, Apache, MySQL, PHP) became popular.
Growth
Because the site has some appeal and attracts users, you gradually find the pressure on the system growing and responses slowing, and the contention between the database and the application becomes obvious. So the application server and database server are physically separated onto two machines, no longer interfering with each other, to support higher traffic.
Middle period
As more and more people visit the site, the response time starts to slow down again, probably because there are too many operations accessing the database, causing competition for data connections, so caching comes into play.
It is easy to see why the database is usually the first thing to optimize: without a cache, every request goes straight to the database, the database is where all data concentrates, and retrieval involves disk I/O. You can imagine the pressure.
If you want to use the caching mechanism to reduce database connection resource contention and database read pressure, you can choose the following:
- Static page caching: This is a great way to reduce stress on the Web server and competition for database connection resources without programmatic changes.
- Dynamic caching: Cache the relatively static parts of dynamic pages, so consider using a similar page fragment caching strategy (dynamic caching via Nginx, Apache configuration).
Static caching leans on static-resource caching and browser caching. A dynamic cache is a file generated after a page is first accessed and served to subsequent requests, somewhat like a template engine's compiled output.
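As one hedged sketch of such dynamic caching, an Nginx proxy-cache configuration might look like the following. The directives are standard Nginx proxy-cache directives, but the zone name, paths, and upstream address are made up for illustration:

```nginx
# Cache dynamic pages generated by the upstream application server
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=page_cache:10m
                 max_size=1g inactive=10m;

upstream backend_app {
    server 127.0.0.1:9000;                 # the application server (assumed address)
}

server {
    listen 80;
    location / {
        proxy_pass http://backend_app;
        proxy_cache page_cache;            # use the zone defined above
        proxy_cache_valid 200 302 5m;      # keep successful pages for 5 minutes
        proxy_cache_valid 404 1m;
        add_header X-Cache-Status $upstream_cache_status;  # HIT/MISS for debugging
    }
}
```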
What about my database?
At this point traffic is rising, mostly reads; write requests grow too, but the performance bottleneck is in database reads. Writes are not much of a threat yet; if write capacity falls short, the only option is to scale out with more instances.
Period of high growth
As traffic continues to increase, the system starts to slow down again, what to do?
A data cache loads repeatedly-read data from the database into the local system, which also reduces the database's load. As traffic rises yet again, the application server becomes overwhelmed and more Web servers are added.
So how do you keep the data cache information in sync with the application server?
For example, for previously cached user data, start using cache synchronization, shared file systems, or shared storage. After enjoying a period of high traffic growth, the system slows down again.
Database tuning began, optimizing the cache of the database itself, followed by the use of database cluster and separate database and separate table strategy.
The rules of database and table are somewhat complicated. Consider adding a general framework to implement Data Access of database and table, which is the Data Access Layer (DAL).
- Cache synchronization mechanism: Each Web server will save a cache file, so the synchronization mechanism of the cache is needed to complete the data.
- Shared storage: Shared storage is a parallel architecture in which two or more processors share a single main memory, such as Redis.
- Shared file system: The file systems between the two machines can be more closely integrated, so that users on one host can use the file system on the remote machine just as they use the local file system. Like Samba and NFS.
Late period
At this stage the earlier cache-synchronization scheme starts to show problems: the data volume is so large that caches can no longer be stored locally and then synchronized. That causes sync delays, longer response times, data inconsistency, and coupling between the cache and the database. So distributed caching finally arrives, and much of the cached data moves into the distributed cache.
What’s wrong with using a shared file system or shared storage?
- Shared storage: when multiple services access one storage, there are single-node performance problems, and concurrent reads and writes may cause cache and data inconsistencies.
- File sharing: multiple services put heavy pressure on file I/O, and performance degrades.
Final stage
At this point the system enters the large-website stage: when traffic increases, the solution is simply to keep adding Web servers, database servers, and cache servers. The system architecture of a large site evolves as shown in the diagram.
Throughout the history of web architecture, caching has often been a panacea, proving once again that caching is king.
Implementation of client caching
Client-side caches are simpler than other caches and are usually used in conjunction with server-side and network-side applications or caches.
There are two categories of Internet applications.
- B/S architecture: page caching and browser caching
- Mobile APP: the cache used by the APP itself
Page caching
What is page caching? Page caching has two meanings:
- The client caches some or all elements of the page itself: the offline application cache for short.
- The server caches elements of a static or dynamic page and serves them to the client: the page's own cache for short.
Page caching saves previously rendered pages as static files, avoiding network connections when the user accesses them again, thus reducing load and improving performance and user experience.
With the widespread use of Single Page Applications (SPA) and HTML5 support for offline caching and local storage, Page caching for most BS applications is now a breeze.
Example code for using local storage in HTML5 is as follows:
localStorage.setItem("mykey", "myvalue");  // store a value
localStorage.getItem("mykey");             // read it back
localStorage.removeItem("mykey");          // remove one key
localStorage.clear();                      // clear everything
What is a single page application?
SPA is a Web design using a single page, using JavaScript to manipulate Dom technology to achieve a variety of applications. In this mode, a system only loads resources once, and the subsequent operation interaction and data interaction are carried out through routing and Ajax, without refreshing the page.
A common route format is http://xxx/shell.html#page1. This pattern is obvious in Vue projects. Mall activity pages and login pages are good examples of SPA in practice.
HTML5 provides an offline application cache mechanism so that web applications can be used offline (accessible even without a network). The mechanism is widely supported in browsers, so you can use it to speed up page access. To enable offline caching, perform the following steps:
- Prepare a manifest file describing the list of resources the page needs to cache.
Note: the manifest file must be served with the correct MIME type, text/cache-manifest, configured on the Web server.
Example: in Nginx, edit the mime.types file in the configuration directory and add the manifest mapping:
text/cache-manifest manifest;
- In every page that needs to work offline, add the manifest attribute to the html tag to specify the path of the cache manifest file. The workflow for offline caching is shown in Figure 1.
The manifest attribute of the html tag is now deprecated; service workers (which build tools such as Webpack can generate) are the modern replacement.
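For reference, the (now-deprecated) mechanism looked roughly like this. File names and resource paths here are illustrative:

```html
<!-- page.html: opt this page into the offline application cache -->
<!DOCTYPE html>
<html manifest="cache.manifest">
  <head><title>Offline-capable page</title></head>
  <body>...</body>
</html>
```

```text
CACHE MANIFEST
# v1 - change this comment to force clients to refetch everything
CACHE:
/css/site.css
/js/app.js
/img/logo.png
NETWORK:
*
```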
As can be seen from the figure:
- When the browser visits a page containing the manifest attribute and no application cache exists yet, it loads the document, retrieves all files listed in the manifest file, and generates the initial cache.
- When a later request revisits the document, the browser loads the page and the resources listed in the manifest directly from the application cache. At the same time it fires a checking event on the window.applicationCache object and refetches the manifest file.
- If the currently cached copy of the manifest is up to date, the browser fires an event on window.applicationCache indicating no update is required and ends the update process. If you modify any cached resource on the server side, you must also modify the manifest file so the browser knows to refetch the resource.
- If the manifest file has changed, every file listed in it is refetched and placed into a temporary cache. For each file added to the temporary cache, the browser fires a progress event on window.applicationCache.
- Once all files have been fetched successfully, they are moved automatically into the real offline cache and an event is fired on window.applicationCache indicating the content has been cached. Since the document was already loaded into the browser from the old cache, the updated document is not rerendered until the page is reloaded.
Note: Resource urls listed in the MANIFEST file must use the same web protocol as the MANIFEST itself, as described in the W3C standards documentation.
Browser cache
Browser caching works by a simple set of rules agreed with the server: before reusing a copy, check that it is still up to date, usually with only a single conditional request per session.
Browsers create a dedicated space on the hard disk to store a copy of the resource as a cache.
The browser cache is useful when a user triggers a back action or clicks on a link they’ve seen before. If you access the same image on the system, it can be called up from the browser cache and appear almost immediately.
HTTP/1.0
For browsers, HTTP/1.0 provided some very basic caching features, such as:
- Setting Expires in the HTTP header on the server side tells the client how long it is valid to cache the file before rerequesting it.
- Conditional requests with If-Modified-Since to decide whether the cache may be used.
- Last-modified: Last-Modified is the response header, and the Web container tells the client when it was Last Modified
Each request to the Web container will first check whether the client cache resource is still valid. If not, the last Modified time of the last server response will be sent to the server for judgment as if-modified-since. If the file has not changed, The server uses **304-Not Modified for the response header and an empty response body. ** The client receives a 304 response and can use the cached version of the file.
Why does the client, once its copy has expired, send an HTTP request just to ask whether the file was modified?
If the cached resource is still valid, the client does read its cache directly, with no HTTP request at all. But a few situations need care: when a user presses F5 or clicks the Refresh button, even a URI with an unexpired Expires will trigger a request, so Last-Modified is needed.
Besides, client and server clocks may differ, so the copy may look expired on the client but not on the server; with a validation mechanism, the response can omit the body. Or the copy has simply expired while the resource has not changed, in which case you still want to ask whether the file changed rather than refetch it.
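The If-Modified-Since handshake can be sketched as a server-side decision function. This is simplified: real servers compare actual file mtimes and handle more header fields; the dates and body below are made up:

```python
from email.utils import parsedate_to_datetime

def respond(file_mtime, if_modified_since=None):
    """Decide between a full 200 response and an empty 304.

    Both arguments are RFC 1123 date strings,
    e.g. 'Wed, 21 Oct 2015 07:28:00 GMT'.
    """
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(file_mtime)
        if current <= cached:                    # file unchanged since the copy
            return 304, None                     # empty body: reuse the cache
    return 200, "<html>full body</html>"         # changed, or first visit

mtime = "Wed, 21 Oct 2015 07:28:00 GMT"
print(respond(mtime))           # first request: full 200 response
print(respond(mtime, mtime))    # revalidation: (304, None)
```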
HTTP/1.1
HTTP/1.1 brought major enhancements, and the caching system was formalized, introducing the entity tag (ETag) and Cache-Control.
- ETag: the unique identifier of a file or object. Requests carry the ETag so the server can check whether the file has been updated.
- Cache-Control: relative expiration, counted from the moment the client receives the response; after the given number of seconds the cached copy expires. The specific directives:
- max-age: how long the resource may be cached, in seconds.
- s-maxage: same as max-age, but applies only to proxy caches.
- public: the response may be stored by any cache.
- private: intended for a single user only; proxy servers must not cache it.
- no-cache: forces the client to revalidate with the server on every request. The server checks whether the resource changed, returning new content if it did and 304 if not.
- no-store: disables all caching.
- If-None-Match: after the server first sends an ETag, the client sends If-None-Match (set to that ETag value) so the server can judge whether the data has changed.
The following figure shows an example of ETag use in a web browser.
With Last-Modified/ETag configured, the next time the browser accesses a resource at the same URI, it sends a request asking the server whether the file has been modified. If not, the server generates a 304 Not Modified response with an empty body, and the browser reads the data straight from its local cache; if the data changed, the server sends the whole resource back to the browser.
summary
Last-Modified/ETag works differently from Cache-Control/Expires: the former asks the server on every use whether the entity has changed; the latter checks locally whether the copy is still inside its validity window and, if it is, sends no request at all.
- When used together, Cache-Control/Expires has higher priority than Last-Modified/ETag: only when the local copy is judged expired by Cache-Control/Expires does the browser send a request carrying Last-Modified or the ETag for validation.
Cache-Control and Expires both state when the current resource expires, controlling whether the browser may serve data straight from its cache or must re-request it from the server. Cache-Control simply has more options, is finer-grained, and takes precedence when both are set.
In general, Cache-Control/Expires is used together with Last-Modified/ETag, because even when the server has set a cache time, a user hitting the refresh button makes the browser ignore the cache and send a request anyway. Last-Modified/ETag then makes good use of the server's 304 return code to reduce response overhead.
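The combined policy, check freshness locally first and revalidate with the ETag only when stale, can be sketched as simplified client-side logic. The entry structure and field names are invented for illustration:

```python
def serve(entry, now, server_etag):
    """entry: cached copy with stored_at, max_age (seconds), etag, body."""
    if now - entry["stored_at"] < entry["max_age"]:
        return entry["body"], "fresh"            # no request sent at all
    # Stale: send a conditional request with If-None-Match: <etag>
    if entry["etag"] == server_etag:             # server would answer 304
        entry["stored_at"] = now                 # freshness window renewed
        return entry["body"], "revalidated-304"
    return "new body", "refetched-200"           # server sends a full response

entry = {"stored_at": 0, "max_age": 60, "etag": "abc", "body": "old body"}
print(serve(entry, 30, "abc"))    # ('old body', 'fresh')
print(serve(entry, 90, "abc"))    # ('old body', 'revalidated-304')
print(serve(entry, 200, "xyz"))   # ('new body', 'refetched-200')
```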
By adding meta tags to the nodes of an HTML page, it tells the browser that the page is not cached and needs to be pulled from the server every time it is accessed. The code is as follows:
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
However, only some browsers support this usage, and caching proxy servers generally do not, because the proxy itself does not parse the HTML content. The browser cache can greatly improve the experience of end users. When using the browser, users will perform various operations, such as entering an address and pressing Enter or F5 to refresh. The impacts of these actions on the cache are shown in the following figure.
APP side cache
While hybrid programming has become fashionable and controversial, the mobile Internet is still dominated by native apps. Whatever the size of the APP, flexible caching not only greatly reduces pressure on the server but also gives users a faster, more convenient experience. Making the APP cache transparent to business components, and keeping cached data up to date, are the keys to applying APP caching successfully.
What is component transparency?
That is, components have no impact on callers and require no maintenance. Out of the box.
Which caches are available for APP caches?
An APP can cache content in memory, a file, or a local database (such as SQLite), but memory-based caches should be used with caution.
Local library operation
The APP uses the database cache:
- After downloading the data file, save the related information of the file (such as URL, path, download time, expiration time, etc.) to the database.
- On the next download request, look the URL up in the database; if it is found and not yet expired, read the local file at the recorded path, achieving the cache effect.
Advantages: files are stored flexibly, so the scheme is very extensible and can support other features well. Disadvantages: storing too much metadata hurts capacity, so keep only the key information your business needs.
Also remember to design a cleanup mechanism for the database cache.
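The two-step flow above can be sketched with Python's built-in sqlite3 (an in-memory database stands in for the APP's local database; URLs, paths, and TTLs are made up):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")          # stand-in for the APP's local DB
conn.execute("""CREATE TABLE cache (
    url TEXT PRIMARY KEY, path TEXT, downloaded_at REAL, ttl REAL)""")

def record_download(url, path, ttl, now=None):
    """Step 1: after downloading, save the file's metadata."""
    now = time.time() if now is None else now
    conn.execute("REPLACE INTO cache VALUES (?, ?, ?, ?)", (url, path, now, ttl))

def lookup(url, now=None):
    """Step 2: before downloading again, check for a fresh local copy."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT path, downloaded_at, ttl FROM cache WHERE url = ?", (url,)
    ).fetchone()
    if row and now - row[1] < row[2]:
        return row[0]        # still valid: read the local file at this path
    return None              # missing or expired: download again

record_download("http://example.com/a.png", "/tmp/a.png", ttl=3600, now=0)
print(lookup("http://example.com/a.png", now=100))    # /tmp/a.png
print(lookup("http://example.com/a.png", now=7200))   # None (expired)
```

A real cleanup mechanism would periodically DELETE expired rows and remove the files they point to.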
File operations
For some interfaces in the APP, file caching can be used. This method uses the relevant API of file operation to get the last modification time of the file and judge whether the file is expired with the current time, so as to achieve the caching effect.
Note, however, that different file types need different cache lifetimes. By file type:
- The content of the image file is relatively constant until it is eventually cleaned up, and the APP can always read the contents of the image in the cache.
- The contents of the configuration file are subject to update and need to be set to an acceptable cache time. At the same time, the standard of cache time is different in different environments.
Network Environment:
- On a Wi-Fi network, the cache time can be set shorter: the network is fast and there are no data charges.
- On mobile data, the cache time can be set longer to save traffic and provide a better user experience.
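The type-by-network policy above can be sketched as a lookup table. Every number here is an invented example, not a recommendation:

```python
# Cache lifetime depends on file type and network environment.
TTL_TABLE = {
    # (file_type, network) -> seconds
    ("image",  "wifi"):     7 * 24 * 3600,    # images rarely change
    ("image",  "cellular"): 30 * 24 * 3600,
    ("config", "wifi"):     5 * 60,           # configs update often: refetch soon
    ("config", "cellular"): 60 * 60,          # save mobile data: keep longer
}

def cache_ttl(file_type, network):
    """Look up how long a file may be cached; unknown combos are not cached."""
    return TTL_TABLE.get((file_type, network), 0)

print(cache_ttl("config", "wifi"))      # 300
print(cache_ttl("config", "cellular"))  # 3600
```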
SDWebImage is a great image caching framework for iOS development. The structure of the main classes is shown below.
SDWebImage is a relatively large library. It provides a UIImageView category that supports loading images from the network, with cache management, asynchronous download, and deduplication of downloads for the same URL. #import "UIImageView+WebCache.h" in the header file, then call the asynchronous image-loading method:
- (void)setImageWithURL:(NSURL *)url placeholderImage:(UIImage *)placeholder options:(SDWebImageOptions)options;
- url: the address of the image.
- placeholderImage: the image shown until the network image loads successfully.
- options: the relevant SDWebImageOptions flags.
By default, SDWebImage ignores the cache settings in the HTTP headers and saves the image with its URL as the key, mapping URLs and images one to one. When the APP requests the same URL again, SDWebImage fetches the image from the cache. To force an update, pass SDWebImageRefreshCached as the third parameter, for example:
NSURL *url = [NSURL URLWithString:@"http://www.zhazhahui.com/image.png"];
UIImage *defaultImage = [UIImage imageNamed:@"zhazhahui.png"];
[self.imageView setImageWithURL:url placeholderImage:defaultImage options:SDWebImageRefreshCached];
There are two kinds of caches in SDWebImage
- Disk cache
- Memory cache
The framework provides corresponding cleanup methods:
[[[SDWebImageManager sharedManager] imageCache] clearDisk];
[[[SDWebImageManager sharedManager] imageCache] clearMemory];
Note that since iOS 7 the caching mechanism changed: the two methods above clear only SDWebImage's cache, not the system cache, so you can add the following line to your cache-clearing code:
[[NSURLCache sharedURLCache] removeAllCachedResponses];
Finally:
- There are three kinds of cache: system, hardware, and software.
- By location, software caches divide into client-side, network, and server-side (distributed) caches.
- User experience: the totality of a user's feelings before, during, and after using a product or system.
- System performance indicators include response time, latency, throughput, number of concurrent users, and resource utilization.
- In website architecture terms, we walked through the start, growth, middle, high-growth, late, and final stages.
- Client caches divide into page cache, browser cache, and APP cache.
That’s the end of this article, but it’s still an appetizer. If helpful, welcome to pay attention to, share.
To get the mind-map outline for this series, search "Lotus Child Nezha" on WeChat and reply "distributed cache" in the background: it's free!
A few words from the heart
Actually, the outline was written long ago, but I kept struggling with how to present the content and the underlying principles in a way that would really land.
This article took me a long time. Ah, I'm just not that good, and Za-za-hui won't make excuses for it…
I'm Za-za-hui, and I love digging into advanced topics. See you next time. If this article helped, please share and follow. I have also compiled e-books on improving back-end systems and knowledge cards on technical questions to share with you, and I will keep updating them. Your follow is my biggest motivation to keep writing. Search "Lotus Child Nezha" on WeChat.