Brief Introduction to Caching

Caching is a classic technique that trades data timeliness for access performance. It is one of the most widely used techniques in medium and large architectures. There are two main types:

  • Static cache: mainly used in Web applications to improve website access performance.
  • Dynamic cache: database hotspot data, session caches, and dynamic page caches.

Cache Usage Scenarios

Static cache

Static caching generally applies to Web applications: static resources such as HTML, JS, and CSS are cached on disk or in memory to speed up resource responses and reduce server load. (The classic representative product is the CDN.)

Method 1: Browser cache

Browser caching, also known as client caching, is the most common and straightforward manifestation of static caching.

We often see the following cache configuration in Nginx configuration files:

    location ~ .*\.(gif|jpg|jpeg|png|bmp|swf)$ {
        expires 1d;
    }
    location ~ .*\.(js|css)?$ {
        expires 15d;
    }

You may also have seen the following configuration in older JSP files:

<meta http-equiv="expires" content="60">
<meta http-equiv="keywords" content="keyword1,keyword2,keyword3">
<meta http-equiv="description" content="This is my page">

Neither of the two client-cache configurations above requires validation by the server: the browser simply relies on its own expiration time, so no extra traffic is generated. (When both are present, the Nginx setting takes priority over the one set in the page code.) This approach is well suited to resources that rarely change.
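To make the expiration idea concrete, here is a minimal Python sketch of what the two `expires` rules above amount to on the wire: a `Cache-Control: max-age` value plus an `Expires` date derived from the current time. The `RULES` table and the `expiry_headers` function are hypothetical names used only for illustration, and the extension patterns are simplified copies of the Nginx ones.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Lifetimes mirroring the two Nginx `expires` rules shown above (illustrative copy).
RULES = [
    (re.compile(r"\.(gif|jpg|jpeg|png|bmp|swf)$"), timedelta(days=1)),
    (re.compile(r"\.(js|css)$"), timedelta(days=15)),
]


def expiry_headers(path, now):
    """Return the caching headers for `path`, or {} if no rule matches.

    `now` must be a timezone-aware UTC datetime.
    """
    for pattern, lifetime in RULES:
        if pattern.search(path):
            return {
                # Relative lifetime in seconds.
                "Cache-Control": f"max-age={int(lifetime.total_seconds())}",
                # Absolute expiry date in HTTP date format.
                "Expires": format_datetime(now + lifetime, usegmt=True),
            }
    return {}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(expiry_headers("/static/app.js", now))
    print(expiry_headers("/index.jsp", now))  # no rule matches, so no headers
```

Until the lifetime runs out, the browser can serve the resource from its local cache without contacting the server, which is why this method adds no traffic.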

Method 2: Disk cache

Disk caching is the technique of caching static resource files on disk.

    # levels: directory hierarchy of the cache
    # keys_zone: cache name and shared memory size
    # inactive: entries not accessed within this period are deleted (here, 1 day)
    # max_size: maximum cache size
    proxy_cache_path /montos/www/default/cache_dir/ levels=1:2 keys_zone=cache_one:200m inactive=1d max_size=30g;

    server {
        listen 80;
        server_name _;

        location / {
            proxy_pass http://***.***.***.***:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }

        location ~ .*\.jsp$ {
            proxy_cache cache_one;
            proxy_cache_valid 200 304 301 302 10d;
            proxy_cache_valid any 1d;
            # The value that is hashed to form the cache key
            proxy_cache_key $host$uri$is_args$args;
            proxy_pass http://***.***.***.***:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }

        access_log /montos/nginx/logs/default-cache.log;
    }

As the configuration above shows, Nginx implements disk caching mainly through proxy_cache, which can cache not only static files but also dynamically generated content.
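The relationship between the cache key and the directory levels can be sketched in Python. Nginx names each cache file after the MD5 hash of the key, and with levels=1:2 the subdirectories are taken from the end of that hash: the last character first, then the two characters before it. The function name `cache_file_path` and the sample key are assumptions made for illustration only.

```python
import hashlib


def cache_file_path(cache_dir, key, levels=(1, 2)):
    """Map a cache key to an on-disk path in the style of `levels=1:2`."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    parts = []
    end = len(digest)
    # Each level consumes `width` hex characters, working backwards from the end.
    for width in levels:
        parts.append(digest[end - width:end])
        end -= width
    return "/".join([cache_dir.rstrip("/"), *parts, digest])


if __name__ == "__main__":
    # Hypothetical key built like $host$uri$is_args$args.
    print(cache_file_path("/montos/www/default/cache_dir", "example.com/index.jsp?id=1"))
```

Spreading files over nested directories this way keeps any single directory from holding too many cache entries.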

Method 3: Memory cache

Memory caching is the caching of static files in memory on the server.

    varnishd -f default.vcl -s malloc,2G -a 0.0.0.0:80 -w 1024,51200,10 -t 3600 -T ***.***.***.***:3500

The above is a Varnish launch command. The configuration information is as follows:

  • -a address:port: the listening address and port
  • -f: the configuration file
  • -s: the cache storage type (malloc for memory, file for file)
  • -t: the default TTL
  • -T address:port: the management address and port
  • -w: the minimum number of threads, the maximum number of threads, and the thread timeout

The core configuration of default.vcl is as follows:

    sub vcl_fetch {
        if (req.request == "GET" && req.url ~ "\.(gif|jpg|jpeg|png|bmp|swf|html)$") {
            set obj.ttl = 3600s;
        }
    }

With this configuration in place, you can check whether a request hit the cache by looking at the X-Cache header of the response.
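The behavior described above (keep an object in memory for a fixed TTL and report whether a request was a hit or a miss) can be sketched in a few lines of Python. This is not how Varnish is implemented; `TTLCache`, `fetch`, and the injected `clock` are hypothetical names chosen for the sketch, and the "HIT"/"MISS" strings only imitate the kind of value an X-Cache header usually carries.

```python
import time


class TTLCache:
    """A tiny in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be simulated
        self._items = {}        # key -> (expires_at, value)

    def fetch(self, key, load):
        """Return (value, "HIT" or "MISS"); `load(key)` runs only on a miss."""
        now = self.clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1], "HIT"
        value = load(key)       # go to the backend
        self._items[key] = (now + self.ttl, value)
        return value, "MISS"


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=3600)
    backend = lambda url: f"<html>content of {url}</html>"
    print(cache.fetch("/index.html", backend)[1])
    print(cache.fetch("/index.html", backend)[1])
```

The first call has to load the page from the backend; a second call within the TTL is answered from memory.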

Method 4: CDN

A CDN already incorporates the three kinds of static cache described above.

Most readers will be more or less familiar with CDNs already. At present, most companies use OSS to store and cache static resources. With resources stored in OSS and the CDN fetching from it as the origin, static resources released by the front end can be iterated quickly, and users obtain the latest resources from the nearest CDN node.

Dynamic cache

Dynamic caching is used to cache the dynamic data of a service.

Scenario 1: Database caching

The database cache puts some of the data on disk into the cache so that it can be retrieved directly from the cache for the next access.

The database cache has the following technical characteristics:

  • Superior performance: the main purpose of caching is to improve performance. Reading data from the cache is faster than reading it from disk.
  • Application scenario: in day-to-day services, most of the load on the database comes from queries, while inserts, deletes, and updates account for only a small share. A cache therefore relieves database pressure very effectively.
  • Data consistency: caching trades space for faster access, so the cache and the database can become inconsistent. When that happens, a sound strategy is needed to keep the two consistent. Data that must be real-time should not be read from the cache.
  • High availability: the cache itself must be highly available. If the cache collapses, the database comes under enormous pressure for a period of time.
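One common way to handle the read-heavy workload and the consistency concern listed above is the cache-aside pattern: reads go to the cache first and fall back to the database, while writes update the database and then invalidate the cached entry. The sketch below uses plain dictionaries as stand-ins for both the database and the cache; `CachedRepository` and its methods are hypothetical names, and this is only one of several possible consistency strategies.

```python
class CachedRepository:
    """Cache-aside access to a key-value 'database' (both are plain dicts here)."""

    def __init__(self, database):
        self.database = database  # stand-in for the real database
        self.cache = {}           # stand-in for a cache such as Redis
        self.db_reads = 0         # counts how often the database is queried

    def get(self, key):
        # 1. Try the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, read from the database and populate the cache.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def update(self, key, value):
        # Write to the database first, then drop the stale cache entry
        # so the next read reloads the fresh value.
        self.database[key] = value
        self.cache.pop(key, None)


if __name__ == "__main__":
    repo = CachedRepository({"user:1": "Alice"})
    repo.get("user:1")
    repo.get("user:1")
    print(repo.db_reads)          # the second read was served from the cache
    repo.update("user:1", "Bob")
    print(repo.get("user:1"))
```

Invalidating rather than updating the cached value keeps the write path simple, at the cost of one extra database read after each write.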

Scenario 2: Centralized session management

In a distributed architecture, file storage can be handled by OSS or FastDFS, whereas session handling is more closely tied to load balancing.

Session management is mainly accomplished through the following schemes:

  1. Session persistence based on source IP

    The load balancer identifies the source IP address of each client request and forwards requests from the same address to the same back-end server.
  2. Session persistence based on browser cookies

    The session identifier is written into the client's cookie, and the load balancer uses it to select the corresponding back-end server.
  3. Session storage in a database

    Saving user login sessions centrally in the database solves the problem of inconsistent sessions in a distributed architecture, but puts pressure on the database.
  4. Session storage in a dynamic cache

    Storing sessions centrally in a cache solves the session inconsistency problem and also relieves the database pressure mentioned above.
  5. Session sharing based on a Tomcat cluster

    Configuring a Tomcat cluster achieves the same effect, but only within that cluster.
  6. File sharing based on NAS

    A shared file system can also unify sessions, but it adds disk I/O overhead.
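As an illustration of the first scheme in the list above, the following Python sketch maps a client's source IP to a back-end server by hashing the address, so that the same IP always lands on the same server. The function name `pick_backend` and the server names are assumptions; real load balancers offer this as a built-in option rather than as user code.

```python
import hashlib


def pick_backend(client_ip, backends):
    """Choose a back-end server deterministically from the client's source IP."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    # Use the first 8 bytes of the hash as an integer and map it onto the list.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    servers = ["app-1:8080", "app-2:8080", "app-3:8080"]  # hypothetical back ends
    print(pick_backend("203.0.113.7", servers))
    print(pick_backend("203.0.113.7", servers))  # same IP, same server
```

The weakness of this simple modulo mapping is that adding or removing a server changes the result for many clients, whose sessions are then lost. That is one reason the cache-based scheme (item 4) is often preferred.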

Scenario 3: Dynamic page caching

Dynamic pages generally involve dynamic computation, database caching, database operations, and so on, so page caching is not suitable for pages with high requirements on data timeliness.

  • Through Nginx configuration

    Through proxy_cache, provided by the built-in proxy module.

    Through the built-in memcached module.

    Through the third-party modules memc-nginx and srcache-nginx, which build a highly transparent caching mechanism.

All three methods above are implemented as Nginx modules. Dynamic page caching is rarely used in current projects, most of which rely on dynamic data caching instead.

Scenario 4: Distributed lock

In large business scenarios, a distributed lock is introduced to control concurrent manipulation of the same piece of data. (Only Redis-based implementations are covered here.)

Redis-based distributed locks mainly fall into the following categories:

  • A self-built implementation on top of Redis
  • RedLock
  • Redisson
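The first category, a self-built lock, usually rests on two operations: set a key only if it does not exist, with an expiry (in Redis, `SET key token NX PX ttl`), and delete the key only if it still holds the caller's token. The Python sketch below shows that logic against a small in-memory stand-in so it can run without a Redis server. `InMemoryStore` and its methods are hypothetical and are not part of any Redis client; against real Redis the compare-and-delete step would need to run atomically, typically as a Lua script.

```python
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for Redis, supporting only what the lock needs."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self._data = {}  # key -> (expires_at, value)

    def _live(self, key):
        # Drop the entry if its TTL has passed, then return what is left.
        entry = self._data.get(key)
        if entry is not None and entry[0] <= self.clock():
            del self._data[key]
            entry = None
        return entry

    def set_if_absent(self, key, value, ttl_seconds):
        # Mimics SET key value NX with an expiry.
        if self._live(key) is not None:
            return False
        self._data[key] = (self.clock() + ttl_seconds, value)
        return True

    def delete_if_equal(self, key, expected):
        # Mimics an atomic compare-and-delete.
        entry = self._live(key)
        if entry is not None and entry[1] == expected:
            del self._data[key]
            return True
        return False


def acquire(store, name, ttl_seconds):
    """Try to take the lock; return a unique token on success, else None."""
    token = uuid.uuid4().hex
    return token if store.set_if_absent(name, token, ttl_seconds) else None


def release(store, name, token):
    """Release the lock only if it is still held with this token."""
    return store.delete_if_equal(name, token)


if __name__ == "__main__":
    store = InMemoryStore()
    token = acquire(store, "lock:order:1", ttl_seconds=10)
    print(acquire(store, "lock:order:1", ttl_seconds=10))  # lock is already held
    print(release(store, "lock:order:1", token))
```

The expiry prevents a crashed holder from blocking everyone forever, and the token check prevents a client whose lock has already expired from deleting a lock that now belongs to someone else.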

Cache summary

Caching is a technique that trades data timeliness for access performance, but it also increases storage cost, data maintenance cost, and so on. How best to use a cache should be weighed against the characteristics of your own business.
