1. Cache reclamation policy

1.1 Based on space

Space-based eviction sets a limit on the cache's storage space, e.g. 10 MB. When usage reaches that limit, entries are removed according to a certain policy.

1.2 Capacity Based

Capacity-based eviction sets a maximum number of entries. When the number of cached entries exceeds that maximum, old entries are removed according to a certain policy.

1.3 Based on Time

TTL (Time To Live): the duration from when the data is created in the cache until it expires (it expires after this period whether or not it is accessed in the meantime).

TTI (Time To Idle): the idle period, i.e. how long cached data may go without being accessed before it is removed from the cache.
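To make the TTL semantics concrete, here is a minimal, library-independent sketch (the class and method names are invented for illustration) that stamps each entry with its creation time; the clock value is passed in explicitly so that expiry is deterministic:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal TTL cache sketch: entries expire a fixed time after creation,
// regardless of how often they are read (TTL, not TTI).
public class TtlCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long createdAtMillis;
        Entry(V value, long createdAtMillis) {
            this.value = value;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public void put(K key, V value, long nowMillis) {
        map.put(key, new Entry<>(value, nowMillis)); // creation time is stamped on write
    }

    public V get(K key, long nowMillis) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (nowMillis - e.createdAtMillis >= ttlMillis) { // expired: removed lazily on read
            map.remove(key);
            return null;
        }
        return e.value; // note: reading does NOT extend the entry's lifetime
    }
}
```

For TTI you would instead re-stamp the timestamp inside get, so that every access extends the entry's life.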

1.4 Based on Java object references

Soft references: if an object is only softly reachable, the garbage collector can reclaim it when the JVM heap is running out of memory. Soft references suit caching well: when heap memory runs low, cached objects can be reclaimed to make room for strongly referenced objects and avoid an OOM error.

Weak references: when the garbage collector runs, any weakly reachable objects it finds are reclaimed immediately. Weak references therefore have a shorter life cycle than soft references.

Note: a softly/weakly referenced object is reclaimed during garbage collection only if no strong reference to it remains. That is, if some other object still holds a strong reference to it, it will not be reclaimed.
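A small JDK-only illustration of this note: while a strong reference to the object exists, neither a weak nor a soft reference to it is cleared.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceDemo {
    // While a strong reference to the object exists, a weak reference to it is not cleared.
    public static boolean weakRefAliveWhileStronglyReachable() {
        byte[] data = new byte[1024]; // strong reference
        WeakReference<byte[]> weak = new WeakReference<>(data);
        return weak.get() == data;    // true: referent is still strongly reachable
    }

    // Soft references behave the same way while the referent is strongly reachable;
    // they only become candidates for clearing under memory pressure.
    public static boolean softRefAliveWhileStronglyReachable() {
        byte[] data = new byte[1024];
        SoftReference<byte[]> soft = new SoftReference<>(data);
        return soft.get() == data;
    }

    public static void main(String[] args) {
        System.out.println(weakRefAliveWhileStronglyReachable());
        System.out.println(softRefAliveWhileStronglyReachable());

        // Once the last strong reference is dropped, the weak referent MAY be
        // reclaimed at the next GC; System.gc() is only a hint, so the outcome
        // below is deliberately not asserted.
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        System.gc();
        System.out.println("weak possibly cleared: " + (weak.get() == null));
    }
}
```

Once the last strong reference is dropped, a weak referent becomes eligible at the next collection, while a soft referent is kept until the heap comes under pressure; neither outcome is deterministic enough to rely on for correctness.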

1.5 Recycling Algorithm

Space-based and capacity-based caches use eviction policies to remove old data; common ones are:

  • FIFO (First In First Out): the entry that entered the cache first is evicted first.

  • LRU (Least Recently Used): the entry that was used least recently is evicted first.

  • LFU (Least Frequently Used): the entry used least frequently within a certain period is evicted first.

In practice, LRU-based caches are the most common; for example, both Guava Cache and Ehcache support LRU.
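The LRU policy itself is easy to sketch with the JDK alone: LinkedHashMap in access-order mode moves each touched entry to the tail, and overriding removeEldestEntry evicts the head once a capacity (chosen here purely for illustration) is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch on top of the JDK's LinkedHashMap:
// accessOrder = true means every get/put moves the entry to the tail,
// so the head is always the least recently used entry.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true enables LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```

For example, with capacity 2: after put("a"), put("b"), get("a"), a further put("c") evicts "b", because "a" was touched more recently.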

2. Java cache type

2.1 Heap cache

Heap caches store objects in Java heap memory. Guava Cache, Ehcache 3.x, and MapDB all provide implementations.

  • Advantages: no serialization/deserialization is needed, making it the fastest type of cache.

  • Disadvantages: GC pause times grow as the amount of cached data grows, and capacity is limited by the heap size. Cached objects are therefore often held via soft/weak references so that they can be forcibly reclaimed to free heap space when memory runs low. The heap cache is typically used for the hottest data.

2.2 Off-heap cache

Off-heap caches store data in memory outside the JVM heap. Ehcache 3.x and MapDB provide implementations.

  • Advantages: shorter GC pauses (moving objects off the heap means the GC has fewer objects to scan and move), and a larger cache space (limited only by machine memory, not by heap size).

  • Disadvantages: data must be serialized/deserialized on every read and write, which is much slower than a heap cache.

2.3 Disk Caching

Disk caches store data on disk, so the data is still there after a JVM restart, whereas heap/off-heap caches lose their data on restart and must reload it. Ehcache 3.x and MapDB provide implementations.

2.4 Distributed Cache

With multiple JVM instances, in-process and disk caches have several problems: 1. single-machine capacity limits; 2. data consistency (since caching inherently allows data to be inconsistent for some period, you can set an expiration time so that cached data is refreshed periodically); 3. on a cache miss, each instance goes back to the DB/service to load data, so the total load on the DB grows with the number of instances (techniques such as consistent-hash sharding can mitigate this). A distributed cache addresses these situations. For Java, cross-process distributed caching can be implemented with Ehcache-clustered (together with Terracotta Server); Redis is of course another option.
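The consistent-hash sharding mentioned above can be sketched as a minimal hash ring with virtual nodes (all class and node names below are illustrative): a key maps to the first node clockwise from its hash position, so adding or removing a node only remaps the keys adjacent to that node.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring sketch: each cache node occupies several
// virtual-node positions on the ring; a key is owned by the first node
// clockwise from the key's hash position.
public class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    public void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // First ring position >= hash(key), wrapping around to the start if needed.
    public String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    // FNV-1a 64-bit hash: simple, deterministic, and well spread for a sketch.
    private static long hash(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }
}
```

The property that matters for caching: when one node leaves, keys owned by the remaining nodes keep their owners, so only a fraction of the cache is invalidated.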

The two modes are as follows:

  • In a single machine: the hottest data is kept in the heap cache, relatively hot data in the off-heap cache, and cold data in the disk cache.

  • In a cluster: the hottest data is kept in the heap cache, relatively hot data in the off-heap cache, and the full data set in the distributed cache.

3. Java cache implementation

3.1 Heap cache

3.1.1 Implementation of Guava Cache

Guava Cache provides only a heap cache; it is small, flexible, and performs very well. If a heap cache is all you need, it is sufficient.

Cache<String, String> myCache = CacheBuilder.newBuilder()
        .concurrencyLevel(4)
        .expireAfterWrite(10, TimeUnit.SECONDS)
        .maximumSize(10000)
        .build();

You can then read and write the cache with put and getIfPresent. CacheBuilder takes several categories of parameters: eviction policy, concurrency settings, and so on.

3.1.1.1 Cache Reclamation Policy/Based on Capacity

maximumSize: sets the maximum number of entries; when exceeded, entries are evicted according to LRU.

3.1.1.2 Cache Reclamation Policy/Time-based

  • expireAfterWrite: sets the TTL; cached data that has not been written (created or overwritten) within the given period is reclaimed, so cached data is refreshed periodically.

  • expireAfterAccess: sets the TTI; cached data that has not been read or written within the given period is reclaimed. The TTI is reset on every access, so very hot data may never expire, possibly leaving stale data around for a long time (expireAfterWrite is therefore recommended).

3.1.1.3 Cache reclamation Policy/Based on Java object references

weakKeys/weakValues: cache keys/values via weak references. softValues: cache values via soft references.

3.1.1.4 Cache Reclamation Policy/Active Invalidation

invalidate(Object key) / invalidateAll(Iterable<?> keys) / invalidateAll(): actively invalidates some or all cached data.

When is expiration actually triggered? Guava Cache does not clean up an entry the moment it expires (that would require a dedicated cleanup thread); instead it cleans up expired entries automatically during writes such as put. You can also call the cleanUp method from your own thread.

3.1.1.5 Concurrency Level

concurrencyLevel: Guava Cache's internal storage is modeled on ConcurrentHashMap; concurrencyLevel sets the number of segments.

3.1.1.6 Statistical hit ratio

recordStats: starts recording statistics such as the hit ratio.

3.1.2 EhCache 3.x implementation

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder().build(true);
CacheConfigurationBuilder<String, String> cacheConfig = CacheConfigurationBuilder.newCacheConfigurationBuilder(
        String.class, String.class,
        ResourcePoolsBuilder.newResourcePoolsBuilder().heap(100, EntryUnit.ENTRIES))
        .withDispatcherConcurrency(4)
        .withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS)));
Cache<String, String> myCache = cacheManager.createCache("myCache", cacheConfig);

Call cacheManager.close() when the JVM shuts down. You can read and write the cache with put and get. CacheConfigurationBuilder likewise takes several categories of parameters: eviction policy, concurrency settings, statistics, and so on.

3.1.2.1 Cache Reclamation Policy/Based on Capacity

heap(100, EntryUnit.ENTRIES): sets the number of cached entries; when exceeded, entries are evicted according to LRU.

3.1.2.2 Cache Reclamation Policy/Space-based

heap(100, MemoryUnit.MB): sets the cache's memory budget; when exceeded, entries are evicted according to LRU. In addition, set withSizeOfMaxObjectGraph(2), the maximum object-graph traversal depth used when sizing objects, and withSizeOfMaxObjectSize(1, MemoryUnit.KB), the maximum size of a single cacheable object.

3.1.2.3 Cache Reclamation Policy/Time-based

withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS))): sets the TTL only (no TTI).

withExpiry(Expirations.timeToIdleExpiration(Duration.of(10, TimeUnit.SECONDS))): sets both TTL and TTI, with the same value for each.

3.1.2.4 Cache Reclamation Policy/Active Invalidation

remove(K key) / removeAll(Set<? extends K> keys) / clear(): actively invalidates some or all cached data.

When is expiration actually triggered? Ehcache uses the same mechanism as Guava Cache.

3.1.2.5 Concurrency Level

EhCache uses ConcurrentHashMap as its internal cache storage. The default concurrency level is 16. WithDispatcherConcurrency is used to set the concurrency level of event distribution.

3.1.3 MapDB 3.x implementation

HTreeMap myCache = DBMaker.heapDB().concurrencyScale(16).make()
        .hashMap("myCache")
        .expireMaxSize(10000)
        .expireAfterCreate(10, TimeUnit.SECONDS)
        .expireAfterUpdate(10, TimeUnit.SECONDS)
        .expireAfterGet(10, TimeUnit.SECONDS)
        .create();

The cache can then be read and written with put and get. The builder takes several categories of parameters: eviction policy, concurrency settings, statistics such as hit ratio, and so on.

3.1.3.1 Cache Reclamation Policy/Based on Capacity

expireMaxSize: sets the cache capacity; when exceeded, entries are evicted according to LRU.

3.1.3.2 Cache Reclamation Policy/Time-based

  • expireAfterCreate/expireAfterUpdate: sets the TTL; cached data that has not been written (created or overwritten) within the given period is reclaimed, so cached data is refreshed periodically.

  • expireAfterGet: sets the TTI; cached data that has not been read or written within the given period is reclaimed. The TTI is reset on every access, so very hot data may never expire, possibly leaving stale data around for a long time (setting expireAfterCreate/expireAfterUpdate is therefore recommended).

3.1.3.3 Cache Reclamation Policy/Active Invalidation

  • remove(Object key) / clear(): invalidates some or all cached data. When is expiration actually triggered? By default MapDB uses a mechanism similar to Guava Cache's; however, it also supports periodic expiration on a thread pool via the following configuration:

  • expireExecutor(scheduledExecutorService)

  • expireExecutorPeriod(3000)

3.1.3.4 Concurrency level

concurrencyScale: similar to Guava Cache's concurrencyLevel setting.

You can also create a heap cache with DBMaker.memoryDB(), which serializes data and stores it in 1 MB byte[] chunks to reduce the impact of garbage collection.

3.2 Off-heap cache

3.2.1 EhCache 3.x implementation

CacheConfigurationBuilder<String, String> cacheConfig = CacheConfigurationBuilder.newCacheConfigurationBuilder(
        String.class, String.class,
        ResourcePoolsBuilder.newResourcePoolsBuilder().offheap(100, MemoryUnit.MB))
        .withDispatcherConcurrency(4)
        .withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS)))
        .withSizeOfMaxObjectGraph(3)
        .withSizeOfMaxObjectSize(1, MemoryUnit.KB);

Off-heap caches do not support a capacity-based (entry-count) expiration policy.

3.2.2 MapDB 3.x implementation

HTreeMap myCache = DBMaker.memoryDirectDB().concurrencyScale(16).make()
        .hashMap("myCache")
        .expireStoreSize(64 * 1024 * 1024) // set the off-heap cache size to 64 MB
        .expireMaxSize(10000)
        .expireAfterCreate(10, TimeUnit.SECONDS)
        .expireAfterUpdate(10, TimeUnit.SECONDS)
        .expireAfterGet(10, TimeUnit.SECONDS)
        .create();

When using the off-heap cache, remember to add a JVM startup parameter such as -XX:MaxDirectMemorySize=10G.

3.3 Disk Caching

3.3.1 EhCache 3.x implementation

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder()
        // default thread pool
        .using(PooledExecutionServiceConfigurationBuilder.newPooledExecutionServiceConfigurationBuilder()
                .defaultPool("default", 1, 10).build())
        // disk file storage location
        .with(new CacheManagerPersistenceConfiguration(new File("D:\\bak")))
        .build(true);
CacheConfigurationBuilder<String, String> cacheConfig = CacheConfigurationBuilder.newCacheConfigurationBuilder(
        String.class, String.class,
        ResourcePoolsBuilder.newResourcePoolsBuilder().disk(100, MemoryUnit.MB, true))
        .withDiskStoreThreadPool("default", 5) // use the "default" pool to dump data to disk
        .withExpiry(Expirations.timeToLiveExpiration(Duration.of(50, TimeUnit.SECONDS)))
        .withSizeOfMaxObjectGraph(3)
        .withSizeOfMaxObjectSize(1, MemoryUnit.KB);

When the JVM stops, remember to call cacheManager.close() to ensure that in-memory data is dumped to disk.

3.3.2 MapDB 3.x implementation

DB db = DBMaker.fileDB("D:\\bak\\a.data") // data storage file
        .fileMmapEnable()                  // enable mmap
        .fileMmapEnableIfSupported()       // enable mmap only on supported platforms
        .fileMmapPreclearDisable()         // make mmap files faster
        .cleanerHackEnable()               // work around some mmap bugs
        .transactionEnable()               // enable transactions
        .closeOnJvmShutdown()
        .concurrencyScale(16)
        .make();
HTreeMap myCache = db.hashMap("myCache")
        .expireMaxSize(10000)
        .expireAfterCreate(10, TimeUnit.SECONDS)
        .expireAfterUpdate(10, TimeUnit.SECONDS)
        .expireAfterGet(10, TimeUnit.SECONDS)
        .createOrOpen();

Because transactions are enabled, MapDB uses a WAL (write-ahead log). Remember to commit the transaction by calling db.commit() after cache operations.

myCache.put("key" + counterWriter, "value" + counterWriter);
db.commit();

3.4 Distributed Cache

3.4.1 Ehcache 3.1 + Terracotta Server

Not recommended.

3.4.2 Redis

Redis performance is very good, and it supports both master-slave and cluster modes.

3.5 Multi-level Cache

A lookup first checks the heap cache; on a miss it falls back to the disk cache. MapDB supports this via the following configuration.

HTreeMap diskCache = db.hashMap("diskCache")
        .expireStoreSize(8L * 1024 * 1024 * 1024) // 8 GB; the L suffix avoids int overflow
        .expireMaxSize(10000)
        .expireAfterCreate(10, TimeUnit.SECONDS)
        .expireAfterUpdate(10, TimeUnit.SECONDS)
        .expireAfterGet(10, TimeUnit.SECONDS)
        .createOrOpen();
HTreeMap heapCache = db.hashMap("heapCache") // each map needs a distinct name within a DB
        .expireMaxSize(100)
        .expireAfterCreate(10, TimeUnit.SECONDS)
        .expireAfterUpdate(10, TimeUnit.SECONDS)
        .expireAfterGet(10, TimeUnit.SECONDS)
        .expireOverflow(diskCache) // overflow to the disk cache on eviction
        .createOrOpen();

4. Cache usage patterns

There are two main categories: cache-aside and cache-as-SoR (read-through, write-through, write-behind).

  • SoR (system of record): the record system, i.e., the data source that actually stores the original data.

  • Cache: a snapshot of SoR data. The cache is faster to access than the SoR; it is used to speed up access and reduce the number of round trips to the SoR.

  • On a cache miss, data must be read from the SoR; this is called going back to the source.

4.1 Cache-Aside

Cache-aside means the business code is written around the cache: the business code maintains the cache directly. Example code is shown below.

4.1.1 Read scenario

Data is first fetched from the cache; on a miss, the code goes back to the SoR and puts the loaded data into the cache for the next read.

value = myCache.getIfPresent(key);
if (value == null) {
    // on a miss, go back to the SoR and cache the result
    value = loadFromSoR(key);
    myCache.put(key, value);
}

4.1.2 Write scenario

Data is first written to SoR, and then synchronized to the cache immediately after successful writing.

// 1. Write data to the SoR
writeToSoR(key, value);
// 2. Synchronize to the cache
myCache.put(key, value);

Alternatively, write the data to the SoR first and invalidate the cache entry after a successful write; the cache is reloaded on the next read.

// 1. Write data to the SoR
writeToSoR(key, value);
// 2. Invalidate the cache entry
myCache.invalidate(key);

Cache-aside is well suited to an AOP-style implementation.
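As an illustration of that idea (not tied to Spring or any particular framework), a JDK dynamic proxy can apply the cache-aside read pattern around every call of a service interface; the UserService interface and the (method name + arguments) cache key below are invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheAsideProxy {
    // Hypothetical service interface, standing in for a real SoR-backed service.
    public interface UserService {
        String getUser(String id);
    }

    // Wraps any interface so results are cached per (method, args) key:
    // check the cache first; on a miss, invoke the real service and store
    // the result for next time -- the cache-aside read pattern, kept out
    // of the business code.
    @SuppressWarnings("unchecked")
    public static <T> T cached(Class<T> iface, T target) {
        Map<String, Object> cache = new ConcurrentHashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            String key = method.getName() + Arrays.deepToString(args);
            Object value = cache.get(key);
            if (value == null) {                     // miss: go back to the SoR
                value = method.invoke(target, args);
                if (value != null) cache.put(key, value);
            }
            return value;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(),
                new Class<?>[]{iface}, handler);
    }
}
```

Frameworks such as Spring achieve the same separation declaratively (e.g. with caching annotations); the proxy above just shows the mechanics.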

4.2 Cache-As-SoR

Cache-as-SoR means treating the cache as the SoR: all operations are performed against the cache, and the cache delegates the actual reads and writes to the SoR. The business code sees only cache operations, never SoR-related code. There are three implementations: read-through, write-through, and write-behind.

4.2.1 Read-Through

Read-through: the business code calls the cache first; on a miss, the cache itself (not the business code) goes back to the SoR to load the data. To use read-through, you configure a cache loader component that loads data from the SoR. Both Guava Cache and Ehcache 3.x support this mode.

4.2.1.1 Implementation of Guava Cache

LoadingCache<Integer, Result<Category>> getCache = CacheBuilder.newBuilder()
        .softValues()
        .maximumSize(5000)
        .expireAfterWrite(2, TimeUnit.MINUTES)
        .build(new CacheLoader<Integer, Result<Category>>() {
            @Override
            public Result<Category> load(final Integer sortId) throws Exception {
                return categoryService.get(sortId);
            }
        });

To build the cache, pass in a CacheLoader that knows how to load the data.

  • The business code simply calls getCache.get(sortId).

  • The cache is queried first; if the entry is present, the cached data is returned directly.

  • On a miss, the cache delegates to the CacheLoader, which loads the data from the SoR (the return value must not be null, but a null result can be wrapped as a null object) and writes it to the cache.

There are several benefits to using CacheLoader:

  • The business code is cleaner: there is no need to mix cache-query code and SoR code as in cache-aside mode. Where cache logic would otherwise be scattered, this approach eliminates duplicate code.

  • It solves the dog-pile effect: when a cache entry expires and many identical requests arrive at once, they would otherwise all hit the backend and overload it. With a loader, only one request goes back to the source.
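The single-load behavior can be sketched with the JDK alone: ConcurrentHashMap.computeIfAbsent invokes the loader at most once per key even under concurrent access, so only one caller goes back to the SoR; loadFromSoR below is a stand-in for the real source.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SingleFlightCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger sorCalls = new AtomicInteger(); // counts trips to the SoR

    // Stand-in for the real data source (SoR).
    private String loadFromSoR(String key) {
        sorCalls.incrementAndGet();
        return "value-for-" + key;
    }

    // computeIfAbsent blocks concurrent callers for the same key and runs the
    // loader at most once, which is what prevents the dog-pile effect.
    public String get(String key) {
        return cache.computeIfAbsent(key, this::loadFromSoR);
    }
}
```

A production loader additionally needs expiry and error handling; this sketch shows only the request-collapsing aspect.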

4.2.1.2 Ehcache 3.x implementation

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder().build(true);
org.ehcache.Cache<String, String> myCache = cacheManager.createCache("myCache",
        CacheConfigurationBuilder.newCacheConfigurationBuilder(String.class, String.class,
                ResourcePoolsBuilder.newResourcePoolsBuilder().heap(100, MemoryUnit.MB))
                .withDispatcherConcurrency(4)
                .withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS)))
                .withLoaderWriter(new DefaultCacheLoaderWriter<String, String>() {
                    @Override
                    public String load(String key) throws Exception {
                        return readDB(key);
                    }

                    @Override
                    public Map<String, String> loadAll(Iterable<? extends String> keys)
                            throws BulkCacheLoadingException, Exception {
                        return null;
                    }
                }));

Ehcache 3.1 does not itself solve the dog-pile effect.

4.2.2 Write-Through

In write-through (direct write) mode, the business code calls the cache to write (add or modify) data, and the cache, not the business code, is responsible for writing both the cache and the SoR.

To use write-through, you configure a CacheWriter component that writes to the SoR. Guava Cache does not support this mode; Ehcache 3.x does.

Ehcache must be configured with a cache writer that knows how to write to the SoR. When the cache writes (adds/modifies) data, it first calls the CacheLoaderWriter to synchronously (immediately) write to the SoR, then updates the cache.

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder().build(true);
Cache<String, String> myCache = cacheManager.createCache("myCache",
        CacheConfigurationBuilder.newCacheConfigurationBuilder(String.class, String.class,
                ResourcePoolsBuilder.newResourcePoolsBuilder().heap(100, MemoryUnit.MB))
                .withDispatcherConcurrency(4)
                .withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS)))
                .withLoaderWriter(new DefaultCacheLoaderWriter<String, String>() {
                    @Override
                    public void write(String key, String value) throws Exception {
                        // write
                    }

                    @Override
                    public void writeAll(Iterable<? extends Map.Entry<? extends String, ? extends String>> entries)
                            throws BulkCacheWritingException, Exception {
                        for (Object entry : entries) {
                            // batch write
                        }
                    }

                    @Override
                    public void delete(String key) throws Exception {
                        // delete
                    }

                    @Override
                    public void deleteAll(Iterable<? extends String> keys)
                            throws BulkCacheWritingException, Exception {
                        for (Object key : keys) {
                            // batch delete
                        }
                    }
                }).build());

Ehcache 3.x again uses a CacheLoaderWriter: write(String key, String value), writeAll(Iterable entries), delete(String key), and deleteAll(Iterable keys) support single write, batch write, single delete, and batch delete respectively.

Here is how it works: when we call myCache.put("e", "123") or myCache.putAll(map), the cache immediately delegates the write to CacheLoaderWriter#write or #writeAll, which writes the SoR at once; after the SoR write succeeds, the cache is updated.

4.2.3 Write-Behind

Write-behind, also called write-back, differs from write-through in that the SoR write is asynchronous rather than synchronous. Because writes are asynchronous, batching, merging, delaying, and rate limiting become possible.

4.2.3.1 asynchronous write

Asynchronous writes can be implemented with Ehcache.

4.2.3.2 batch write

Batch writes can be implemented with Ehcache.
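The mechanics behind both variants can be sketched without any framework: puts update the cache immediately and enqueue the write, while a background thread drains the queue in batches to the SoR. All names below are illustrative, including WriteBehindCache and the sor map standing in for the real system of record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class WriteBehindCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final BlockingQueue<String[]> pendingWrites = new LinkedBlockingQueue<>();
    private final AtomicInteger submitted = new AtomicInteger();
    private final AtomicInteger completed = new AtomicInteger();
    final Map<String, String> sor = new ConcurrentHashMap<>(); // stand-in for the real SoR

    public WriteBehindCache() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    List<String[]> batch = new ArrayList<>();
                    batch.add(pendingWrites.take());  // block until at least one write arrives
                    pendingWrites.drainTo(batch, 99); // batch up to 100 writes per SoR trip
                    for (String[] kv : batch) {
                        sor.put(kv[0], kv[1]);        // asynchronous, batched write to the SoR
                    }
                    completed.addAndGet(batch.size());
                }
            } catch (InterruptedException ignored) {
                // writer shut down
            }
        });
        writer.setDaemon(true);
        writer.start();
    }

    // put updates the cache immediately and returns; the SoR write happens later.
    public void put(String key, String value) {
        cache.put(key, value);
        submitted.incrementAndGet();
        pendingWrites.offer(new String[]{key, value});
    }

    public String get(String key) {
        return cache.get(key);
    }

    // Block until every queued write has reached the SoR (useful for shutdown/tests).
    public void flush() {
        while (completed.get() < submitted.get()) {
            try { Thread.sleep(5); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

Real write-behind implementations add delay, merging of repeated writes to the same key, rate limiting, and failure handling on top of this basic queue-and-drain structure.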

4.2.4 Copy Pattern

There are two copy patterns: copy-on-read and copy-on-write. In both Guava Cache and Ehcache, the heap cache stores object references, so if callers take cached data and modify it, unpredictable problems can occur. Guava Cache does not support copying; Ehcache 3.x does.

public interface Copier<T> {
    T copyForRead(T obj);  // copy-on-read, e.g. myCache.get()
    T copyForWrite(T obj); // copy-on-write, e.g. myCache.put()
}

[1] The Core Technology of 100-Million-Traffic Website Architecture, Zhang Kaitao

Reference: http://blog.csdn.net/foreverling/article/details/78012205
