This is day four of my August Challenge. This post covers SmartDedupe and SmartCompression. The SmartDedupe and SmartCompression functions provide data reduction services for file systems and thin LUNs. They save space for customers and reduce the total cost of ownership (TCO) of the enterprise IT architecture. Next, let's take a look at deduplication and compression at……

Online Deduplication (SmartDedupe)

Online deduplication is implemented for file systems and thin LUNs. The deduplication granularity is the same as the grain of the file system or the minimum read/write unit of the thin LUN. When creating a file system or thin LUN, you can specify the grain size (from 4 KB to 64 KB), so deduplication can operate at different grain sizes.
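As a rough illustration of why the grain size matters, the sketch below splits a buffer into fixed-size grains and counts how many distinct grains remain. The helper names (`split_into_grains`, `unique_grain_count`) and the use of SHA-256 as a fingerprint are assumptions made for this example, not the storage system's actual implementation.

```python
import hashlib

KB = 1024


def split_into_grains(data: bytes, grain_size: int) -> list:
    """Split data into fixed-size grains; the last grain may be shorter."""
    if not 4 * KB <= grain_size <= 64 * KB:
        raise ValueError("grain size must be between 4 KB and 64 KB")
    return [data[i:i + grain_size] for i in range(0, len(data), grain_size)]


def unique_grain_count(data: bytes, grain_size: int) -> int:
    """Count distinct grains, using SHA-256 as a stand-in fingerprint."""
    grains = split_into_grains(data, grain_size)
    return len({hashlib.sha256(g).digest() for g in grains})


# Eight 4 KB chunks in the order A B A C A D A E (hypothetical sample data).
chunks = {name: name.encode() * (4 * KB) for name in "ABCDE"}
data = b"".join(chunks[name] for name in "ABACADAE")

for size in (4 * KB, 8 * KB):
    total = len(split_into_grains(data, size))
    unique = unique_grain_count(data, size)
    print(f"grain {size // KB} KB: {total} grains, {unique} unique")
```

With this sample, 4 KB grains expose the repeated chunk A as a duplicate, while 8 KB grains pair A with a different neighbour each time and find nothing to deduplicate.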

The process for deduplication is shown in the following figure.

  1. The storage system splits the newly written data into blocks of the configured size.
  2. The storage system compares each newly written data block with the existing data blocks through the fingerprint database. If the fingerprint is different, the new data block is written. If the fingerprint is the same:
     − When byte-by-byte comparison is disabled (the default), the storage system points the newly written data block directly to the storage location of the existing data block, without allocating new space.
     − When byte-by-byte comparison is enabled, the existing data is compared with the new data at the byte level. If the data is identical, the block is treated as a duplicate. If not, it is treated as a new data block and written.

For example, suppose the file system originally holds data blocks A and B, and the application server writes data blocks C and D. The fingerprint of data block C matches that of data block B, while the fingerprint of data block D matches neither A nor B. The deduplication results under the different deduplication policies are shown in the following figure.
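The A/B/C/D scenario can be replayed with a small sketch. Everything here is hypothetical: `DedupStore`, `replay`, and in particular `weak_fingerprint`, which is deliberately weak (byte sum modulo 256) so that C can share B's fingerprint while holding different bytes. A real fingerprint function would make such a collision extremely unlikely, and the source does not say which one the storage system uses.

```python
def weak_fingerprint(block: bytes) -> int:
    """Hypothetical, deliberately weak fingerprint: byte sum modulo 256."""
    return sum(block) % 256


class DedupStore:
    """Toy block store that deduplicates on write."""

    def __init__(self, byte_compare: bool = False):
        self.byte_compare = byte_compare
        self.blocks = []    # physical blocks actually stored
        self.index = {}     # fingerprint -> positions of stored blocks
        self.mapping = {}   # logical block name -> position in self.blocks

    def write(self, name: str, data: bytes) -> bool:
        """Write a block; return True if new space was allocated."""
        fp = weak_fingerprint(data)
        for pos in self.index.get(fp, []):
            # Fingerprint hit: trust it, or confirm byte by byte.
            if not self.byte_compare or self.blocks[pos] == data:
                self.mapping[name] = pos
                return False
        # No fingerprint hit, or byte comparison found a difference.
        self.blocks.append(data)
        pos = len(self.blocks) - 1
        self.index.setdefault(fp, []).append(pos)
        self.mapping[name] = pos
        return True


def replay(byte_compare: bool) -> DedupStore:
    """Write the original blocks A and B, then the new blocks C and D."""
    store = DedupStore(byte_compare)
    store.write("A", b"aaaa")
    store.write("B", b"abcd")
    store.write("C", b"dcba")   # same fingerprint as B, different bytes
    store.write("D", b"wxyz")   # fingerprint matches neither A nor B
    return store


for flag in (False, True):
    s = replay(flag)
    print(f"byte_compare={flag}: {len(s.blocks)} blocks stored, "
          f"C -> block {s.mapping['C']}")
```

In this sketch, with comparison disabled C is mapped onto B's block and only three blocks are stored. With comparison enabled, the byte-level check notices that C differs from B and stores it as a fourth block. If C were byte-identical to B, both policies would deduplicate it.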

Online Compression (SmartCompression)

The two common compression approaches in the industry are online compression and post-compression. The storage system implements online compression: newly written data is compressed before it is written to disk, which effectively saves user space. Online compression has the following advantages over post-compression (where compression is performed after data has been written to disk):

  − Smaller initial storage footprint, which reduces the customer's initial investment.
  − Fewer I/Os, which suits SSDs with a limited read/write life.
  − Snapshots are created after compression, which saves the most storage space.

The storage system compresses data to different degrees according to user-defined compression policies. It supports the following compression policies:

  − Fast policy: the default compression policy. It compresses quickly but saves less space than the Deep policy.
  − Deep policy: offers a noticeable improvement in space savings, but compression and decompression take longer.

The data compression process is shown in the figure.
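The speed versus space trade-off between the two policies can be approximated with Python's standard `zlib` module. This is only an analogy: the source does not name the algorithms behind Fast and Deep, so mapping them to zlib levels 1 and 9 in `POLICY_LEVELS` is an assumption, and `compress_inline` and `read_back` are hypothetical helper names.

```python
import time
import zlib

# Assumed mapping for illustration: low zlib level for Fast, high for Deep.
POLICY_LEVELS = {"fast": 1, "deep": 9}


def compress_inline(data: bytes, policy: str = "fast") -> bytes:
    """Compress data before it is 'written', as online compression does."""
    return zlib.compress(data, POLICY_LEVELS[policy])


def read_back(stored: bytes) -> bytes:
    """Decompress stored data on the read path."""
    return zlib.decompress(stored)


# Hypothetical sample payload with plenty of repetition.
sample = b"storage block payload " * 5000

for policy in POLICY_LEVELS:
    start = time.perf_counter()
    stored = compress_inline(sample, policy)
    elapsed = time.perf_counter() - start
    assert read_back(stored) == sample   # compression must be lossless
    print(f"{policy}: {len(sample)} -> {len(stored)} bytes "
          f"in {elapsed * 1000:.2f} ms")
```

Actual sizes and timings depend on the data and the machine, so treat the printed numbers as indicative only.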

Deduplication and Compression Can Be Combined

SmartDedupe and SmartCompression can be enabled at the same time. When both are enabled, data is deduplicated first and then compressed, so the two reduction effects stack and save more storage space.
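A minimal sketch of that ordering, deduplicate first and compress what remains, is shown below. The function names `reduce_data` and `restore_data`, the SHA-256 fingerprint, and the use of `zlib` are all assumptions made for illustration, not a description of how the storage system does it internally.

```python
import hashlib
import zlib


def reduce_data(data: bytes, grain_size: int = 4096):
    """Deduplicate fixed-size grains, then compress each unique grain."""
    store = {}    # fingerprint -> compressed unique grain
    layout = []   # ordered fingerprints describing the logical data
    for offset in range(0, len(data), grain_size):
        grain = data[offset:offset + grain_size]
        fp = hashlib.sha256(grain).digest()
        if fp not in store:                    # step 1: deduplicate
            store[fp] = zlib.compress(grain)   # step 2: compress the rest
        layout.append(fp)
    return store, layout


def restore_data(store, layout) -> bytes:
    """Rebuild the logical data from the layout and the unique grains."""
    return b"".join(zlib.decompress(store[fp]) for fp in layout)


# Hypothetical sample: two distinct 4 KB grains, repeated four times.
grain_x = b"x" * 4096
grain_y = bytes(range(256)) * 16
data = (grain_x + grain_y) * 4

store, layout = reduce_data(data)
stored_bytes = sum(len(blob) for blob in store.values())
print(f"{len(layout)} logical grains, {len(store)} unique grains, "
      f"{len(data)} -> {stored_bytes} bytes")
assert restore_data(store, layout) == data
```

Deduplicating before compressing means each unique grain is compressed only once, so no work is spent compressing data that would be discarded as a duplicate anyway.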

Thank you, thank you ~