How do I ensure that data is not lost during writing

When a write request arrives, the data must be put into its target format and written to disk; this process is called committing the data. For ES, it means building the inverted index and maintaining the segment files. If we handle this process synchronously, system throughput would be low. If we handle it asynchronously, writing to memory first and committing to disk later, data that has not yet reached disk can be lost when a machine fails.

To solve this problem, storage systems commonly introduce a transaction log (translog) or write-ahead log (WAL). Its job is to record the most recently written data or operations directly on disk, so that even after a system crash the data can be recovered from these on-disk logs.

MySQL has its redo and undo logs, while the LSM trees used by HBase, LevelDB, and RocksDB provide a write-ahead log to prevent data loss.

Why doesn't writing the translog straight to disk hurt write throughput?

As discussed above, synchronously writing data to disk can cause performance problems. Why, then, doesn't writing the translog or WAL straight to disk hurt performance? Here's why:

  • The log does not need to maintain any complex data structure; it only records business data that has not yet actually been committed, so it stays small
  • It is an append-only, sequential write to disk, which is fast

By default, ES fsyncs the translog for every request: the setting index.translog.durability defaults to request. For scenarios where losing a little data is acceptable, index.translog.durability can be set to async to improve write performance; with this configuration the translog is written to disk asynchronously, and how often it is fsynced is controlled by index.translog.sync_interval.
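
As a minimal sketch, assuming a node at http://localhost:9200 and a hypothetical index named my-index, these settings can be changed over the REST API with Python's requests library:

```python
import requests

ES = "http://localhost:9200"  # assumed local node
INDEX = "my-index"            # hypothetical index name

# Trade durability for throughput: fsync the translog asynchronously
# instead of on every request (the default, "request").
# Note: depending on the ES version, sync_interval may only be
# settable at index creation time.
resp = requests.put(
    f"{ES}/{INDEX}/_settings",
    json={
        "index.translog.durability": "async",
        "index.translog.sync_interval": "5s",
    },
)
resp.raise_for_status()
```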

To keep the translog small enough that it does not grow indefinitely, the corresponding business data must, after a certain point, be committed to disk in its final data structure (for ES, the inverted index). This action is called flush, and it is in fact a commit to the underlying Lucene index. The translog size at which a flush is triggered is specified by index.translog.flush_threshold_size. After each flush, the old translog is deleted and a new one is created.
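
The threshold itself is adjustable; a sketch using the same assumed node and index (the 1gb value is illustrative, the default is 512mb):

```python
import requests

# Trigger a flush (a Lucene commit plus translog truncation) once the
# translog grows past 1 GB instead of the default 512 MB.
requests.put(
    "http://localhost:9200/my-index/_settings",
    json={"index.translog.flush_threshold_size": "1gb"},
).raise_for_status()
```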

Elasticsearch also provides a flush API to trigger this commit manually, but under normal circumstances there is no need to do so.
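
If you do want to force one (for example before shutting a node down), a sketch against the same assumed node and index:

```python
import requests

# Force a commit of the index to disk and truncate its translog.
requests.post("http://localhost:9200/my-index/_flush").raise_for_status()
```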

How do I ensure that written data is not lost in the cluster

Within the cluster, a replica mechanism is used for each shard: every shard has one or more replica copies, which ensures that data written to a shard is not lost when a single node fails.
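
A sketch of how this might be configured, assuming the same local node; the index name, shard counts, and document are illustrative:

```python
import requests

ES = "http://localhost:9200"

# Three primary shards, each with one replica: every document
# is held on two different nodes.
requests.put(
    f"{ES}/my-index",
    json={"settings": {"number_of_shards": 3, "number_of_replicas": 1}},
).raise_for_status()

# Optionally require two active shard copies (primary + replica)
# to be available before an indexing operation proceeds.
requests.put(
    f"{ES}/my-index/_doc/1",
    params={"wait_for_active_shards": 2},
    json={"message": "hello"},
).raise_for_status()
```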

in-memory buffer

The translog described above only guarantees that data survives a crash; to keep appends fast, it does not maintain any complex data structure. The actual business data is first written to an in-memory buffer. When refresh is called, the buffer is cleared: its data is turned into a segment, the segment is reopened, and the data becomes visible to queries. But that segment still lives in memory, so if the system crashes the data would still be lost; this is why the translog is needed for recovery.
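
A sketch of both triggering a refresh explicitly and tuning how often ES refreshes on its own (the default index.refresh_interval is 1s; the 30s value is an assumption):

```python
import requests

ES = "http://localhost:9200"

# Make everything currently in the in-memory buffer searchable now.
requests.post(f"{ES}/my-index/_refresh").raise_for_status()

# Or refresh less often, batching more writes into each segment.
requests.put(
    f"{ES}/my-index/_settings",
    json={"index.refresh_interval": "30s"},
).raise_for_status()
```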

In fact, this is very similar to an LSM tree. Newly written business data first lands in an in-memory MemTable (ES's in-memory buffer), which holds the hot writes. Once enough data has accumulated and its data structure has been built, it is converted into an in-memory ImmutableMemTable (ES's in-memory segment), at which point it becomes queryable.
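
To make the analogy concrete, here is a toy, non-production sketch of this LSM-style write path; every name in it is invented for illustration:

```python
# Toy LSM-style write path: log first, then a mutable in-memory table
# that is frozen into an immutable (queryable) table once it fills up.
class ToyLSM:
    def __init__(self, wal_path, memtable_limit=4):
        self.wal = open(wal_path, "a")  # append-only log (ES: translog)
        self.memtable = {}              # hot writes (ES: in-memory buffer)
        self.immutables = []            # frozen tables (ES: in-memory segments)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # 1. Record the operation durably before touching in-memory state.
        self.wal.write(f"{key}\t{value}\n")
        self.wal.flush()
        # 2. Apply it to the mutable memtable.
        self.memtable[key] = value
        # 3. Freeze the memtable once it reaches its size limit
        #    (roughly what refresh does for the ES buffer).
        if len(self.memtable) >= self.memtable_limit:
            self.immutables.append(self.memtable)
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then frozen tables newest-first.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.immutables):
            if key in table:
                return table[key]
        return None
```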

Conclusion

  • Refresh converts data written to the in-memory buffer into a segment that is visible to queries

  • In addition to the in-memory buffer, each write also goes to the on-disk translog by default

  • When the translog reaches a certain size, the in-memory data is committed to disk in its final form and the translog is cleared; this action is called flush

  • If the shard being written to fails, its data can be recovered from the on-disk translog

The LSM tree itself is introduced in more detail in this article:

www.cnblogs.com/niceshot/p/…

