HBase architecture

The relationship between Region, Store, and ColumnFamily

Logical layer: An HRegion consists of one or more Stores

Table                  (HBase table)
  Region               (Regions for the table)
    Store              (Store per ColumnFamily for each Region for the table)
      MemStore         (MemStore for each Store for each Region for the table)
      StoreFile        (StoreFiles for each Store for each Region for the table)
        Block          (Blocks within a StoreFile within a Store for each Region for the table)

Physical layer: each Store holds the data of one ColumnFamily

First, write operations

1. A client write is first saved to the MemStore; when the MemStore fills up, it is flushed into a StoreFile.

2. When the number of StoreFiles grows past a certain threshold, a Compact (compaction) is triggered: multiple StoreFiles are merged into one, and version merging and data deletion are performed at the same time.

3. Compactions gradually produce larger and larger StoreFiles. When a single StoreFile exceeds a certain threshold, a Split is triggered: the current Region is split into two, the original Region goes offline, and the HMaster assigns the two newly split child Regions to HRegionServers (for load balancing). In this way, the load of the original Region is spread across two Regions.

HBase only ever appends data; updates and deletes are applied during the Compact phase. As a result, a write returns as soon as it reaches memory, which guarantees I/O performance.
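As a minimal client-side sketch of this write path (using the classic HTable API that the rest of this article assumes; the table name "t1", column family "cf", and values are hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "t1");      // hypothetical table "t1"
        Put put = new Put(Bytes.toBytes("row1"));   // rowkey
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q1"), Bytes.toBytes("v1"));
        // put() returns once the edit has reached the server's WAL and MemStore;
        // flushing, compaction and splitting all happen later on the server side
        table.put(put);
        table.close();
    }
}
```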

Second, read operations

Client -> ZooKeeper -> -ROOT- -> .META. -> user data table: ZooKeeper records the location of the -ROOT- table (-ROOT- has only one region); -ROOT- records the region information of .META. (.META. may have multiple regions); and .META. records the region information of the user tables.
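This three-level lookup is performed (and cached) by the client library itself; application code never touches -ROOT- or .META. directly. A read is simply a Get, sketched here against the same hypothetical table as in the write sketch above:

```java
// Assumes the HTable handle "table" from the write sketch
Get get = new Get(Bytes.toBytes("row1"));
get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q1"));
Result result = table.get(get);
byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q1"));
```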

In HBase, all storage files are divided into small storage blocks that are loaded into memory during GET or SCAN operations, similar to pages, the storage unit of an RDBMS. The block size defaults to 64 KB and is set per column family via void setBlocksize(int s); (note: the HFile block size in HBase defaults to 64 KB, while the HDFS block size is 64 MB).

Once HBase has read a data block into the in-memory cache, subsequent reads of that block are served from memory instead of disk, effectively reducing the number of disk I/Os.

void setBlockCacheEnabled(boolean blockCacheEnable); this parameter defaults to true, meaning every block that is read gets cached in memory.

However, if the user reads a particular column family sequentially in a single pass, it is better to set this property to false so that the scanned blocks are not cached.

The reason is that the block cache only pays off when we access the same or adjacent data repeatedly. If we leave it enabled for a one-pass sequential scan, blocks we will never read again get loaded into memory, evicting more useful data and adding to our burden.
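Both of these settings live on the column-family descriptor rather than on individual reads. A sketch with the classic HColumnDescriptor API (values are illustrative; "cf" is the hypothetical family from above):

```java
import org.apache.hadoop.hbase.HColumnDescriptor;

HColumnDescriptor cf = new HColumnDescriptor("cf");
// HFile block size: 64 KB by default; smaller blocks favor random Gets,
// larger blocks favor sequential scans
cf.setBlocksize(64 * 1024);
// Disable the block cache for a family that is only ever scanned sequentially
cf.setBlockCacheEnabled(false);
```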

Third, optimization

1. Disable auto-flush. When we have a large amount of data to insert, each Put instance is sent to the region server one by one if auto-flush stays enabled. If the user disables auto-flush, Put operations are buffered and sent to the region server only when the write buffer is full. (A combined code sketch for tips 1-7 follows this list.)

2. Use the scan cache. If HBase serves as the input source for a MapReduce job, use the setCaching() method to set the cache on the scanner instance used as the job input to something larger than the default of 1. With the default, the map task makes a request to the region server for every single record it processes; a value of 500, for example, ships 500 rows to the client per request. The right value depends on your situation. This caching is row-level.

3. Limit the scan scope. When we are dealing with a large number of rows (especially as an input source for MapReduce), we can restrict the scan with scan.addFamily(). If we only need a few columns within the column family, we must specify them precisely, because fetching too many unneeded columns costs efficiency.

4. Close the ResultScanner. Closing it does not in itself improve our efficiency, but leaving it open will hurt performance.

5. Control the block cache with scan.setCacheBlocks(). We should enable block caching for frequently accessed rows, but a MapReduce job scans a huge number of rows exactly once, so there we should turn it off.

6. Optimize how we fetch rowkeys. If we only need the rowkeys of a table, we can attach a filter combination (e.g., FirstKeyOnlyFilter plus KeyOnlyFilter) so the scan returns keys without values.

7. Turn off WAL on Puts. That's what the book says, but I don't think it's a good idea to use it: when we disable it, the server does not write the Put to the WAL but straight into the MemStore, so if the server fails, we lose our data.

Compression: HBase supports many algorithms, and compression is applied at the column-family level. Unless there is a specific reason not to use compression, it usually results in better performance. After some testing, we recommend SNAPPY for our HBase compression.
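A sketch that pulls tips 1-7 together (it assumes the HTable handle "table" and family "cf" from the earlier sketches; the filter pair in tip 6 is one common way to get keys without values, and tip 7 uses the classic Put.setWriteToWAL(), which carries the data-loss risk just described):

```java
// Filter classes live in org.apache.hadoop.hbase.filter

// Tip 1: disable auto-flush so Puts are buffered client-side
table.setAutoFlush(false);

// Tips 2, 3 and 5: a scanner tuned for a MapReduce-style full scan
Scan scan = new Scan();
scan.setCaching(500);                          // tip 2: 500 rows per RPC instead of 1
scan.addColumn(Bytes.toBytes("cf"),            // tip 3: fetch only the columns we need
               Bytes.toBytes("q1"));
scan.setCacheBlocks(false);                    // tip 5: a one-pass scan shouldn't churn the cache

// Tip 4: always close the scanner
ResultScanner scanner = table.getScanner(scan);
try {
    for (Result r : scanner) {
        // process r ...
    }
} finally {
    scanner.close();
}

// Tip 6: when only rowkeys are needed, return keys without values
Scan keysOnly = new Scan();
keysOnly.setFilter(new FilterList(new FirstKeyOnlyFilter(), new KeyOnlyFilter()));

// Tip 7 (risky, see the warning above): skip the WAL on a Put
Put put = new Put(Bytes.toBytes("row1"));
put.setWriteToWAL(false);                      // lost if the region server crashes

// Compression is set per column family; SNAPPY as recommended above
HColumnDescriptor cf = new HColumnDescriptor("cf");
cf.setCompressionType(Compression.Algorithm.SNAPPY);
```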

Fourth, the function of HLog

In a distributed system environment, system errors and crashes cannot be avoided. If an HRegionServer exits unexpectedly, the in-memory data in its MemStores is lost. HLog was introduced to prevent exactly this.

Working mechanism: each HRegionServer holds one HLog object. HLog is a class that implements a write-ahead log: every time a user write goes into a MemStore, a copy of the data is also written to the HLog file, and the HLog periodically rolls over and deletes old files (whose data has already been persisted to StoreFiles). When an HRegionServer stops unexpectedly, the HMaster detects the leftover HLog files through ZooKeeper, splits the log entries of different regions apart, and places them in the corresponding region directories; it then reassigns the invalid regions (now paired with their split logs). While loading one of these regions, its new HRegionServer discovers that there are historical HLogs to process, replays the HLog data into the MemStore, and then flushes to StoreFiles, completing the data recovery.

Fifth, HBase storage architecture and thoughts on Rowkey design

A Region is made up of StoreFiles, StoreFiles are made up of HFiles, and an HFile is made up of HBase data blocks. A data block contains many KeyValue pairs, and each KeyValue carries everything we need.

From the figure above, we can see that the table has two column families (one red, one yellow), and each column family has two columns.

As can be seen from the figure, this is the defining characteristic of a column-oriented database: data of the same column family is stored together, and multiple versions of a value, if present, sit next to each other. Finally, we see the components of each entry: r1 (rowkey), cf1 (column family), c1 (qualifier), t1 (versionId), and the value itself (the last figure shows where the value can be stored).

In the penultimate diagram, the efficiency of filtering on a field drops significantly from left to right. So, when designing a KeyValue, users can consider moving important filtering information leftward to an appropriate position; this can improve query performance without changing the amount of data stored. Put simply, users should try to store their query dimensions in the rowkey, because that is the most efficient place to filter data.

With the above understanding in place, we should also come to the following realization:

HBase stores data sorted by Rowkey and partitions it into Regions by Rowkey range. If Rowkeys are written in sequence, the writes all land in one specific Region, and since a Region can only be managed by one RegionServer, that server becomes a hotspot and cluster performance deteriorates. The solution is Rowkey design.

  • Rowkey hashing (salting): say we have nine servers; take the current time mod 9 (or a similar hash) and prepend the result to the Rowkey as a prefix, so that the data is distributed evenly across the region servers. The benefit is that users can then read the data in parallel with multiple threads, improving query throughput.
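A sketch of the salting idea for the nine-server example (the bucket count, separator, and helper name are arbitrary choices for illustration):

```java
// Derive a salt from the key itself so a reader can recompute which
// bucket a key lives in; a time-based salt as in the text works the same way.
static String saltRowkey(String originalKey) {
    int bucket = (originalKey.hashCode() & Integer.MAX_VALUE) % 9;  // 9 servers
    return bucket + "_" + originalKey;
}

// Writing: Put put = new Put(Bytes.toBytes(saltRowkey("user123")));
// Reading a key range back requires querying all 9 buckets, which is why
// the text mentions multi-threaded parallel reads.
```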

Write cache

Small writes: each PUT operation is in reality an RPC that transfers the client's data to the server and back.

Bulk writes: if an application needs to store thousands of rows per second into HBase tables, one RPC per PUT is not appropriate. For this case, the HBase API provides a client-side write buffer.

  • The buffer collects Put operations and then sends the whole list of Puts to the server in a single RPC call. By default, the client write buffer is disabled; setting auto-flush to false activates it.
// Setting auto-flush to false activates the client-side write buffer
table.setAutoFlush(false);
// This method forces the buffered data to be written to the server
void flushCommits() throws IOException;
// The user can also configure the size of the client write buffer; the default is 2 MB
void setWriteBufferSize(long writeBufferSize) throws IOException;
  • The default write buffer size of 2 MB is a moderate value. The average user does not insert large amounts of data, but if you do, consider increasing this value: it lets the client group more data into each single RPC request. (A usage sketch follows the configuration snippet below.)

  • It is also tiresome to set the write buffer on every HTable. To avoid this, you can set a larger default value in hbase-site.xml.

<property>
	<name>hbase.client.write.buffer</name>
	<value>20971520</value>
</property>
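Putting the write-buffer pieces together, a minimal bulk-insert sketch with the classic HTable API (it reuses the hypothetical "table" handle and family "cf" from the earlier sketches; the loop bound and values are illustrative, and 20971520 bytes is the 20 MB value configured above):

```java
table.setAutoFlush(false);                    // enable the client write buffer
table.setWriteBufferSize(20 * 1024 * 1024);   // 20 MB, matching hbase.client.write.buffer

for (int i = 0; i < 100000; i++) {
    Put put = new Put(Bytes.toBytes("row" + i));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q1"), Bytes.toBytes("v" + i));
    table.put(put);                           // buffered; an RPC fires only when the buffer fills
}
table.flushCommits();                         // push whatever is still buffered
table.close();                                // close() flushes implicitly as well
```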
