HBase is a highly scalable, column-oriented database built on HDFS, suited to storing unstructured data. It inherits the fault tolerance and high throughput of HDFS, and it fits scenarios that need massive storage, high throughput, and real-time reads and writes.

NameSpace

Similar to a database in a traditional RDBMS such as MySQL: a logical grouping of tables.

Table

Data tables, which are named the same way as paths.

Row

Each row represents a data object, uniquely identified by a RowKey.

Column

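A column consists of a Column Family and a Column Qualifier, separated by a colon (:), for example Info:name (much like a property of an object). HBase storage can be abstracted as one large map (a Map in Java): each RowKey maps to its column families, and the value of each key is itself another map, from column qualifiers to values.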
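The nested-map view above can be sketched in plain Java. This is only an in-memory illustration under simplifying assumptions, not how HBase stores data: the class name HBaseTableSketch and the homeInfo:city column are hypothetical, row keys are compared as Strings rather than raw bytes, and cell versions (timestamps) are left out.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of the logical layout: rowKey -> column family -> qualifier -> value. */
class HBaseTableSketch {
    // Rows are kept sorted by RowKey; String ordering is a simplification of byte ordering.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<String, String>>> rows =
            new TreeMap<>();

    /** Splits "family:qualifier" into its two parts. */
    private static String[] parse(String column) {
        String[] parts = column.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected family:qualifier, got " + column);
        }
        return parts;
    }

    /** Stores a value, loosely mirroring: put 'Student','rowKey1','Info:name','Bob' */
    void put(String rowKey, String column, String value) {
        String[] parts = parse(column);
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(parts[0], k -> new TreeMap<>())
            .put(parts[1], value);
    }

    /** Returns the stored value, or null if the row, family, or qualifier is absent. */
    String get(String rowKey, String column) {
        String[] parts = parse(column);
        Map<String, NavigableMap<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> qualifiers = families.get(parts[0]);
        return qualifiers == null ? null : qualifiers.get(parts[1]);
    }

    public static void main(String[] args) {
        HBaseTableSketch student = new HBaseTableSketch();
        student.put("rowKey1", "Info:name", "Bob");
        student.put("rowKey1", "homeInfo:city", "Paris"); // hypothetical column and value
        System.out.println(student.get("rowKey1", "Info:name")); // prints Bob
    }
}
```

Using sorted maps at every level keeps rows, families, and qualifiers in a stable order, which is roughly what makes range scans over row keys natural in this model.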

Three HBase modules

HMaster

The master node of an HBase cluster. It assigns Regions (the basic unit of HBase and its minimum storage unit) to RegionServers, balances load across RegionServers, and maintains the cluster state. It maintains the metadata of tables and Regions but does not take part in data input/output. A cluster can have multiple HMasters, but only one can be active at a time.

RegionServer

Maintains the Regions assigned to it by the HMaster and handles the requests for those Regions. While running, it also splits any Region that has grown too large.
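The splitting idea can be illustrated with a toy model. The sketch below is an assumption-laden simplification: the class RegionSplitSketch and the MAX_ROWS threshold are hypothetical, and a Region is reduced to a sorted set of row keys split by row count, whereas a real RegionServer decides based on the size of the stored data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model: a Region is a sorted set of RowKeys that splits in two when it grows too large. */
class RegionSplitSketch {
    /** Hypothetical threshold, counted in rows purely for illustration. */
    static final int MAX_ROWS = 4;

    /** Returns the region unchanged if it is small enough, otherwise two halves. */
    static List<TreeSet<String>> splitIfTooLarge(TreeSet<String> region) {
        List<TreeSet<String>> result = new ArrayList<>();
        if (region.size() <= MAX_ROWS) {
            result.add(region);
            return result;
        }
        // Use the middle key as the split point so both halves are about equal in size.
        List<String> keys = new ArrayList<>(region);
        String splitKey = keys.get(keys.size() / 2);
        result.add(new TreeSet<>(region.headSet(splitKey, false))); // keys before splitKey
        result.add(new TreeSet<>(region.tailSet(splitKey, true)));  // splitKey and later keys
        return result;
    }

    public static void main(String[] args) {
        TreeSet<String> region = new TreeSet<>(List.of("a", "b", "c", "d", "e", "f"));
        System.out.println(splitIfTooLarge(region)); // prints [[a, b, c], [d, e, f]]
    }
}
```

Because the keys are sorted, each half covers a contiguous key range, so a request for a given RowKey can still be routed to exactly one Region after the split.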

ZooKeeper

Coordinates the cluster and ensures that there is always at least one HMaster in it. It also provides the status of the RegionServers.
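The "several HMasters, only one active" rule can be pictured as a race for a single slot. The following is an in-memory sketch only: the class ActiveMasterSketch, its method names, and the master ids are hypothetical, and the single field merely stands in for the entry that ZooKeeper would keep for the active master.

```java
/** Toy model of electing one active master among several candidates. */
class ActiveMasterSketch {
    // Stands in for the single "active master" entry held by the coordinator.
    private String active;

    /** A master tries to become active; only the first claim succeeds. */
    synchronized boolean tryBecomeActive(String masterId) {
        if (active == null) {
            active = masterId;
            return true;
        }
        return false;
    }

    /** Simulates the active master going away, which frees the slot for a standby. */
    synchronized void sessionExpired(String masterId) {
        if (masterId.equals(active)) {
            active = null;
        }
    }

    /** Returns the id of the current active master, or null if there is none. */
    synchronized String active() {
        return active;
    }

    public static void main(String[] args) {
        ActiveMasterSketch cluster = new ActiveMasterSketch();
        System.out.println(cluster.tryBecomeActive("hmaster-1")); // prints true
        System.out.println(cluster.tryBecomeActive("hmaster-2")); // prints false
        cluster.sessionExpired("hmaster-1");                      // the active master fails
        System.out.println(cluster.tryBecomeActive("hmaster-2")); // prints true
        System.out.println(cluster.active());                     // prints hmaster-2
    }
}
```

The standby master does nothing until the slot is freed, which is the behaviour the text describes: multiple HMasters may exist, but only one is active at a time.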

Shell commands

    hbase shell  # open the HBase shell
    help 'status'  # get help; important, you can look up every command this way
    status  # check server status
    version  # check version information
    list  # list all tables
    desc
    create 'Student','homeInfo','Info','...
    Select * from Student where id = 'homeInfo'; select * from Student where id = 'Info';
    alter 'Student',{NAME => 'teacherInfo',METHOD =>
    put 'Student','rowKey1','Info:name','Bob'
    delete 'Student','rowKey3','Info:name'  # delete a column
    get 'Student','rowKey1','Info'
    get 'Student','rowKey1','Info:name'
    scan 'Student', {COLUMN=>'Info'}  # query data for the specified column family