Basic Understanding:
HBase: HBase is a distributed, column-oriented, open source database. The technology derives from the Google paper “Bigtable: A Distributed Storage System for Structured Data” by Fay Chang et al. Just as Bigtable builds on the distributed data storage provided by the Google File System (GFS), HBase provides Bigtable-like capabilities on top of Hadoop. HBase began as a subproject of Apache Hadoop and is now a top-level Apache project. Unlike common relational databases, HBase is suited to storing unstructured and semi-structured data. Another difference is that HBase organizes data by column (grouped into column families) rather than by row.
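To make the column-oriented model described above more concrete, the following is a minimal in-memory sketch of the logical layout that the Bigtable paper describes: a sorted map from row key to column to timestamped values. The class name `TinyTable`, its methods, and the sample row keys are hypothetical and exist only for illustration; this is not the HBase client API, and it ignores distribution, persistence, and regions entirely.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of HBase's logical layout (hypothetical, not the real client API)."""

    def __init__(self):
        # row key -> "family:qualifier" -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        """Store one version of a cell; older versions are kept."""
        self._rows[row][column][ts] = value

    def get(self, row, column):
        """Return the newest version of a cell, or None if the cell is absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start=None, stop=None):
        """Yield (row key, {column: newest value}) in sorted row-key order."""
        for row in sorted(self._rows):
            if start is not None and row < start:
                continue
            if stop is not None and row >= stop:
                break
            yield row, {c: v[max(v)] for c, v in self._rows[row].items()}


table = TinyTable()
table.put("user#001", "info:name", "Ada", ts=1)
table.put("user#001", "info:name", "Ada L.", ts=2)   # newer version of the same cell
table.put("user#002", "info:email", "bob@example.com", ts=1)

latest_name = table.get("user#001", "info:name")      # newest version wins
missing = table.get("user#002", "info:name")          # sparse: the cell was never written
rows = list(table.scan())
```

Rows need not share the same columns, which is why this kind of store suits sparse or loosely structured data: a cell that was never written simply does not exist, rather than holding a NULL.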
Architecture Introduction:
HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system. It can be used to build large-scale structured storage clusters on low-cost PC servers.
Unlike commercial big data products such as FUJITSU Cliq, HBase is an open source implementation of Google Bigtable. Bigtable uses GFS as its file storage system, while HBase uses HDFS. Google uses MapReduce to process the massive data sets stored in Bigtable, while HBase uses Hadoop MapReduce for the same purpose. Bigtable uses Chubby as its coordination service, while HBase uses ZooKeeper. [1]
The figure above depicts the layers of the Hadoop ecosystem. HBase sits at the structured storage layer. Hadoop HDFS provides highly reliable underlying storage for HBase, Hadoop MapReduce provides high-performance computing capabilities, and ZooKeeper provides stable coordination services and a failover mechanism.
In addition, Pig and Hive provide high-level language support for HBase, which makes statistical processing of HBase data straightforward. Sqoop provides a convenient RDBMS import function, which makes it easy to migrate data from traditional relational databases to HBase.
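As a rough illustration of what migrating relational data into HBase involves, the sketch below maps each row of a relational table to HBase-style cells: one column is chosen as the row key, and every other column becomes a `family:qualifier` cell in a single column family. The function `import_rows`, the `users` table, and the `info` column family are hypothetical names chosen for this example. It uses an in-memory SQLite database and a plain dictionary in place of a real cluster, and it is not how Sqoop itself is implemented.

```python
import sqlite3


def import_rows(conn, sql_table, row_key_column, column_family):
    """Map relational rows to {row key: {"family:column": value}} (illustrative only)."""
    # sql_table is interpolated directly, so it must be a trusted identifier.
    cursor = conn.execute(f"SELECT * FROM {sql_table}")
    names = [d[0] for d in cursor.description]
    cells = {}
    for record in cursor:
        fields = dict(zip(names, record))
        row_key = str(fields.pop(row_key_column))
        cells[row_key] = {
            f"{column_family}:{name}": str(value)
            for name, value in fields.items()
            if value is not None  # assumption of this sketch: NULLs are not stored
        }
    return cells


# A small relational table standing in for the "traditional database".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ada", "ada@example.com"), (2, "Bob", None)],
)

hbase_like = import_rows(conn, "users", row_key_column="id", column_family="info")
```

The main design decision in such a migration is the choice of row key, because HBase keeps rows sorted by key and all lookups and scans go through it.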
Course: Cloud Database, HBase Edition
Syllabus:
Chapter 1: HBase Principles (6 periods)
Chapter 2: HBase Pseudo-distribution and Commands (4 periods)
Chapter 3: HBase Fully Distributed Construction (2 periods)
Chapter 4: HBase Code (8 periods)
Chapter 5: HBase Table Design (8 periods)
Chapter 6: HBase Protobuf (4 periods)
Chapter 7: HBase Optimization (3 periods)
Chapter 8: HBase MapReduce (4 periods)
Official website of Alibaba Cloud University (Innovative Talent Workshop under the cloud ecosystem)