This article describes how HBase implements multi-tenancy, from three angles.



Multi-tenancy, as defined by Wikipedia, concerns how a single system or application can be shared by multiple users while still keeping each user's data isolated. With the rise of cloud computing, multi-tenancy has become increasingly important for cloud services. HBase therefore provides multi-tenancy features that isolate resources among the users sharing one HBase cluster. This article covers three of them: Namespace and ACL, Quota, and RSGroup.


Namespace&ACL

In HBase, creating a namespace is a lightweight operation. Isolating tables of different services in different namespaces is the simplest method for resource isolation. In addition, common resource isolation methods, such as ACL, quota, and RSGroup, can be set on the namespace.

ACLs (Access Control Lists) restrict which operations different users may perform on different resources.

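To use ACLs, enable authorization in hbase-site.xml: set hbase.security.authorization to true, and add org.apache.hadoop.hbase.security.access.AccessController to hbase.coprocessor.master.classes, hbase.coprocessor.region.classes, and hbase.coprocessor.regionserver.classes.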

1. Concepts of ACLs



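Users are either ordinary users or superusers. Superusers are the user that started the HBase service plus the users listed in hbase.superuser, and they can administer the cluster. Ordinary users can access or operate on HBase only after being granted permissions. A scope is the granularity of the resource a permission applies to: global, namespace, table, column family, column qualifier, or cell. An action is what the permission allows: Read (R), Write (W), Execute (X), Create (C), or Admin (A).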

The action required by each HBase operation is listed in the official HBase documentation: http://hbase.apache.org/book.html#appendix_acl_matrix

The best way to control user permissions is to grant appropriate actions at an appropriate scope, based on what each user actually needs to access or do.

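2. Setting and revoking permissions

Permissions can be set or revoked in the HBase shell or through the HBase API. In the shell, the commands are grant and revoke, for example: grant 'user1', 'RW', 'table1' and revoke 'user1', 'table1'

To set a permission on a namespace, prefix the namespace name with @, for example: grant 'user1', 'RW', '@ns1'

Cell permissions are set with the same grant command, by passing the table, a map of users to permissions, and a scanner specification that selects the cells.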

3. Storage of permissions

Permissions are stored in the hbase:acl table, and the row key is derived from the scope.



Cell permissions are stored using HFile V3 tags.
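
To make the mapping from scope to row key concrete, here is a minimal Python sketch. It assumes the layout commonly described for hbase:acl (table permissions keyed by the table name, namespace permissions keyed by @ plus the namespace, global permissions keyed by hbase:acl, and the user name used as the column qualifier). The function names are hypothetical, and the layout should be checked against the HBase version in use.

```python
ACL_GLOBAL_ROW = "hbase:acl"   # assumed row key for global-scope permissions
NAMESPACE_PREFIX = "@"         # same prefix the shell uses for namespaces


def acl_row_key(scope, name=None):
    """Return the assumed hbase:acl row key for a permission scope."""
    if scope == "global":
        return ACL_GLOBAL_ROW
    if scope == "namespace":
        return NAMESPACE_PREFIX + name
    if scope == "table":
        return name
    raise ValueError("unknown scope: " + scope)


def acl_qualifier(user, family=None, qualifier=None):
    """Return the assumed column qualifier: user[,family[,qualifier]]."""
    parts = [user]
    if family is not None:
        parts.append(family)
        if qualifier is not None:
            parts.append(qualifier)
    return ",".join(parts)
```

With this layout, all permissions on one table or namespace live in a single row, so deleting the table or namespace only needs to delete that row.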

4. Permission checks

A permission check determines whether a user is allowed to perform an operation. It is done in AccessController, a coprocessor that implements MasterObserver, RegionServerObserver, RegionObserver, and other observer interfaces, and that checks permissions in the hooks of Master, RegionServer, and Region operations. Because each RS maintains a full PermissionCache, the check only has to verify that the PermissionCache contains the required permission; if it does not, an AccessDeniedException is thrown.
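
The check itself can be pictured with a small Python model. This is only a sketch under simplifying assumptions: the cache is a plain dictionary, only the global, namespace, and table scopes are modeled, and the class and function names mirror the HBase names for readability rather than reproduce the real Java API.

```python
class AccessDeniedException(Exception):
    """Raised when the cache holds no permission that covers the request."""


class PermissionCache:
    """Toy in-memory cache: (user, scope key) -> set of granted actions."""

    def __init__(self):
        self._grants = {}

    def grant(self, user, scope_key, actions):
        self._grants.setdefault((user, scope_key), set()).update(actions)

    def revoke(self, user, scope_key):
        self._grants.pop((user, scope_key), None)

    def allows(self, user, action, namespace, table):
        # A grant at a wider scope also covers the narrower scopes.
        for scope_key in ("global", "@" + namespace, namespace + ":" + table):
            if action in self._grants.get((user, scope_key), set()):
                return True
        return False


def check_permission(cache, user, action, namespace, table):
    """Mimic a coprocessor hook: return silently or raise."""
    if not cache.allows(user, action, namespace, table):
        raise AccessDeniedException(
            "%s lacks %s on %s:%s" % (user, action, namespace, table))
```

Because the lookup touches only local memory, the check adds no remote call to the request path, which is why every RS keeps a full copy of the permissions.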

5. How permissions are added and removed

The process for granting or revoking a permission is as follows:

(1) The client sends a grant or revoke request to the RegionServer that hosts the acl region.

(2) On receiving the request, the RegionServer puts the new permission into the acl table, or deletes the revoked permission from it.

(3) In the region's postPut and postDelete hooks, AccessController checks whether the operation was on the acl region. If it was, AccessController reads the updated permissions from the acl table and writes them to ZK.

(4) Through the ZK watch mechanism, the Master and the RegionServers are notified to update their PermissionCache, which keeps permissions in sync between the Master and all RegionServers.
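
The four steps can be simulated in a few lines of Python. The sketch below replaces ZooKeeper with a hypothetical FakeZk class that calls registered watchers synchronously, and represents the acl table as a nested dictionary; it illustrates the data flow only, not the real HBase or ZooKeeper APIs.

```python
class FakeZk:
    """Stand-in for ZooKeeper: stores node data and notifies watchers."""

    def __init__(self):
        self.nodes = {}
        self.watchers = []

    def watch(self, callback):
        self.watchers.append(callback)

    def set_data(self, path, data):
        self.nodes[path] = dict(data)
        for callback in self.watchers:
            callback(path, self.nodes[path])


class Server:
    """A Master or RegionServer with its own PermissionCache."""

    def __init__(self, name, zk):
        self.name = name
        self.permission_cache = {}
        zk.watch(self.on_acl_changed)  # step (4): listen for changes

    def on_acl_changed(self, path, data):
        self.permission_cache[path] = dict(data)


def apply_acl_change(acl_table, zk, row, user, actions=None):
    """Steps (2) and (3): mutate the acl table, then mirror the row to ZK."""
    permissions = acl_table.setdefault(row, {})
    if actions is None:
        permissions.pop(user, None)   # revoke: delete from the acl table
    else:
        permissions[user] = actions   # grant: put into the acl table
    zk.set_data(row, permissions)     # done in the postPut/postDelete hook
```

Every server that registered a watch ends up with the same cache contents after each change, which is the property the ZK-based scheme relies on.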

Procedure-based adding and removing of permissions

To synchronize permissions with a Procedure, grant and revoke requests must first be sent to the master (see HBASE-21739). Adding or removing a permission has two key steps: recording the permission in the acl table, and synchronizing the updated permission to all RegionServers. UpdatePermissionProcedure is designed to do this (see HBASE-22271, not yet merged into the master branch of the community edition). In the UpdatePermissionStorage phase it updates the acl table, ZK, and the master's PermissionCache; in the UpdatePermissionCacheOnRS phase it launches UpdatePermissionRemoteProcedure to update the PermissionCache on each RS.

UpdatePermissionProcedure has to handle five permission synchronization cases:

Grant: adds a permission

Revoke: removes a permission

Delete Namespace: removes all permissions on a namespace

Delete Table: removes all permissions on a table

Reload: reloads all permissions

In the new scheme, ZK is no longer used to notify the RS to update their PermissionCache; it is used only to store ACLs. When an RS or the Master starts, the acl table may not be online yet, in which case permissions are loaded from ZK. If the permissions in the acl table differ from those in ZK, the acl table prevails. Therefore, once the master has started and the acl table is online, the master launches an UpdatePermissionProcedure of type Reload, which updates the permissions in ZK and the PermissionCache on every RS.
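
The startup rule (load from ZK while the acl table is offline, then let the acl table win) can be sketched as follows. The acl table, the ZK copy, and the RS caches are all modeled as plain dictionaries, and both function names are hypothetical.

```python
def load_on_startup(acl_table, zk_acls, acl_table_online):
    """Pick the source a starting Master or RS loads permissions from."""
    source = acl_table if acl_table_online else zk_acls
    return {row: dict(perms) for row, perms in source.items()}


def reload(acl_table, zk_acls, rs_caches):
    """Reload-type update: the acl table overwrites ZK and every RS cache."""
    zk_acls.clear()
    zk_acls.update({row: dict(perms) for row, perms in acl_table.items()})
    for cache in rs_caches:
        cache.clear()
        cache.update({row: dict(perms) for row, perms in acl_table.items()})
```

A server that started from a stale ZK copy is therefore corrected as soon as the Reload runs.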

Quota&Throttle

Because the resources and service capacity of a cluster are finite, quotas limit how much data each resource may hold and how fast it may be accessed.

To enable quotas, set hbase.quota.enabled to true in hbase-site.xml.

[Figure: HBase Quota concepts and their relationships]

1. Throttle Quota

A throttle quota limits the number of requests, or the amount of data, that can be accessed within a given period of time.

  • The supported time units are sec, min, hour, and day.

  • Use req to limit the number of requests.

  • Use B, K, M, G, T, P to limit the amount of data requested.

  • Use CU to limit read/write capacity units. One capacity unit covers up to 1 KB of data read or written by a request, so a request that reads 2.5 KB of data consumes three capacity units. The amount of data per capacity unit can be configured with hbase.quota.read.capacity.unit or hbase.quota.write.capacity.unit.

  • The Machine scope means the throttle quota applies to a single RS. The Cluster scope means the throttle quota is shared by all RS in the cluster. If no QuotaScope is specified, the default is Machine.
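
The capacity unit arithmetic in the list above is a rounded-up division. A minimal sketch, assuming a default unit of 1024 bytes and assuming that even a tiny request costs at least one unit (the function name is hypothetical):

```python
import math


def capacity_units(request_bytes, unit_bytes=1024):
    """Number of capacity units a request of request_bytes consumes."""
    # Partial units round up; a request is assumed to cost at least one unit.
    return max(1, math.ceil(request_bytes / unit_bytes))
```

For the 2.5 KB read mentioned above, capacity_units(2560) gives three units.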

A throttle is configured in the shell with set_quota, for example: set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec'

A throttle can also be set on the RegionServer itself. The RS quota represents the upper bound of the service an RS can provide, and it is best specified per second.

A quota with Cluster scope is set by adding SCOPE => CLUSTER to the same command.

A Cluster-scope quota is divided among the RS as follows:

  • For a table quota, TableMachineLimit = ClusterLimit / TotalTableRegionNum * MachineTableRegionNum.

  • For a namespace quota, NamespaceMachineLimit = ClusterLimit / RsNum. Note that this does not take RSGroup into account: if the namespace is served by only some of the RS, the throttle limit assigned to each of them is too small. This calculation needs to be improved in the future.
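
The two formulas can be written directly as functions, which also makes the RSGroup shortfall easy to see. The function and parameter names below are illustrative only.

```python
def table_machine_limit(cluster_limit, total_table_regions, machine_table_regions):
    """Share of a table's cluster quota given to one RS, by region count."""
    return cluster_limit / total_table_regions * machine_table_regions


def namespace_machine_limit(cluster_limit, rs_num):
    """Share of a namespace's cluster quota given to one RS."""
    return cluster_limit / rs_num


# A table with 10 regions, 3 of them on this RS, and a cluster limit of 1000:
table_share = table_machine_limit(1000, 10, 3)

# A namespace limited to 1000 on a 20-RS cluster, but served by a 5-RS group:
per_rs = namespace_machine_limit(1000, 20)
effective_namespace_limit = per_rs * 5
```

In the second case each RS is given 50, so the five RS that actually serve the namespace allow 250 in total instead of the configured 1000.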

GlobalBypass is configured on a user and lets that user bypass all throttles.

2. Space Quota

A space quota limits the amount of data a resource may hold. It is configured on a namespace or a table. When the amount of data reaches the limit, the configured violation policy is enforced. The policies are:

Disable: disables the table, or all tables in the namespace

NoInserts: rejects all mutations except Delete; compactions are still allowed

NoWrites: rejects all mutations; compactions are still allowed

NoWritesCompactions: rejects all mutations and compactions
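
The four policies can be summarized as a lookup from policy to the operations that remain allowed. The sketch below models only puts, deletes, and compactions, and the operation names and function are hypothetical; reads are left out because the list above does not describe them.

```python
# Operations each violation policy still allows, per the list above.
ALLOWED_OPERATIONS = {
    "Disable": set(),                        # the table is disabled outright
    "NoInserts": {"delete", "compaction"},
    "NoWrites": {"compaction"},
    "NoWritesCompactions": set(),
}


def is_allowed(operation, table_size, limit, policy):
    """Return True if operation may run given the table's size and policy."""
    if table_size < limit:
        return True                          # quota not reached, no policy
    return operation in ALLOWED_OPERATIONS[policy]
```

Disable and NoWritesCompactions look identical in this model; the difference is that a disabled table cannot be read either.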

The current space quota snapshots can be viewed in the shell with list_quota_snapshots. These are not HBase snapshots; they show the current size of each table, the configured limit, and the state of any triggered policy.

The number of tables or regions in a namespace can be limited with the namespace properties hbase.namespace.quota.maxtables and hbase.namespace.quota.maxregions.

If the limit is exceeded, a QuotaExceededException is thrown.

Space Quota is implemented in three steps:

(1) Each RS periodically sends its region size information to the master: RegionSizeReportingChore

(2) The master computes the size of each table and stores any triggered policy in the quota table: QuotaObserverChore

(3) Each RS periodically reads the quota table and enforces the policy: SpaceQuotaRefresherChore

3. Soft limit

A throttle limit can be treated as a soft limit, so that requests may exceed it while the cluster still has spare resources. Exceeding is switched on or off in the shell with enable_exceed_throttle_quota and disable_exceed_throttle_quota.

Note that exceeding lets requests go beyond the configured user/namespace/table quota as long as the RS quota still has capacity left. The RS quota must therefore be set before exceeding can be enabled. It is recommended to set the RS quota per second: with a longer time unit, an RS quota that has been used up by other users' requests takes a long time to be restored, which may affect later requests even if those requests have not exceeded their own configured user/namespace/table quota.
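
The decision can be sketched as a small function. This is an assumption-laden model, not the HBase implementation: it treats the RS quota as a hard ceiling that every request must fit under, and it treats the user/namespace/table quota as soft only when exceeding is enabled. The function and parameter names are hypothetical.

```python
def admit(cost, own_available, rs_available, exceed_enabled):
    """Decide whether a request of the given cost may proceed.

    own_available: what is left of the user/namespace/table quota.
    rs_available:  what is left of the RegionServer quota.
    """
    if rs_available < cost:
        return False            # assumed: the RS quota is never exceeded
    if own_available >= cost:
        return True             # within the caller's own quota
    return exceed_enabled       # over own quota: allowed only as a soft limit
```

The first branch is why a long time unit on the RS quota is risky: once the RS quota is empty, every caller is rejected until it refills, including callers that are still within their own quota.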

4. Quota storage

Quota information is stored in the hbase:quota table.

Row keys have the following types:

n.namespace: the quota of a namespace

t.table: the quota of a table

u.user: the quota of a user

r.all: the quota of the RegionServer

exceedThrottleQuota: whether exceeding the throttle quota is allowed

Quotas related to Throttle are stored in the q column family, and quotas related to Space are stored in the u column family.
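
The row key scheme amounts to a one-letter prefix per quota type. A minimal sketch, assuming lowercase prefixes and the fixed keys r.all and exceedThrottleQuota (the helper name and the kind labels are hypothetical):

```python
ROW_KEY_PREFIXES = {"namespace": "n.", "table": "t.", "user": "u."}


def quota_row_key(kind, name=None):
    """Build the assumed hbase:quota row key for one quota entry."""
    if kind in ROW_KEY_PREFIXES:
        return ROW_KEY_PREFIXES[kind] + name
    if kind == "regionserver":
        return "r.all"                 # a single entry for all RS
    if kind == "exceed_throttle":
        return "exceedThrottleQuota"   # flag row, not tied to a resource
    raise ValueError("unknown quota kind: " + kind)
```

Because the prefixes differ, all quotas of one type sort together in the table, so a prefix scan is enough to load them.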

Whether throttling is enabled is stored in the ZK node /hbase/rpc-throttle, whose value is true or false. ZK is used here because switching throttling on or off takes effect in real time, whereas other quota settings take effect with a delay, since each RS only reads the quota table periodically.

5. Throttle

Setting a throttle takes two steps:

(1) The client sends a set quota request to the master, and the master stores the quota in the hbase:quota table.

(2) Every five minutes, each RS loads the latest quota values from the quota table and updates its QuotaCache. A newly set quota can therefore take up to five minutes to take effect (the interval is configured with hbase.quota.refresh.period).

[Figure: the throttling flow when a read/write request arrives at an RS]


The current community code estimates that a get or mutate will consume 100 bytes, and a scan will consume 1000 bytes. This should be optimized to dynamically adjust the estimated number of bytes based on the amount of data read since the last request.
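
One way to picture the role of these estimates is an estimate-then-correct check: the request is admitted against the estimate before any work is done, and the quota is adjusted once the real size is known. The sketch below is an illustration under that assumption; SimpleLimiter, ThrottlingException, and handle_request are hypothetical names, and only the 100-byte and 1000-byte estimates come from the text above.

```python
ESTIMATED_BYTES = {"get": 100, "mutate": 100, "scan": 1000}


class ThrottlingException(Exception):
    """Raised when the estimated cost does not fit in the remaining quota."""


class SimpleLimiter:
    def __init__(self, available_bytes):
        self.available_bytes = available_bytes

    def check(self, amount):
        if amount > self.available_bytes:
            raise ThrottlingException("quota exhausted")

    def consume(self, amount):
        self.available_bytes -= amount


def handle_request(limiter, operation, execute):
    """Admit on the estimate, run the operation, then correct the charge."""
    estimate = ESTIMATED_BYTES[operation]
    limiter.check(estimate)                   # reject before doing any work
    limiter.consume(estimate)
    actual_bytes = execute()                  # returns the real data size
    limiter.consume(actual_bytes - estimate)  # charge or refund the gap
    return actual_bytes
```

A fixed estimate that is far from the real size admits requests that should have waited, or rejects requests that would have fit, which is the motivation for adjusting the estimate dynamically.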

A throttle limit is defined per time unit and is refilled as time passes. There are two refill strategies:

(1) Average Interval Refill (default): the quota is refilled in proportion to the time elapsed since the last refill, but never beyond the limit set by the quota.

For example, with a limit of 100 resources per second, 10 resources are refilled after 100 ms. After 2 s, 100 resources are refilled, not 200.

(2) Fixed Interval Refill: the whole quota is refilled once a fixed interval has elapsed.

For example, if a quota of 100 resources per second was last refilled at 10:10:10,100, the next refill happens at 10:10:11,100. If the quota has been used up and is accessed at 10:10:11,099, it still has 0 resources.
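
The two strategies differ only in how much is handed back for a given elapsed time. A minimal sketch with times in milliseconds; the function names are hypothetical and each function returns the amount refilled, not the resulting balance.

```python
def average_interval_refill(limit, period_ms, last_refill_ms, now_ms):
    """Refill in proportion to elapsed time, capped at the limit."""
    elapsed_ms = now_ms - last_refill_ms
    return min(limit, limit * elapsed_ms // period_ms)


def fixed_interval_refill(limit, period_ms, last_refill_ms, now_ms):
    """Refill everything, but only once a full period has elapsed."""
    if now_ms - last_refill_ms >= period_ms:
        return limit
    return 0
```

With a limit of 100 per 1000 ms, the average strategy returns 10 after 100 ms and 100 after 2000 ms, while the fixed strategy returns 0 at 999 ms after the last refill and 100 at 1000 ms, matching the two examples above.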

Throttling is switched on or off in the shell with enable_rpc_throttle and disable_rpc_throttle.

If throttling is disabled, no rate limiting is applied even if quotas are enabled on the cluster.

RSGroup

RSGroup assigns RS to different groups and then assigns namespaces or tables to an RSGroup, which isolates them from one another. Each RSGroup can be thought of as a small cluster of its own.

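To use RSGroup, add org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint to hbase.coprocessor.master.classes and set hbase.master.loadbalancer.class to org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer in hbase-site.xml.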

When RSGroup is enabled, all RS are in the default group by default.

After creating a new group, you must first move the RS to the group before you can move the namespace or table to the group.

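A new RSGroup is added in the shell with add_rsgroup, for example: add_rsgroup 'my_group'

RS are then moved into the group with move_servers_rsgroup, and after that the namespace is moved into the group with move_namespaces_rsgroup.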

RSGroup is implemented mainly in RSGroupAdminEndpoint, an Endpoint that implements MasterObserver. In the hooks of master operations, it moves the regions of a table to the corresponding RSGroup.
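
The grouping rules described in this section (all RS start in the default group, and a group needs RS before a namespace can be moved to it) can be modeled in a few lines. RSGroupManager and its methods are hypothetical names for illustration; they are not the HBase RSGroup API.

```python
class RSGroupManager:
    """Toy model of RS groups and the namespaces assigned to them."""

    def __init__(self, servers):
        self.groups = {"default": set(servers)}  # all RS start in default
        self.namespace_group = {}

    def add_group(self, name):
        self.groups[name] = set()

    def move_servers(self, group, servers):
        for members in self.groups.values():
            members.difference_update(servers)   # an RS is in one group only
        self.groups[group].update(servers)

    def move_namespace(self, group, namespace):
        if not self.groups[group]:
            raise ValueError("move RS into the group first: " + group)
        self.namespace_group[namespace] = group

    def candidate_servers(self, namespace):
        """RS that may host regions of tables in this namespace."""
        group = self.namespace_group.get(namespace, "default")
        return self.groups[group]
```

Restricting region placement to candidate_servers is what makes each group behave like a small cluster of its own.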

RSGroup information is stored in the hbase:rsgroup table and is also kept in ZK. When the cluster starts and the rsgroup table is not yet online, the RSGroup information is read from ZK.

In summary, this article has described the multi-tenancy features of HBase. You are encouraged to use them in production and to send suggestions to the community so that these features can be improved further.

About the author

Meyi, the youngest beauty at Xiaomi, is an HBase Committer. The HBase ecosystem team that Meyi belongs to has a strong technical culture and has produced 9 HBase committers and 2 PMC members, as well as contributors to several other open source projects. Open source and HBase enthusiasts are welcome to join the team and grow together.


This article was first published on the public account “Miui Cloud Technology”.