Source of the problem

The enterprise currently operates and maintains two versions of HBase, corresponding to community versions 0.94 and 1.1. Since 2016 we have been promoting the new version: it has better functionality and performance than 0.94, and it is also the direction for the future, so some businesses need to upgrade from 0.94 to the new version.

Comparison of version upgrade schemes

There are two ways to upgrade: in-place upgrade and upgrade by migrating data. An in-place upgrade moves the original cluster directly from 0.94 to 1.1; the other way is to migrate the data from 0.94 to a new-version cluster.

  • In-place upgrade

An in-place upgrade from 0.94 to 1.1 requires upgrading both HDFS and HBase. However, because 0.94 is such an early version, a rolling upgrade is not possible when going from 0.94 to 0.96; in other words, the service cannot be read or written during the upgrade. So when is an in-place upgrade appropriate? In fact, before Double 11 we upgraded the recommendation-search cluster in place. Recommendation search is special: in addition to having active and standby clusters, the service can control data pushes and switch read/write traffic at any time. So our task was simply to upgrade one cluster after the traffic had been switched away, and then upgrade the other cluster once the first was done. An in-place upgrade is easiest when the business can stop reading and writing; in most cases, however, services cannot stop reading or writing for long.

  • Why data must be synchronized across versions in real time

At present, several businesses share one 0.94 cluster, and there is no active/standby pair. What's more, some of those businesses need to upgrade while others do not, so migrating the data is the only option. But migrating the data is not a simple copy: besides crossing versions, it involves moving the original data while new data keeps arriving in real time. Finding a cross-version migration scheme that covers both the existing data and the real-time increments is the key to the whole problem.

  • Starting from same-version data migration

For same-version migration we used Snapshot + Replication, and the business side only had to pick a time to switch its client. The roadmap: set up replication between the two clusters; at a chosen point, pause the synchronization, create a snapshot, migrate the data, and then restart Replication to drain the backlogged edits. A sketch of these steps follows.
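A minimal sketch of that roadmap, assuming a 0.94 release recent enough to support snapshots and hbase.replication enabled in the site config; the peer id '1', table 't1', column family 'cf', ZooKeeper quorum, and cluster addresses are all illustrative:

# source cluster, hbase shell: mark the family for replication, add the peer
# (on 0.94 the table may have to be disabled before the alter)
alter 't1', {NAME => 'cf', REPLICATION_SCOPE => 1}
add_peer '1', 'zk1,zk2,zk3:2181:/hbase'

# pause the sync; edits queue up on the source
disable_peer '1'
snapshot 't1', 't1_snapshot'

# command line: copy the snapshot to the destination cluster
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot t1_snapshot -copy-to hdfs://dest-cluster:8020/hbase -mappers 16

# destination cluster, hbase shell: materialize the table from the snapshot
clone_snapshot 't1_snapshot', 't1'

# source cluster: resume replication to drain the backlog
enable_peer '1'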

However, there is no way to make Replication work between the old and new versions. How do you solve this problem? Anyone who has worked with HBase knows the principle of Replication: the primary cluster ships its logs to the secondary cluster, and the secondary cluster parses the logs and replays the writes locally. So we only need a way to point those writes at the new version. Here is my solution (a crude one, admittedly):

Real-time cross-version data replication

Here we still need a backup cluster, but the backup cluster does not write the data to itself; it acts only as a bridge. We changed it so that, instead of applying the edits locally, it writes over HTTP to the new version's REST service, and REST writes directly into the new-version cluster.

The code involved here is also relatively simple, mainly involving one class – ReplicationSink:

When ReplicationSink is initialized, it also initializes the REST parameters:

    // Fields conf, cluster, client, LOG and metrics are assumed to exist on
    // the class; imports: org.apache.hadoop.hbase.rest.client.{Client, Cluster,
    // RemoteHTable}, org.apache.hadoop.hbase.util.Bytes,
    // org.apache.commons.lang.StringUtils.
    private void initRest() throws IOException {
        // expected format: host:port;host:port
        String hostsConf = conf.get("rest.host.list");
        if (StringUtils.isEmpty(hostsConf)) {
            throw new IOException("No new-version REST service is configured");
        }
        LOG.info("rest.host.list conf: " + hostsConf);
        String[] nodeArray = hostsConf.split(";");
        if (nodeArray == null || nodeArray.length == 0) {
            throw new IOException("rest.host.list configuration exception");
        }
        this.cluster = new Cluster();
        for (String node : nodeArray) {
            cluster.add(node);
        }
        this.client = new Client(cluster);
    }

In the batch method, the accumulated Puts and Deletes are forwarded to the new cluster through RemoteHTable instead of being written locally:

    try {
        RemoteHTable table = new RemoteHTable(client, Bytes.toString(tableName));
        List<Put> putList = new ArrayList<Put>();
        List<Delete> deleteList = new ArrayList<Delete>();
        for (List<Row> rows : allRows) {
            for (Row row : rows) {
                LOG.info(row.toString());
                if (row instanceof Put) {
                    putList.add((Put) row);
                } else if (row instanceof Delete) {
                    deleteList.add((Delete) row);
                }
            }
            // forward the batch over REST to the new-version cluster
            table.put(putList);
            table.delete(deleteList);
            this.metrics.appliedOpsRate.inc(rows.size());
            putList.clear();   // avoid re-applying edits on the next batch
            deleteList.clear();
        }
    } catch (IOException e) {
        LOG.error("Unable to apply edits to the new cluster via REST", e);
        throw e;
    }
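For completeness, a sketch of the hbase-site.xml entry this code reads; the host names are illustrative, and 8080 is the REST server's default port:

<property>
  <name>rest.host.list</name>
  <value>rest-host1:8080;rest-host2:8080</value>
</property>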


Original data migration

I have tried this: without changing code, there is no way to migrate a snapshot from the old version to the new version at the HBase level, so we have to work at the HDFS level instead. Since the data is written in real time, we keep the synchronization running, flush the table at a chosen point (see the sketch below), and then perform a "brute-force migration":
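Flushing can be done from the HBase shell on the old cluster; a one-line sketch using the table that appears below:

# persist the memstore to HFiles so the HDFS snapshot captures all data so far
flush 'HB_RT_WIRELESS_SAFE'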

Create snapshots on the HDFS layer


hdfs dfsadmin -allowSnapshot /hbase/HB_RT_WIRELESS_SAFE

hdfs dfs -createSnapshot /hbase/HB_RT_WIRELESS_SAFE  HB_RT_WIRELESS_SAFE_Snapshot


The snapshot location in the source cluster is


/hbase/HB_RT_WIRELESS_SAFE/.snapshot/HB_RT_WIRELESS_SAFE_Snapshot
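To confirm that the snapshot exists, it can be listed directly (same path as above):

hdfs dfs -ls /hbase/HB_RT_WIRELESS_SAFE/.snapshot/HB_RT_WIRELESS_SAFE_Snapshot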


Copy the data with DistCp

hadoop distcp -update -i -delete -skipcrccheck -m 50 -strategy dynamic \
  hftp://<old cluster>:50070/hbase/HB_RT_WIRELESS_SAFE/.snapshot/HB_RT_WIRELESS_SAFE_Snapshot \
  hdfs://<new cluster>:8020/hbase/data/default/HB_RT_WIRELESS_SAFE

Restore the table metadata. In the new version the table-descriptor file lives under .tabledesc, so the old-format .tableinfo file copied over from 0.94 has to be removed by hand first, and then hbck can regenerate the descriptor:


hdfs dfs -rm -r  /hbase/data/default/HB_RT_WIRELESS_SAFE/.tableinfo.0000000001

hbase hbck -fixTableOrphans  "HB_RT_WIRELESS_SAFE"


Repair data


hbase hbck -fixMeta "HB_RT_WIRELESS_SAFE"

hbase hbck -fix "HB_RT_WIRELESS_SAFE"

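At this point, running hbck once more without any -fix options should report no remaining inconsistencies for the table:

hbase hbck "HB_RT_WIRELESS_SAFE"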

Conclusion

With this approach, instead of coordinating a slow migration with every business, the DBA migrates the data, and each business verifies the data and switches its client at a time of its own choosing. Of course, there are other ways, such as Export and Import, which run MapReduce jobs; a sketch follows.
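A sketch of that alternative, plus a row-count check that can help with verification; the output path is illustrative:

# old cluster: dump the table to sequence files, then copy them across
hbase org.apache.hadoop.hbase.mapreduce.Export HB_RT_WIRELESS_SAFE /tmp/HB_RT_WIRELESS_SAFE_export

# new cluster: load the dump into a pre-created table with the same schema
hbase org.apache.hadoop.hbase.mapreduce.Import HB_RT_WIRELESS_SAFE /tmp/HB_RT_WIRELESS_SAFE_export

# either cluster: count rows to compare source and destination
hbase org.apache.hadoop.hbase.mapreduce.RowCounter HB_RT_WIRELESS_SAFE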

If you have a better way, or if this approach has gaps I have not considered, you are welcome to discuss.