After installing Hadoop, Hive, MySQL, and the Hive command line interface by following an online tutorial, I wanted to connect to Hive over JDBC and run some simple queries. That setup turned out not to be enough, and further configuration was needed. These are my notes.

HiveServer2

HiveServer2 (HS2) is a server interface that lets remote clients run Hive queries and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. Once the HiveServer2 service is started, clients can connect over JDBC, ODBC, or Thrift. Java code and Beeline connect over JDBC; Hue connects to the Hive service over Thrift.

When connecting to Hive over JDBC, HiveServer2 checks the client according to its authentication mode. The mode is configured in hive-site.xml ($HIVE_HOME/conf/hive-site.xml):

<!-- Can be set to NONE or CUSTOM here: NONE requires no authentication, CUSTOM authenticates with a user name and password -->
<property>
    <name>hive.server2.authentication</name>
    <value>NONE</value> <!-- or CUSTOM -->
</property>

Further configuration is required when setting to CUSTOM:

1. A custom authentication class must implement the org.apache.hive.service.auth.PasswdAuthenticationProvider interface. Here the class lives in the package org.apache.hadoop.hive.contrib.auth. Package it into a jar and place the jar in $HIVE_HOME/lib.

Maven dependencies needed to build the class:

<!-- https://mvnrepository.com/artifact/org.apache.hive/hive-service -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-service</artifactId>
    <version>2.3.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.1.2</version>
</dependency>

Implementation class code:

package org.apache.hadoop.hive.contrib.auth;

import javax.security.sasl.AuthenticationException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;
import org.slf4j.Logger;

/**
 * @Author:
 * @Date: 2019-7-30 9:56
 */
public class CustomPasswdAuthenticator implements org.apache.hive.service.auth.PasswdAuthenticationProvider{

    private Logger LOG = org.slf4j.LoggerFactory.getLogger(CustomPasswdAuthenticator.class);

    private static final String HIVE_JDBC_PASSWD_AUTH_PREFIX="hive.jdbc_passwd.auth.%s";

    private Configuration conf=null;

    @Override
    public void Authenticate(String userName, String passwd)
            throws AuthenticationException {
        LOG.info("user: "+userName+" try login.");
        String passwdConf = getConf().get(String.format(HIVE_JDBC_PASSWD_AUTH_PREFIX, userName));
        if(passwdConf==null){
            String message = "user's ACL configuration is not found. user:"+userName;
            LOG.info(message);
            throw new AuthenticationException(message);
        }
        if(!passwd.equals(passwdConf)){
            String message = "user name and password is mismatch. user:"+userName;
            throw new AuthenticationException(message);
        }
    }

    public Configuration getConf() {
        if(conf==null) {
            this.conf=new Configuration(new HiveConf());
        }
        return conf;
    }

    public void setConf(Configuration conf) {
        this.conf=conf;
    }
}
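The lookup this class performs can be tried without a running Hive by swapping the Hadoop Configuration for a plain map. This is only a sketch: PasswdCheckSketch and its map are hypothetical stand-ins, not part of Hive. The key format is the one the class above uses, and the user name and password are sample values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the authenticator above: a Map replaces the
// Hadoop Configuration so the lookup logic can be run without Hive.
class PasswdCheckSketch {

    // Same key format as HIVE_JDBC_PASSWD_AUTH_PREFIX in the class above.
    static final String KEY_FORMAT = "hive.jdbc_passwd.auth.%s";

    private final Map<String, String> conf;

    PasswdCheckSketch(Map<String, String> conf) {
        this.conf = conf;
    }

    // Returns true only if a password is configured for the user and it matches.
    boolean isAllowed(String userName, String passwd) {
        String expected = conf.get(String.format(KEY_FORMAT, userName));
        return expected != null && expected.equals(passwd);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.jdbc_passwd.auth.root1", "123456789"); // sample entry
        PasswdCheckSketch check = new PasswdCheckSketch(conf);
        System.out.println(check.isAllowed("root1", "123456789"));
        System.out.println(check.isAllowed("root1", "wrong"));
        System.out.println(check.isAllowed("nobody", "123456789"));
    }
}
```

A user with no configured entry is rejected the same way as a wrong password, which mirrors the two failure branches in the real class.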

2. Add the configuration to hive-site.xml

<!-- Configure the custom authentication class implemented above -->
<property>
    <name>hive.server2.custom.authentication.class</name>
    <value>org.apache.hadoop.hive.contrib.auth.CustomPasswdAuthenticator</value>
</property>
<!-- User name and password: here the user is root1 and the password is 123456789 -->
<property>
    <name>hive.jdbc_passwd.auth.root1</name>
    <value>123456789</value>
</property>

After the configuration, restart HiveServer2 and run ./beeline (in $HIVE_HOME/bin) to test the connection.

HiveServer2 can be started with $HIVE_HOME/bin/hiveserver2 or $HIVE_HOME/bin/hive --service hiveserver2
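The same test can be run from Java over JDBC. The following is a minimal sketch, not a definitive client. It assumes HiveServer2 listens on localhost at its default port 10000, that the hive-jdbc driver jar is on the classpath, and that the root1 account from the configuration above exists. HiveJdbcSketch and its method names are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative class; the names here are not part of Hive.
class HiveJdbcSketch {

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Opens a connection, runs "show databases" and prints each database name.
    // Needs the hive-jdbc driver on the classpath and a running HiveServer2.
    static void listDatabases(String url, String user, String password) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show databases")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }

    public static void main(String[] args) {
        // Assumed values: adjust host, port and database to your environment.
        String url = buildUrl("localhost", 10000, "default");
        try {
            listDatabases(url, "root1", "123456789");
        } catch (SQLException e) {
            // Reached when the driver is missing, the server is down, or login fails.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

With authentication set to CUSTOM, a wrong password should surface here as an SQLException rather than a successful connection.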

Screenshot of successful test:

Here are some of the problems I encountered in my own testing

Error: User: root is not allowed to impersonate anonymous

Solution: edit core-site.xml ($HADOOP_HOME/etc/hadoop/core-site.xml) and configure the corresponding Hadoop proxy user. In the property name hadoop.proxyuser.root.hosts, the root part is the user name that follows "User:" in the error message.

<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
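To make the naming rule concrete, here is a small sketch that pulls the user name out of the error message and derives the two property names from it. ProxyUserSketch and its methods are hypothetical helpers written for this note, and the regular expression only covers the message shape quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: derives the proxy-user property names from the error text.
class ProxyUserSketch {

    // Matches messages such as "User: root is not allowed to impersonate anonymous".
    private static final Pattern IMPERSONATE =
            Pattern.compile("User: (\\S+) is not allowed to impersonate (\\S+)");

    // Returns the user name after "User:", or null if the message has another shape.
    static String proxyUser(String errorMessage) {
        Matcher m = IMPERSONATE.matcher(errorMessage);
        return m.find() ? m.group(1) : null;
    }

    // Property that lists the hosts the proxy user may connect from.
    static String hostsProperty(String user) {
        return "hadoop.proxyuser." + user + ".hosts";
    }

    // Property that lists the groups the proxy user may impersonate.
    static String groupsProperty(String user) {
        return "hadoop.proxyuser." + user + ".groups";
    }

    public static void main(String[] args) {
        String user = proxyUser("User: root is not allowed to impersonate anonymous");
        System.out.println(hostsProperty(user));
        System.out.println(groupsProperty(user));
    }
}
```

If HiveServer2 runs as a different operating system user, the same two properties are needed with that user name in place of root.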

Hadoop must be restarted for the change to take effect. I use Hadoop 2.6.5, so the stop-all and start-all scripts are in the $HADOOP_HOME/sbin directory. Run them, then use jps to check that startup succeeded. When I ran the stop-all script, jps showed that the Hadoop processes had not shut down, so I killed them manually. After that, startup worked normally.

ErrorCode 500164: Error initialized or created transport for authentication: Peer indicated failure: Error validating the login

Solution: set AuthMech=3 in the JDBC connection parameters. The AuthMech values mean:

0: no password required, 1: Kerberos authentication, 2: user name authentication, 3: user name and password authentication
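As a rough illustration, the sketch below maps the four values listed above to names and appends AuthMech=3 to a connection URL. The UID and PWD property names and the semicolon-separated URL layout are assumptions based on the Cloudera Hive JDBC driver, which is the driver that reports error code 500164. Check your driver's documentation before relying on them. The enum and class names are made up for this example.

```java
// Illustrative enum: the codes and meanings are the ones listed above.
enum AuthMech {
    NO_AUTH(0, "no password required"),
    KERBEROS(1, "Kerberos authentication"),
    USER_NAME(2, "user name authentication"),
    USER_NAME_AND_PASSWORD(3, "user name and password authentication");

    final int code;
    final String description;

    AuthMech(int code, String description) {
        this.code = code;
        this.description = description;
    }

    // Renders the value as a URL property, for example "AuthMech=3".
    String asUrlProperty() {
        return "AuthMech=" + code;
    }
}

// Hypothetical helper for building a URL with user name and password authentication.
class AuthMechUrlSketch {

    // Assumption: the driver accepts ;AuthMech=..;UID=..;PWD=.. appended to the URL.
    static String withPassword(String baseUrl, String user, String password) {
        return baseUrl + ";" + AuthMech.USER_NAME_AND_PASSWORD.asUrlProperty()
                + ";UID=" + user + ";PWD=" + password;
    }

    public static void main(String[] args) {
        // localhost:10000 is an assumed address; root1 is the account configured earlier.
        System.out.println(withPassword("jdbc:hive2://localhost:10000", "root1", "123456789"));
    }
}
```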

These are the problems I ran into when connecting to Hive over JDBC and how I solved them, recorded here for later reference.