Friends, if you follow us, you know that there have been many openLooKeng activities recently, including Summer 2021, the XinChuang competition, the Internet+ competition, and so on. Based on your questions about preparing for these contests, the community assistant, together with teachers from the community, has put together some relevant technical material that we hope will be helpful. If there is anything you would like to discuss, feel free to open an issue in the community code repository.

Community code repository: https://gitee.com/openlookeng

Welcome to visit openlookeng.io

Today's topic

01 Software Installation

Note: Manually deploy the JDK in advance

1.1 Using Scripts to Rapidly Deploy a Cluster

Download the installation script: gitee.com/openlookeng… The script installer directory is as follows:

You only need to configure the config.txt file during deployment.

The modifications are as follows:

After modifying config.txt, run main.sh to deploy the cluster.

After the sh main.sh command is executed, openLooKeng is deployed in the following paths:

The following files exist in the installation path:

02 Client Use

2.1 Using the CLI

java -jar /opt/hetu-install/etc/hetu-cli-316-executable.jar --server IP:port --catalog catalogname

- IP: IP address of the service node
- port: port number of the service; the default value is 8090
- catalogname: data source to be accessed, corresponding to a file name in the /etc/hetu/catalog/ directory

Other parameters:

- --schema: specifies the schema to access, followed by the schema name
- --user: specifies the user accessing the data source, followed by the user name
- --execute: specifies the SQL to be executed, followed by the SQL statement (the SQL must be enclosed in double quotation marks and end with a semicolon); used to execute SQL without entering the client
- -f: specifies an SQL file, followed by the SQL file name; also used to execute SQL without entering the client, similar to -e (--execute)
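For illustration, a hedged example of a typical invocation, assuming a coordinator at 192.168.0.1:8090 and a Hive catalog named hive (adjust the jar path and name to your installation):

```bash
# Interactive session against the hive catalog
java -jar /opt/hetu-install/etc/hetu-cli-316-executable.jar \
  --server 192.168.0.1:8090 --catalog hive --schema default

# Run a single statement without entering the client
java -jar /opt/hetu-install/etc/hetu-cli-316-executable.jar \
  --server 192.168.0.1:8090 --catalog hive --schema default \
  --execute "show tables;"
```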

2.2 Using JDBC

1. Obtain the JDBC JAR package hetu-jdbc-316.jar and store it in the desired directory. 2. In the tool that uses the JDBC connection, set the URL to jdbc:hetu://IP:8090 and the driver class to io.hetu.core.jdbc.HetuDriver, where IP is the IP address of the Hetu service.
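A minimal sketch of a JDBC client using the URL format and driver class described above; the host, user name, and query are placeholders, and hetu-jdbc-316.jar is assumed to be on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class HetuJdbcExample {
    public static void main(String[] args) throws Exception {
        // Connection properties; a user name is typically required by the driver
        Properties props = new Properties();
        props.setProperty("user", "test");

        // URL format: jdbc:hetu://IP:8090 (driver class io.hetu.core.jdbc.HetuDriver)
        try (Connection conn = DriverManager.getConnection("jdbc:hetu://192.168.0.1:8090", props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW CATALOGS")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```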

2.3 Using UDFs

2.3.1 CBG UDF Integration

1. Upload the UDF function registration file udf.properties to the /etc/hetu/ directory. The file format is function_name class_path, for example:

booleanudf io.hetu.core.hive.dynamicfunctions.examples.udf.BooleanUDF
shortudf io.hetu.core.hive.dynamicfunctions.examples.udf.ShortUDF
byteudf io.hetu.core.hive.dynamicfunctions.examples.udf.ByteUDF
intudf io.hetu.core.hive.dynamicfunctions.examples.udf.IntUDF

2. Create the externalFunctions folder under ${node.data-dir} and upload the UDF functions and their dependent classes to this folder.

3. Upload the configuration files that the UDF functions depend on to ${node.data-dir}.

4. Perform the preceding steps on each node of the openLooKeng service and restart the service.

2.3.2 UDAF Integration

1. Create the externalFunctionsPlugin folder under ${node.data-dir}. 2. Decompress the UDAF ZIP package, such as cbg-hive-functions-1.0-SNAPSHOT.zip, into externalFunctionsPlugin. 3. Perform the preceding steps on each node of the openLooKeng service and restart the service.

2.4 Using the SQL Migration Tool

1. Interactive mode: java -jar hetu-sql-migration-tool-316.jar

2. Execution mode: java -jar hetu-sql-migration-tool-316.jar --execute "<Hive SQL to be converted>"

3. File mode: java -jar hetu-sql-migration-tool-316.jar --file <Hive SQL file to be converted> --output <directory for the converted files>

4. Run the converted SQL on Hetu.
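A hedged illustration of the execution and file modes; the SQL statement, file path, and output directory below are placeholders:

```bash
# Convert a single Hive statement
java -jar hetu-sql-migration-tool-316.jar --execute "INSERT INTO TABLE t1 VALUES (10, 'apples')"

# Convert a file of Hive SQL and write the converted statements to a directory
java -jar hetu-sql-migration-tool-316.jar --file /tmp/hive_queries.sql --output /tmp/converted/
```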

2.5 Using the Web UI

Configure the following parameters in config.properties of the coordinator node:

03 Data Source Configuration

3.1 Adding a Data Source via Configuration Files

3.1.1 Configuring FI Hive Data Sources

You can use either of the following configuration methods:

**Method 1:** Automatic configuration before service installation

1. Download the FI cluster configuration file and user credentials from the FI cluster. Configuration file: log in to the FI cluster page, click More, and choose Download Client. In the dialog box, select Configuration Files Only and click OK. After downloading, decompress the package.

User credentials: log in to the FI page, select the user on the System page, click More, and click Download Authentication Credentials. Click OK in the dialog box. After downloading, decompress the file.

2. Upload the HDFS configuration files core-site.xml and hdfs-site.xml and the user authentication credentials user.keytab, krb5.conf, and hosts obtained in Step 1 to the client_dependencies directory under the directory where the Hetu installation package was decompressed.

3. Modify the following configuration items in the client_dependencies/clientMainConfig file under the decompressed directory: KRB_PRINCIPAL: change the default value to the user name corresponding to the authentication credentials; HIVE_METASTORE_URI: change the default value to the hive.metastore.uris value in the Hive configuration file hive-site.xml obtained in Step 1. When the install.sh script is executed to install the service, the FI Hive data source is automatically configured on each node of the service.

**Method 2:** Manual configuration after service installation

1. Obtain the FI Hive configuration file and user authentication credentials. For details, see Step 1 in Method 1.

2. Upload the HDFS configuration files core-site.xml and hdfs-site.xml and the user authentication credentials user.keytab and krb5.conf obtained in Step 1 to each node of the service.

3. Create a hive.properties file in /etc/hetu/catalog and write the following content (a sketch follows this list):

4. Append the hosts entries obtained in Step 1 to the /etc/hosts file.

5. Perform Steps 2 to 4 on each node and restart the service.
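A sketch of what hive.properties for a Kerberos-secured FI Hive cluster might look like; the metastore address, principals, and file paths are placeholders, and the exact Kerberos-related property names should be verified against the openLooKeng Hive connector documentation:

```properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.0.10:21088
hive.config.resources=/opt/hetu/conf/core-site.xml,/opt/hetu/conf/hdfs-site.xml
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/hadoop.hadoop.com@HADOOP.COM
hive.metastore.client.principal=test_user@HADOOP.COM
hive.metastore.client.keytab=/opt/hetu/conf/user.keytab
hive.hdfs.authentication.type=KERBEROS
```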

3.1.2 Configuring an Open-Source Hive Data Source

1. After HetuServer is deployed, go to the /etc/hetu/catalog directory and edit the hive.properties file (a sketch follows this list).

2. Restart the Hetu service.
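A minimal sketch for a non-Kerberos, open-source Hive metastore; the metastore address is a placeholder:

```properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.0.20:9083
```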

3.1.3 Configuring the DC Connector

1. Create a dc.properties file in the /etc/hetu/catalog directory on each node of the Hetu service and write the following content (a sketch follows this list):

2. Restart the Hetu service.
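A sketch of dc.properties for connecting to a remote openLooKeng data center; the URL and credentials are placeholders and should be checked against the DC connector documentation:

```properties
connector.name=dc
connection-url=http://192.168.0.100:8090
connection-user=test
connection-password=password
```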

3.1.4 Configuring HBase Data Sources

1. Upload the Hadoop configuration files core-site.xml and hdfs-site.xml of the HBase service to each Hetu node.

2. Create the hbase.properties file in the /etc/hetu/catalog directory and write the following content (a sketch follows this list):

3. Perform Steps 1 and 2 on each node and restart the Hetu service.
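A sketch of hbase.properties; the ZooKeeper quorum and port are placeholders, and the property names are assumptions to be verified against the openLooKeng HBase connector documentation:

```properties
connector.name=hbase-connector
hbase.zookeeper.quorum=192.168.0.71,192.168.0.72
hbase.zookeeper.property.clientPort=2181
hbase.metastore.type=hetuMetastore
```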

3.1.5 Configuring an Oracle Data source

1. On each node of the Hetu service, create an oracle.properties file in the /etc/hetu/catalog directory and write the following content (a sketch follows this list):

2. Restart the Hetu service.
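A sketch of oracle.properties; the connection URL and credentials are placeholders:

```properties
connector.name=oracle
connection-url=jdbc:oracle:thin:@192.168.0.80:1521/orcl
connection-user=test
connection-password=password
```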

3.1.6 Configuring the MySQL data source

1. Create a mysql.properties file in the /etc/hetu/catalog directory of each node and write the following content (a sketch follows this list):

2. Restart the Hetu service.
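A sketch of mysql.properties; the connection URL and credentials are placeholders:

```properties
connector.name=mysql
connection-url=jdbc:mysql://192.168.0.81:3306
connection-user=test
connection-password=password
```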

3.1.7 Configuring Hana Data Sources

1. Create a hana.properties file in the /etc/hetu/catalog directory on each node of the Hetu service and write the following content (a sketch follows this list):

2. Restart the Hetu service.
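A sketch of hana.properties; the connection URL and credentials are placeholders:

```properties
connector.name=hana
connection-url=jdbc:sap://192.168.0.82:30015
connection-user=test
connection-password=password
```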

3.1.8 Configuring VDM Connector

1. Create a hetu-metastore.properties file in the /etc/hetu/ directory on each node of the Hetu service and write the required content (see the template sketch in Section 4.2).

2. Create a vdm.properties file in the /etc/hetu/catalog directory on each node of the Hetu service and write the following content (a sketch follows this list):

3. Restart the Hetu service.
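A minimal sketch of vdm.properties; additional VDM properties may apply, so check the connector documentation:

```properties
# The VDM connector relies on the Hetu metastore configured in hetu-metastore.properties
connector.name=vdm
```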

3.1.9 Configuring the Carbondata Connector

1. Add mapred-site.xml, yarn-site.xml, core-site.xml, and hdfs-site.xml to /opt/hetu/conf on each node of the Hetu service.

2. Add a carbondata.properties file to /etc/hetu/catalog on each node and write the following content (a sketch follows this list):

3. Restart the Hetu service.
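A sketch of carbondata.properties, assuming the connector follows the Hive-style metastore configuration; the metastore address and paths are placeholders:

```properties
connector.name=carbondata
hive.metastore.uri=thrift://192.168.0.20:9083
hive.config.resources=/opt/hetu/conf/core-site.xml,/opt/hetu/conf/hdfs-site.xml
```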

3.2 Dynamically adding data sources

3.2.1 Dynamic Catalog Configuration

1. Add the following content to node.properties on all nodes (a sketch follows this list):

2. Add the following content to config.properties on all nodes:

3. Configure local-config-catalog.properties by referring to 4.4.2.
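A sketch of the two additions in steps 1 and 2; the property names and paths are assumptions and should be verified against the openLooKeng dynamic catalog documentation:

```properties
# node.properties
catalog.config-dir=/opt/hetu/data/catalog
catalog.share.config-dir=/opt/hetu/data/catalog/share

# config.properties
catalog.dynamic-enabled=true
```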

3.2.2 Configuring FI Hive Data Sources

1. Add the data source with a curl command (a hedged sketch of the REST call follows this subsection):

The local paths of the four credential files (core-site.xml, hdfs-site.xml, user.keytab, and krb5.conf) must be correct; the parameters are the same as those used in the configuration-file method, and the user credentials must be stored on the local machine for the request to succeed.

2. Using Postman:
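A heavily hedged sketch of what the REST call might look like; the endpoint path, form-field names, catalog properties, and file paths are all assumptions to be verified against the dynamic catalog documentation:

```bash
curl -X POST http://192.168.0.1:8090/v1/catalog \
  --header 'X-Presto-User: admin' \
  --form 'catalogInformation={"catalogName":"hive","connectorName":"hive-hadoop2","properties":{"hive.metastore.uri":"thrift://192.168.0.10:21088"}}' \
  --form 'catalogConfigurationFiles=@/tmp/core-site.xml' \
  --form 'catalogConfigurationFiles=@/tmp/hdfs-site.xml' \
  --form 'catalogConfigurationFiles=@/tmp/user.keytab' \
  --form 'catalogConfigurationFiles=@/tmp/krb5.conf'
```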

3.2.3 Configuring an Open-Source Hive Data Source

1. Add the data source with a curl command:

The hdfs-site.xml and core-site.xml files are required when adding an open-source Hive data source through Postman; the files must be saved locally and the local paths must be correct.

2. Using Postman:

3.2.4 Configuring the DC Connector

1. Add the data source with a curl command:

2. Using Postman:

3.2.5 Configuring HBase Data Sources

1. Add the data source with a curl command:

2. Using Postman:

3.2.6 Configuring an Oracle Data source

1. Add the data source with a curl command:

2. Using Postman:

3.2.7 Configuring the MySQL Data Source

1. Add the data source with a curl command:

2. Using Postman:

3.2.8 Configuring Hana Data Sources

1. Add the data source with a curl command:

2. Using Postman:

04 Feature Parameter Configuration

4.1 Setting state-store Parameters

1. Create state-store.properties in the /etc/hetu/ directory on all Hetu nodes and configure it based on the following template (a sketch follows the note below).

2. Restart the Hetu service.

Note: either hazelcast.discovery.tcp-ip.seeds or hazelcast.discovery.tcp-ip.profile can be configured; if both are set, hazelcast.discovery.tcp-ip.seeds takes effect. It is recommended that hazelcast.discovery.tcp-ip.seeds contain two or more addresses.
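A sketch of state-store.properties; the state store name, cluster name, port, and seed addresses are placeholders:

```properties
state-store.type=hazelcast
state-store.name=query
state-store.cluster=cluster1
hazelcast.discovery.mode=tcp-ip
hazelcast.discovery.port=5701
hazelcast.discovery.tcp-ip.seeds=192.168.0.1:5701,192.168.0.2:5701
```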

4.2 Setting Hetu MetaStore Parameters

1. Create hetu-metastore.properties in the /etc/hetu/ directory on all Hetu nodes and configure it based on the following template (a sketch follows this list):

2. Restart the Hetu service.
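A sketch of hetu-metastore.properties, assuming a MySQL-backed metastore; the database URL and credentials are placeholders, and a hetu-filesystem-backed store is also possible:

```properties
hetu.metastore.type=jdbc
hetu.metastore.db.url=jdbc:mysql://192.168.0.81:3306/hetu_meta
hetu.metastore.db.user=test
hetu.metastore.db.password=password
```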

4.3 Configuring the Hetu Filesystem Parameter

Create a filesystem directory under /etc/hetu.

4.3.1 hdfs-config-default.properties Parameter Configuration

1. Create the hdfs-config-default.properties file in the /etc/hetu/filesystem directory and configure it based on the following template (a combined sketch covering items 1 to 3 follows this list):

2. If HDFS is disabled:

3. Local file system configuration:
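A combined sketch covering items 1 to 3; the paths, principal, and property names are assumptions and should be checked against the openLooKeng filesystem documentation:

```properties
# hdfs-config-default.properties with Kerberos enabled
fs.client.type=hdfs
hdfs.config.resources=/opt/hetu/conf/core-site.xml,/opt/hetu/conf/hdfs-site.xml
hdfs.authentication.type=KERBEROS
hdfs.krb5.conf.path=/opt/hetu/conf/krb5.conf
hdfs.krb5.keytab.path=/opt/hetu/conf/user.keytab
hdfs.krb5.principal=test_user@HADOOP.COM

# If Kerberos is not used
# hdfs.authentication.type=NONE

# Local file system profile instead of HDFS
# fs.client.type=local
```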

4.4 Setting AA Parameters

1. Configure state-store.properties by referring to 4.1.

2. Configure hdfs-config-default.properties by referring to 4.3.2.

3. Add the following information to the config.properties file of the Coordinator node:

4. Add the following content to the config.properties file of the worker node (a hedged sketch for steps 3 and 4 follows this list):

5. Restart the Hetu service.
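A hedged sketch of the config.properties addition referenced in steps 3 and 4; the property name is an assumption based on the multiple-coordinator (AA) feature and should be verified against the HA documentation:

```properties
hetu.multiple-coordinator.enabled=true
```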

4.5 Setting Global Dynamic Filter parameters

1. Configure state-store.properties by referring to 4.1. If it is already configured, skip this step.

2. Configure hdfs-config-default.properties by referring to 4.3.2. If it is already configured, skip this step.

3. Add the following content to the config.properties file of each node:
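A sketch of the config.properties additions for global dynamic filtering; the property names and values are assumptions to be verified against the documentation:

```properties
enable-dynamic-filtering=true
dynamic-filtering-data-type=BLOOM_FILTER
dynamic-filtering-wait-time=200ms
```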

4.6 Execution Plan Cache Parameters Configuration

1. Add the following configuration to config.properties on each Hetu node; the feature is enabled by default (a sketch follows this list):

2. Restart the Hetu service.
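A sketch of the execution plan cache settings; the property names and values are assumptions:

```properties
hetu.executionplan.cache.enabled=true
hetu.executionplan.cache.limit=10000
hetu.executionplan.cache.timeout=86400000
```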

4.7 Star Tree Parameter Settings

1. Configure hetu-metastore.properties by referring to 4.2. If it is already configured, skip this step.

2. Add the following content to the config.properties file of each node (a sketch follows this list):

Session configuration: SET SESSION enable_star_tree_index=true;
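A sketch of the config.properties addition for the star-tree cube feature; the property name is an assumption:

```properties
optimizer.enable-star-tree-index=true
```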

4.8 Task Recovery parameter configuration

1. Configure hdfs-config-default.properties by referring to 4.3.2. If it is already configured, skip this step.

2. Add the following content to the config.properties file of each node (a sketch follows this list):

Only session-level settings are supported: SET SESSION snapshot_enabled=true;
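A sketch of the config.properties addition for task recovery (snapshots), which relies on the filesystem profile from 4.3; the property name and profile name are assumptions:

```properties
hetu.experimental.snapshot.profile=hdfs-config-default
```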

4.9 Reuse Exchange Parameter configuration

1. Add the following parameters to config.properties of all Hetu nodes:
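A sketch of the reuse-exchange setting; the property name is an assumption:

```properties
optimizer.reuse-table-scan=true
```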

4.10 Set Common Table Expression (CTE) parameters

1. Add the following configuration to config.properties on all Hetu nodes (a sketch follows this list):

2. Session configuration: SET SESSION cte_reuse_enabled=true;
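A sketch of the CTE reuse setting; the property name is an assumption:

```properties
optimizer.cte-reuse-enabled=true
```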

4.11 Hindex parameter configuration

Add the following configuration to /etc/config.properties (a sketch covering both files follows this list):

Create the index-store.properties file under etc/filesystem and write the following configuration:
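A sketch covering both files; the property names, URI, and profile name are assumptions to be verified against the heuristic index documentation:

```properties
# config.properties
hetu.heuristicindex.filter.enabled=true
hetu.heuristicindex.indexstore.uri=/opt/hetu/indexstore
hetu.heuristicindex.indexstore.filesystem.profile=index-store

# etc/filesystem/index-store.properties (local filesystem profile)
fs.client.type=local
```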

4.12 Pushdown Framework Parameter Configuration

Pushdown is supported for the DC, Oracle, MySQL, and Hana connectors. Add the following configuration to dc.properties:

Add the following configuration to oracle.properties, mysql.properties, and hana.properties (a sketch covering both cases follows):
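A sketch of the pushdown switches for the JDBC-style connectors; the property names are assumptions, and the dc.properties switch in particular should be checked against the DC connector documentation:

```properties
# oracle.properties / mysql.properties / hana.properties
jdbc.pushdown-enabled=true
jdbc.pushdown-module=FULL_PUSHDOWN

# dc.properties (analogous pushdown switch, exact name to be verified)
# dc.query.pushdown.enabled=true
```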

4.13 Setting UDF PushDown Parameters

Add the following configuration to config.properties on all Hetu nodes; it defaults to true:

4.14 Setting cbo-aggregation Parameters

Add the following configuration to config.properties of the Hetu coordinator node; it is disabled by default:

Session configuration: SET SESSION sort_based_aggregation_enabled=true; Related parameter: prcnt_drivers_for_partial_aggr=5

05 Parameter optimization

5.1 Hetu Core Parameters

5.1.1 Table 5-1 jvm.properties Parameter Settings

5.1.2 Table 5-2 config.properties Parameter Settings

5.2 Connector Parameters

5.2.1 Hive Parameters:

5.2.2 DM-Related Optimization:

06 FAQ

1. Connecting to FI Hive

In the hdfs-site.xml file, change the value of dfs.client.failover.proxy.provider.hacluster to org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider, as follows:
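In Hadoop XML form, the change corresponds to the following property entry:

```xml
<!-- hdfs-site.xml -->
<property>
  <name>dfs.client.failover.proxy.provider.hacluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```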

2. LK fails to delete a large table, although the table is actually deleted

Cause and solution: if the table is large, dropping it deletes all of its partitions, which may take more than 10 seconds. If that timeout expires, the table is still dropped in the backend, but the timeout failure is propagated and the reported error becomes "table not found", even though the table was deleted by the same request.

Add the following parameter to hive.properties: hive.metastore-timeout=60s

3. Query 20210306_101831_00184_wx7j4 failed: Unable to create input format org.apache.hadoop.mapred.TextInputFormat

Solution: Package the missing hadoop-plugins-1.0.jar into the Hive Connector plugin.

— openLooKeng, Big Data Simplified ★

openLooKeng official website: openlookeng.io

WeChat official account | openLooKeng

Community assistant | openLooKengoss

Scan the QR code to follow us

openLooKeng WeChat official account

Please contact the openLooKeng community before reprinting this article.