Description: Kettle is an open-source ETL tool implemented in Java that runs on Windows, Unix, and Linux. It provides a graphical user interface (GUI) that lets you define a data transfer topology by dragging and dropping controls. This chapter introduces the Kettle-based MaxCompute plugin for moving data to the cloud.
Kettle version: 8.2.0.0-342
MaxCompute JDBC driver version: 3.2.8
Setup
- Download and install Kettle
- Download MaxCompute JDBC driver
- Copy the MaxCompute JDBC driver JAR into the lib subdirectory of the Kettle installation directory (data-integration/lib).
- Download and compile the MaxCompute Kettle Plugin: github.com/aliyun/aliy…
- Put the compiled MaxCompute Kettle plugin into the lib subdirectory in the Kettle installation directory (data-integration/lib).
- Start Spoon.
Job
We can use Kettle + MaxCompute JDBC driver to organize and execute tasks in MaxCompute.
You need to perform the following operations:
- Create a new Job.
- Create a Database Connection. The JDBC connection string format is: jdbc:odps:<MaxCompute endpoint>?project=<project name>. The JDBC driver class is com.aliyun.odps.jdbc.OdpsDriver. Username is your Alibaba Cloud AccessKey ID; Password is your Alibaba Cloud AccessKey Secret. For more JDBC configuration options, see: help.aliyun.com/document_d…
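As an illustration, the same connection parameters can be exercised outside Kettle with plain JDBC. The endpoint, project name, and credentials below are placeholders, not real values:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class OdpsJdbcExample {
    /** Builds a MaxCompute JDBC URL of the form jdbc:odps:<endpoint>?project=<project>. */
    static String buildJdbcUrl(String endpoint, String project) {
        return "jdbc:odps:" + endpoint + "?project=" + project;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder endpoint and project -- replace with your own values.
        String url = buildJdbcUrl("http://service.odps.aliyun.com/api", "my_project");

        Properties props = new Properties();
        props.put("user", "<AccessKey ID>");         // Alibaba Cloud AccessKey ID
        props.put("password", "<AccessKey Secret>"); // Alibaba Cloud AccessKey Secret

        // Requires odps-jdbc on the classpath (the same JAR placed in data-integration/lib).
        try (Connection conn = DriverManager.getConnection(url, props)) {
            System.out.println("Connected: " + conn.getMetaData().getURL());
        }
    }
}
```

This is the same URL, driver class, and credential mapping that the Kettle Database Connection dialog asks for.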
After that, MaxCompute can be accessed through SQL nodes based on business needs. Let’s take a simple ETL process as an example:
The configuration of the Create table node is as follows:
Note:
- For Connection, select the Database Connection configured earlier.
- Deselect the Send SQL as single Statement option.
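The reason for deselecting Send SQL as single Statement is that the script then gets split on semicolons and each statement is executed against MaxCompute on its own. A simplified model of that splitting (my own illustration, not Kettle's actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class SqlSplitter {
    /** Splits a script on ';' and drops blank fragments -- a simplified model
     *  of what happens when "Send SQL as single Statement" is deselected. */
    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        String script = "DROP TABLE IF EXISTS t1;\nCREATE TABLE t1 (id BIGINT);";
        // Each statement would be executed individually against MaxCompute.
        split(script).forEach(System.out::println);
    }
}
```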
Load from OSS:
The same notes as for the Create table node apply. For more uses of LOAD, see: help.aliyun.com/document_d…
The configuration of the Processing node is as follows:
The same notes as for the Create table node apply.
Transformation
The MaxCompute Kettle Plugin allows data to flow out of or into MaxCompute.
Create a Transformation and configure an Aliyun MaxCompute Input node as follows:
Create an empty table in MaxCompute with the same schema as test_partition_table.
Create an Aliyun MaxCompute Output node and configure it as follows:
When the Transformation is executed, data is downloaded from test_partition_table and then uploaded to test_partition_table_2.
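For the step that creates the empty target table with the same schema, one option is a CREATE TABLE ... LIKE statement, which copies the column and partition definitions but no data. A sketch, assuming the table names from the example above:

```java
public class LikeTableDdl {
    /** Returns DDL that creates an empty table with the same schema as src.
     *  CREATE TABLE ... LIKE copies the column and partition definitions only. */
    static String createLike(String src, String dst) {
        return "CREATE TABLE IF NOT EXISTS " + dst + " LIKE " + src + ";";
    }

    public static void main(String[] args) {
        // Table names taken from the Transformation example above.
        System.out.println(createLike("test_partition_table", "test_partition_table_2"));
    }
}
```

The generated statement can be run in a Kettle SQL node using the same Database Connection as before.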
Other
Set MaxCompute flags
Before executing DDL/DML/SQL statements, use set key=value; statements to configure flags.
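In practice this means the set statements are simply placed ahead of the SQL they apply to in the same script. A small helper illustrating the pattern (the flag odps.sql.type.system.odps2, which enables the MaxCompute V2 type system, is used here only as an example):

```java
public class FlagPrefixer {
    /** Prepends "set key=value;" lines to a SQL script, matching the pattern
     *  described above: flags must be set before the DDL/DML they apply to. */
    static String withFlags(String sql, String... flags) {
        StringBuilder sb = new StringBuilder();
        for (String flag : flags) {
            sb.append("set ").append(flag).append(";\n");
        }
        return sb.append(sql).toString();
    }

    public static void main(String[] args) {
        // Example flag only; substitute whatever flags your statements need.
        System.out.println(withFlags("SELECT 1;", "odps.sql.type.system.odps2=true"));
    }
}
```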
Script mode
Not supported yet.
This article is the original content of Aliyun and shall not be reproduced without permission.