Introduction: This article focuses on how to update data through insert Overwrite
background
For most storage formats in big data, supporting random updates is complex. Transactional tables can support UPDATE and DELETE statements. However, update and DELETE functions are not suitable for frequent updates, data deletion, or real-time writes to target tables. Update and DELETE cannot be performed for non-transactional tables. This article focuses on how to update data through insert Overwrite.
1. Create a table and insert data
create table update_table(ID int,
tranValue string,
last_update_user string) PARTITIONED by(dt STRING ) LIFECYCLE 1;
INSERT INTO update_table PARTITION (dt="20210510") VALUES
(1, 'value_01', 'creation'),
(2, 'value_02', 'creation'),
(3, 'value_03', 'creation'),
(4, 'value_04', 'creation'),
(5, 'value_05', 'creation'),
(6, 'value_06', 'creation'),
(7, 'value_07', 'creation'),
(8, 'value_08', 'creation'),
(9, 'value_09', 'creation'),
(10, 'value_10','creation');
Copy the code
2. Update a data
When the id is 1, it is updated to value_011
INSERT OVERWRITE TABLE update_table PARTITION(dt) SELECT ID,CASE WHEN ID =1 THEN "value_011" ELSE TranValue END TranValue ,last_update_user ,dt FROM update_table WHERE dt = "20210510" ;Copy the code
3. Update multiple pieces of data
According to the delta table update, the delta table insert data is first created
create table update_table_inc(ID int,
TranValue string,
last_update_user string) LIFECYCLE 1;
INSERT INTO update_table_inc VALUES
(5, 'value_11', 'creation'),
(6, NULL, '20170410'),
(7, 'value22', '20170413');
Copy the code
Update TranValue with id 5 and id 7, TranValue 6 is null
INSERT OVERWRITE TABLE update_table PARTITION( dt)
SELECT a.id
,CASE WHEN a.id=b.id and b.TranValue is not null THEN b.TranValue
ELSE a.TranValue
END TranValue
,CASE WHEN a.id=b.id and b.TranValue is not null THEN b.last_update_user
ELSE a.last_update_user
END last_update_user
,dt
FROM update_table a
LEFT JOIN update_table_inc b
ON a.id = b.id
WHERE a.dt = "20210510"
;
Copy the code
4. Delete data
INSERT OVERWRITE TABLE update_table PARTITION(dt) SELECT * FROM update_table WHERE dt = "20210510" and id! = 4;Copy the code
The original link
This article is the original content of Aliyun and shall not be reproduced without permission.