- Welcome! Likes, bookmarks, and comments are appreciated; if there are mistakes, please point them out. Give someone a rose, and the fragrance lingers on your hand!
- This article was originally written by Webmote and first published on Nuggets (Juejin).
- Author's motto: life lies in tinkering; when you stop tinkering with life, life starts tinkering with you. Let's keep at it together!
Preface
To do great things, you need to learn to break goals down.
Inserting 100 million rows can be divided into 10 inserts of 10 million (1000W) rows, which can in turn be divided into 100 inserts of 1 million (100W) rows…
So let's start small: how do you insert 100 rows?
1. Batch insert hundreds of rows
Bulk inserts are easy in most databases. The simplest way is to write a multi-row INSERT INTO statement: with a bit of copy and paste you can knock out 100 inserts in a flash.
Then, with the help of a SQL management tool such as the Navicat family, we can execute these statements quickly, producing hundreds or thousands of rows over multiple runs.
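As a minimal sketch (assuming the "T_TEST" table used later in this article, plus the uuid-ossp extension that provides uuid_generate_v4()), one such multi-row INSERT might look like:

-- Hedged example: batch many rows in one statement by repeating VALUES tuples.
INSERT INTO "T_TEST" ("Id", "SessionId", "StartTime", "EndTime")
VALUES (uuid_generate_v4(), uuid_generate_v4(), '2010-01-01 00:00:00', '2010-01-01 00:01:40'),
       (uuid_generate_v4(), uuid_generate_v4(), '2010-01-01 00:00:00', '2010-01-01 00:01:40');
-- ...copy and paste the VALUES rows until there are 100 of them.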
So how do you handle millions of rows? Aren't you tired of all that clicking?
2. Batch insert millions of pieces of data
Almost all databases support stored procedures, and stored-procedure syntax includes a for loop. We just need to use this feature to wrap our data generation in a stored procedure, and then millions or even tens of millions of rows are no problem.
CREATE OR REPLACE PROCEDURE "public"."InsertChar10001"()
AS $BODY$
DECLARE
    -- Only logTime and ii drive the trimmed-down INSERT below; the other
    -- variables are kept from the original, fuller example.
    campaignId varchar(32) := ' ';
    autoInvitationId varchar(32) := ' ';
    siteId int := 0;
    sentNum int := 0;
    acceptNum int := 0;
    refuseNum int := 0;
    logTime timestamptz := '2010-01-01 00:00:00.000';
    rand int := 0;
    ncount int := 0;
    ifQueued int := ceil(random()*(2-1)+1);
    status int := ceil(random()*(5-1)+1);
    nsource int := 0;
    preChatSocialMediaSource int := 0;
    preChatTicketId int := 0;
    visitorMessagesCount int := 0;
    ii int;
    ss timestamptz := logTime;
    randTime varchar(50) := to_char(logTime, 'YYYY-MM-DD HH24:MI:SS');
BEGIN
    FOR ii IN 1..1000000 LOOP
        INSERT INTO "T_TEST" ("Id", "SessionId", "StartTime", "EndTime")
        SELECT uuid_generate_v4(),              -- Id
               uuid_generate_v4(),              -- SessionId
               logTime,                         -- StartTime
               logTime + interval '100 second'; -- EndTime
    END LOOP;
END;
$BODY$
LANGUAGE plpgsql;
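Once created, the procedure can be run with CALL (stored procedures require PostgreSQL 11 or later):

-- Kick off the million-row generation loop defined above.
CALL "public"."InsertChar10001"();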
By adjusting the loop range for ii in 1..1000000, we can freely customize the data volume and generate precisely that much random data into the table.
However, beyond 10 million (1000W) rows, many databases run into trouble: generation becomes very slow, and some even take hours.
So what to do?
3. Magic bullet
INSERT INTO … SELECT …. Almost all databases support it, and it's fast!
Use this method to double the table's data each round: 1 million (100W) to 2 million (200W), 2 million to 4 million (400W)… The only drawback is that it produces a lot of duplicate data.
INSERT INTO "T_TEST"
("Id",
"SessionId"
,"StartTime"
,"EndTime")
select uuid_generate_v4() "Id",
"SessionId"
,"StartTime"
,"EndTime" from "T_TEST"
If you want to copy an exact number of rows, you can cap the amount of data with something like SELECT TOP N (LIMIT N in PostgreSQL), a WHERE condition, and so on; see the sketch below.
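For instance, a minimal PostgreSQL sketch (the target of 500000 rows is just an illustrative number):

-- Copy only as many rows as needed to reach an exact total,
-- instead of doubling the whole table.
INSERT INTO "T_TEST" ("Id", "SessionId", "StartTime", "EndTime")
SELECT uuid_generate_v4(), "SessionId", "StartTime", "EndTime"
FROM "T_TEST"
LIMIT 500000;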
Note that your database needs enough storage capacity, or you will run into a disk-space error like I did.
4. Summary
Finishing this August update was actually quite difficult: toward the end I didn't know what to write, and the generation speed couldn't keep up.
As usual, a summary; take it rationally!
What is a summary, anyway? A summary is the loneliness of wanting your like and not getting it.
You've read this far; would you begrudge a thumbs-up?
You've liked it; would you begrudge a bookmark?
You've bookmarked it; would you begrudge a comment?