Pgbench introduction

pgbench is a simple program for running benchmarks against PostgreSQL. It runs the same sequence of SQL commands over and over, possibly in multiple concurrent database sessions, and reports the average transaction rate (transactions per second). By default, pgbench tests a scenario loosely based on TPC-B, involving five SELECT, UPDATE, and INSERT commands per transaction. However, it is easy to test other cases by writing your own transaction script files.
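For reference, the built-in TPC-B-like transaction is equivalent to the following script (shown in the pgbench 9.4 syntax that matches the PostgreSQL version used later in this test):

\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;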

Usage

pgbench [OPTION]... [DBNAME]

Initialization options

-i, --initialize         invoke initialization mode
-F, --fillfactor=NUM     set the fill factor
-n, --no-vacuum          do not run VACUUM after initialization
-q, --quiet              quiet logging during initialization
-s, --scale=NUM          scaling factor

Standard options

-c, --client=NUM         number of concurrent database clients (default: 1)
-C, --connect            establish a new connection for each transaction
-D, --define=VARNAME=VALUE  define a variable for use by custom scripts
-f, --file=FILENAME      read transaction script from FILENAME
-j, --jobs=NUM           number of threads (default: 1)
-l, --log                write information about each transaction to a log file
-L, --latency-limit=NUM  count transactions lasting more than NUM milliseconds as late
-M, --protocol=simple|extended|prepared  protocol for submitting queries (default: simple)
-n, --no-vacuum          do not run VACUUM before tests
-N, --skip-some-updates  skip updates to pgbench_tellers and pgbench_branches
-P, --progress=NUM       show thread progress report every NUM seconds
-r, --report-latencies   report the average latency of each command
-R, --rate=NUM           target rate in transactions per second
-s, --scale=NUM          report this scale factor in the output
-S, --select-only        perform SELECT-only transactions
-t, --transactions=NUM   number of transactions each client runs (default: 10)
-T, --time=NUM           duration of the benchmark test, in seconds
-v, --vacuum-all         vacuum all four standard tables before tests

Hardware environment

Number of servers        3
Model                    Huawei 2488H V5
CPU                      4 x 16-core Intel Xeon 6130 @ 2.10 GHz
Storage                  2 x 480 GB SSD, 22 x 1.2 TB SAS
Memory                   1 TB (33 x 32 GB) 2666 MHz DDR4
Network                  two dual-port Gigabit and two dual-port 10-Gigabit Ethernet cards
Linux distribution       Red Hat Enterprise Linux Server release 7.5
Linux kernel             3.10.0-862.el7.x86_64

Environment configuration

GP version                 Greenplum 6.2.1        Greenplum 5.20
Kernel version             PostgreSQL 9.4.24      PostgreSQL 8.3.23
Environment configuration  Master: MAS01  Segments: MAS01, MAS02, Seg08

Setting Cluster Parameters

Greenplum6

gpconfig -c 'optimizer' -v off
gpconfig -c 'gp_enable_global_deadlock_detector' -v on
gpconfig -c 'resource_scheduler' -v off
gpconfig -c log_statement -v none
gpconfig -c checkpoint_segments -v 2 --skipvalidation

Greenplum5

gpconfig -c 'optimizer' -v off
gpconfig -c 'resource_scheduler' -v off
gpconfig -c log_statement -v none
gpconfig -c checkpoint_segments -v 2 --skipvalidation

Parameter notes

gp_enable_global_deadlock_detector

This GUC controls whether global deadlock detection is enabled. In Greenplum 6 it is disabled by default and must be turned on to support concurrent UPDATE/DELETE operations. Greenplum 5 does not support this GUC.

log_statement

Setting this GUC to none reduces unnecessary logging and prevents log output from interfering with I/O performance.

checkpoint_segments

This GUC affects how often checkpoints flush data to disk. The default value of 8 keeps the flush frequency low, but each checkpoint then flushes a large amount of data, causing momentary performance drops across the whole cluster. For OLTP workloads with many UPDATE statements, lowering this setting increases the flush frequency, but because each flush writes less data, average performance improves significantly. Greenplum 5 supports this GUC, but changing it has no noticeable effect there, because Greenplum 5's bottleneck is not I/O but the serialization caused by table locks.
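A sketch for checking and applying these settings (gpconfig -s reports the current value on the master and segments; gp_enable_global_deadlock_detector only takes effect at server start, so a full restart is assumed here):

gpconfig -s gp_enable_global_deadlock_detector
gpconfig -s checkpoint_segments
gpstop -ar    # restart the cluster so the new values take effect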

Test method

A test database was created, and pgbench was used to run the tests with a scale factor of 1000 (about 100 million rows in pgbench_accounts), each test lasting 60 seconds.

  • Initialization command
pgbench -h mas01 -U gpadmin6 -p 6666 -i -s 1000 test

  • Benchmark command
pgbench -h mas01 -U $user -p $port -c $N -T 60 -r test
  • Single query test command (the built-in transaction that -S runs is shown after this list)
pgbench -h mas01 -U $user -p $port -c $N -S -T 60 -r test
  • Single update test command
pgbench -h mas01 -U $user -p $port -c $N -T 60 -r test -f update.sql
# vi update.sql
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom delta -5000 5000
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
END;
  • Single insert test command
pgbench -h mas01 -U $user -p $port -c $N -T 60 -r test -f insert.sql
# vi insert.sql
\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
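For reference, the -S option used in the single query test runs the following built-in select-only transaction:

\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

Note that \setrandom is the syntax of the pgbench shipped with PostgreSQL 9.4, which matches the Greenplum 6 kernel used here; from PostgreSQL 9.6 on, pgbench replaced it with a random() function in \set, so a line such as \setrandom aid 1 :naccounts would instead be written:

\set aid random(1, :naccounts)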

Conclusion

During execution, Greenplum 5 was observed to take a large number of locks whenever updates were involved; even in the single query and single insert cases, transactions that could not commit in time caused contention on LWLocks (locks protecting shared system resources). Greenplum 6 introduces global deadlock detection to support concurrent updates on heap tables, which solves this problem: in the concurrent update test, no LWLock waits were seen on the master, and the few that appeared on the segments were negligible, so performance improved greatly.
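A sketch of the kind of query used to watch for lock waits during the run (written against the PostgreSQL 9.4-based Greenplum 6 catalog; on the PostgreSQL 8.3-based Greenplum 5, pg_stat_activity exposes procpid and current_query instead of pid and query):

-- sessions currently waiting on a lock, and what they are running
SELECT a.pid, l.locktype, l.mode, a.query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.pid = l.pid
 WHERE NOT l.granted;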

What would happen with an append-optimized (AO) table? We replaced pgbench_accounts with an AO table and reran the update test:

test=# create table pgbench_accounts_ao (like pgbench_accounts) with (appendonly=true, compresstype=zlib, compresslevel=5);
test=# insert into pgbench_accounts_ao select * from pgbench_accounts;
test=# alter table pgbench_accounts rename to pgbench_accounts_bak;
test=# alter table pgbench_accounts_ao rename to pgbench_accounts;
test=# vacuum analyze pgbench_accounts;
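A quick sanity check (a sketch; in Greenplum 5 and 6, pg_class.relstorage is 'a' for row-oriented append-optimized tables) that the swap worked:

test=# select relname, relstorage from pg_class where relname = 'pgbench_accounts';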

Sure enough, updates on the AO table were a problem: only one process was observed running, while the others waited for locks.

Concurrent insert test on the AO table:

create table pgbench_history_ao (like pgbench_history) with (appendonly=true, compresstype=zlib, compresslevel=5);
insert into pgbench_history_ao select * from pgbench_history;
alter table pgbench_history rename to pgbench_history_bak;
alter table pgbench_history_ao rename to pgbench_history;
vacuum analyze pgbench_history;

The performance degradation is obvious: AO tables are not suitable for frequent inserts.

Query concurrent test on AO table:


TPS is also low here, so it seems that GP6's OLTP improvements only apply to heap tables.

We then tuned Greenplum 6's parameters, using the TPC-B benchmark as the example, with the number of concurrent clients set to 64 to match the CPU core count.

1. To reduce interference from the pgbench client itself being single-threaded, adjust the pgbench command to enable the -j thread option, using the value 16 (pgbench requires the number of clients to be a multiple of the number of threads).

The test command

pgbench -h mas01 -U gpadmin6 -p 6666 -c 64 -j 16 -T 30 -r test

The test results


2. Increase shared_buffers so that more shared data is cached in memory.

gpconfig -c shared_buffers -v '2GB'

The test results


3. Adjust the transaction commit parameter so that a commit returns success to the client once WAL is written to cache rather than being forced to disk; the WAL writer then flushes it to disk within at most three times wal_writer_delay. This can improve TPS but risks losing recently committed transactions if the server crashes.

gpconfig -c synchronous_commit -v off
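Instead of turning it off cluster-wide, synchronous_commit can also be relaxed per session or per transaction, which limits the risk window to selected workloads; a minimal sketch:

BEGIN;
SET LOCAL synchronous_commit TO off;  -- applies to this transaction only
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 1;
COMMIT;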

The test results


4. Turn off fsync so that data is not forced to disk, at the risk of losing data if the power fails or the system crashes.

gpconfig -c fsync -v off --skipvalidation

The test results


Add the -j $N parameter to the earlier comparison test commands to enable client threads, and retest Greenplum 6 with the current parameter settings.

RT (average response time) in milliseconds

Screenshot of benchmark test results






Screenshot of single query test results






Screenshot of single update test results






Screenshot of single insert test results






For more technical know-how on Greenplum, please visit the Greenplum Chinese community website.