Early outlook for Postgres 14: Performance and monitoring improvements

By Lukas Fittl, May 21, 2021

The original link: pganalyze.com/blog/postgr…

The first beta of the upcoming Postgres 14 release came out yesterday. In this article, we take a look at what’s in the beta, focusing on one major performance improvement and three monitoring improvements that caught our attention.

Before we begin, I want to emphasize what I have always considered an important and unique aspect of Postgres. In contrast to most other open source database systems, Postgres is not the project of one company, but of many people working together on new versions, year after year. This includes everyone who tries out the beta and reports bugs to the Postgres project. We hope this article inspires you to do your own tests and benchmarks.

Now, I’m personally very excited about the improved connection scaling in Postgres 14. In this article, we do a detailed benchmark comparison of Postgres 13.3 and 14 beta1 (the benchmark chart in the original post plots the number of connections on a logarithmic scale).

  • Improved scaling for active and idle connections in Postgres 14
  • Use pg_backend_memory_contexts to dig deeper into memory usage
  • Use pg_stat_wal to track WAL activity
  • Monitor queries with the built-in Postgres query_id
  • More than 200 other improvements in Postgres 14!
  • Conclusion

Improved scaling for active and idle connections in Postgres 14

Postgres 14 brings significant improvements for those of us who need a large number of database connections. Postgres’s connection model relies on processes rather than threads. This has some important benefits, but it also carries overhead at high connection counts. Scaling of active and idle connections has been improved significantly in this new release, which will be a welcome change for the most demanding applications.

In our tests, we used two 96-vCore AWS instances (c5.24xlarge), one running Postgres 13.3 and one running Postgres 14 beta1. Both run Ubuntu 20.04 with default system settings, except that the Postgres connection limit was increased to 11,000 connections.
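
For reference, here is a minimal sketch of how such a connection limit can be raised; the exact method we used is not spelled out above, and the restart command is just one way to do it on Ubuntu:

-- Raise the connection limit (max_connections requires a restart to take effect)
ALTER SYSTEM SET max_connections = 11000;

-- After restarting Postgres (e.g. sudo systemctl restart postgresql), verify:
SHOW max_connections;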

We use pgbench to test scaling of active connections. To start, we initialize the database with pgbench at scale factor 200.

# Postgres 13.3
$ pgbench -i -s 200
...
done in 127.71 s (drop tables 0.02 s, create tables 0.02 s, client-side generate 81.74 s, vacuum 2.63 s, primary keys 43.30 s).

# Postgres 14 beta1
$ pgbench -i -s 200
...
done in 77.33 s (drop tables 0.02 s, create tables 0.02 s, client-side generate 48.19 s, vacuum 2.70 s, primary keys 26.40 s).

Already, we can see that Postgres 14 does a better job of initial data loading.

We now run read-only pgbench with different numbers of active connections, showing 5,000 concurrent connections here as an example of a very active workload.

# Postgres 13.3
$ pgbench -c 5000 -j 96 -M prepared ...
...

# Postgres 14 beta1
$ pgbench -c 5000 -j 96 -M prepared ...
tps = 495108.316805 (without initial connection time)

As you can see, Postgres 14 throughput is about 20% higher at 5000 active connections. At 10,000 active connections, this is a 50% improvement over Postgres 13, and you can also see consistent improvements at lower connection counts.

Note that when the number of connections exceeds the number of CPUs, you usually see a significant drop in TPS, which is most likely due to the overhead of CPU scheduling rather than a limitation of Postgres itself. Most workloads don’t actually have that many active connections, but they do often have a lot of idle connections.

Andres Freund, the author of this work, benchmarked the throughput of a single active query while 10,000 idle connections were open. That query went from about 15,000 TPS to nearly 35,000 TPS, more than twice as fast as on Postgres 13. You can find all the details in Andres Freund’s original post about these improvements.

Use pg_backend_memory_contexts to dig deeper into memory usage

Have you ever wondered why a certain Postgres connection uses more memory than expected? With the new pg_backend_memory_contexts view, you can take a closer look at what is currently allocated within your own Postgres backend process.

First, we can calculate how much total memory is being used for the current connection.

SELECT pg_size_pretty(SUM(used_bytes)) FROM pg_backend_memory_contexts;
 pg_size_pretty 
----------------
 939 kB
(1 row)

Now, let’s take a closer look. When we query the top five entries of this view by memory usage, you’ll see that there’s quite a bit of detail.

SELECT * FROM pg_backend_memory_contexts ORDER BY used_bytes DESC LIMIT 5;
          name           | ident |      parent      | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes 
-------------------------+-------+------------------+-------+-------------+---------------+------------+-------------+------------
 CacheMemoryContext      |       | TopMemoryContext |     1 |      524288 |             7 |      64176 |           0 |     460112
 Timezones               |       | TopMemoryContext |     1 |      104120 |             2 |       2616 |           0 |     101504
 TopMemoryContext        |       |                  |     0 |       68704 |             5 |      13952 |          12 |      54752
 WAL record construction |       | TopMemoryContext |     1 |       49768 |             2 |       6360 |           0 |      43408
 MessageContext          |       | TopMemoryContext |     1 |       65536 |             4 |      22824 |           0 |      42712
(5 rows)

A memory context in Postgres is an area of memory that is allocated for work such as query planning or query execution. Once Postgres is done with the work in a context, the whole context can be freed, which simplifies memory handling. Thanks to memory contexts, the Postgres source code mostly avoids manual free() calls (even though it’s written in C) and instead relies on contexts to clean up memory in bulk. The largest context here, CacheMemoryContext, is a child of the top-level TopMemoryContext and is used for many of the long-lived caches in Postgres.
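
To get a quick feel for this hierarchy, you can aggregate the view by parent context; this is just a convenience query on top of the columns shown above:

-- Sum up memory usage per parent context for the current connection
SELECT parent, COUNT(*) AS child_contexts, pg_size_pretty(SUM(used_bytes)) AS used
FROM pg_backend_memory_contexts
GROUP BY parent
ORDER BY SUM(used_bytes) DESC;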

We can illustrate the impact of additional tables being loaded into the connection’s cache by running a query on a new table and then querying the view again.

SELECT * FROM test3;
SELECT * FROM pg_backend_memory_contexts ORDER BY used_bytes DESC LIMIT 5;
          name           | ident |      parent      | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes 
-------------------------+-------+------------------+-------+-------------+---------------+------------+-------------+------------
 CacheMemoryContext      |       | TopMemoryContext |     1 |      524288 |             7 |      61680 |           1 |     462608
...

As you can see, simply querying a table on this connection leaves about 2 kB of additional memory in the cache, even after the query has finished. This caching of table metadata is intended to speed up future queries, but it can sometimes add up to a surprising memory footprint on multi-tenant databases with many different schemas. With this new monitoring view, you can now easily spot such issues.

If you want this information for a process other than the current one, you can use the new pg_log_backend_memory_contexts function, which causes the specified process to write its memory consumption to the Postgres log.

SELECT pg_log_backend_memory_contexts(10377);

LOG:  logging memory contexts of PID 10377
STATEMENT:  SELECT pg_log_backend_memory_contexts(pg_backend_pid());
LOG:  level: 0; TopMemoryContext: 80800 total in 6 blocks; 14432 free (5 chunks); 66368 used
LOG:  level: 1; pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 1408 free (0 chunks); 6784 used
LOG:  level: 1; TopTransactionContext: 8192 total in 1 blocks; 7720 free (1 chunks); 472 used
LOG:  level: 1; RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 chunks); 1312 used
LOG:  level: 1; MessageContext: 16384 total in 2 blocks; 5152 free (0 chunks); 11232 used
LOG:  level: 1; Operator class cache: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
LOG:  level: 1; smgr relation table: 16384 total in 2 blocks; 4544 free (3 chunks); 11840 used
LOG:  level: 1; TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used
...
LOG:  level: 1; ErrorContext: 8192 total in 1 blocks; 7928 free (3 chunks); 264 used
LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560 used

Use pg_stat_wal to track WAL activity

Building on the WAL monitoring capabilities added in Postgres 13, the new release brings a server-wide summary view of WAL information called pg_stat_wal.

You can use it to monitor WAL writes more easily.

SELECT * FROM pg_stat_wal;
-[ RECORD 1 ]----+------------------------------
wal_records      | 3334645
wal_fpi          | 8480
wal_bytes        | ...
wal_buffers_full | 799
wal_write        | 429769
wal_sync         | 428912
wal_write_time   | 0
wal_sync_time    | 0
stats_reset      | 2021-05-21 07:33:22.941452+00

With this new view, we can get summary information, such as how many full-page images (FPI) have been written to WAL, which tells you when Postgres is generating a large amount of WAL due to checkpoints. Second, the new wal_buffers_full counter lets you quickly see whether the wal_buffers setting is too low, which can cause unnecessary I/O that could be prevented by raising wal_buffers.
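
A minimal sketch of how you might act on this counter; the 64MB value is only an illustration, and wal_buffers requires a restart to change:

-- Check how often WAL buffers filled up since the last stats reset
SELECT wal_buffers_full, stats_reset FROM pg_stat_wal;

-- If the counter keeps climbing, consider raising wal_buffers (illustrative value)
ALTER SYSTEM SET wal_buffers = '64MB';

-- Reset the WAL statistics after the change to observe the effect
SELECT pg_stat_reset_shared('wal');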

You can also get more details on the I/O impact of WAL writes by enabling the optional track_wal_io_timing setting, which gives you exact I/O timings for WAL writes and for syncing WAL files to disk. Note that this setting has notable overhead, so it is best left off (the default) unless you need it.
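
If you do want to measure it for a limited time, a sketch like the following works, assuming superuser access:

-- Enable WAL I/O timing temporarily (remember to turn it off again afterwards)
ALTER SYSTEM SET track_wal_io_timing = on;
SELECT pg_reload_conf();

-- wal_write_time and wal_sync_time are reported in milliseconds
SELECT wal_write, wal_write_time, wal_sync, wal_sync_time FROM pg_stat_wal;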


Use the built-in Postgres query_id to monitor queries

In a recent TimescaleDB survey conducted in March and April 2021, the pg_stat_statements extension was rated one of the top three Postgres extensions used by the surveyed user base. pg_stat_statements is bundled with Postgres, and in Postgres 14 an important piece of its functionality has been moved into the Postgres core.

That functionality is the calculation of the query_id, a unique identifier for a query that ignores constant values. If you run the same query again, it will have the same query_id, which lets you identify workload patterns on the database. Previously, this information was only available through pg_stat_statements, which shows aggregate statistics about queries that have already executed, but it is now also available in pg_stat_activity as well as in log files.

First, we must enable the new compute_query_id setting and then restart Postgres.

ALTER SYSTEM SET compute_query_id = 'on';

Note that if you use pg_stat_statements, the query ID will be computed automatically with the default setting of compute_query_id = auto.
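
To verify that query IDs are being computed on your system, you can check the setting and your own session’s entry in pg_stat_activity, for example:

SHOW compute_query_id;

-- The current session should now report a query_id for its own query
SELECT query_id, query FROM pg_stat_activity WHERE pid = pg_backend_pid();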

With query IDs enabled, we can look at pg_stat_activity during a pgbench run to see why this is more helpful than just looking at the query text.

SELECT query, query_id FROM pg_stat_activity WHERE backend_type = 'client backend' LIMIT 5;
                                  query                                  |      query_id      
-------------------------------------------------------------------------+--------------------
 UPDATE pgbench_tellers SET tbalance = tbalance + -4416 WHERE tid = 3;   | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + -2979 WHERE tid = 10;  | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + 2560 WHERE tid = 6;    | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + -65 WHERE tid = 7;     | 885704527939071629
 UPDATE pgbench_tellers SET tbalance = tbalance + -136 WHERE tid = 9;    | 885704527939071629
(5 rows)

From the application’s perspective, all of these queries are the same, but their text differs slightly, which makes it hard to find patterns in the workload. With query IDs, however, we can clearly count queries of a certain type and assess performance issues much more easily. For example, we can group by query_id to see what is keeping the database busy.

SELECT COUNT(*), state, query_id FROM pg_stat_activity WHERE backend_type = 'client backend' GROUP BY 2, 3;
 count | state  |       query_id       
-------+--------+----------------------
    40 | active |   885704527939071629
     9 | active |  7660508830961861980
     1 | active | -7810315603562552972
     1 | active | -3907106720789821134
(4 rows)

When you run this on your own system, you will likely see different query IDs than the ones shown here. That is because the query ID is based on Postgres’s internal representation of the query, which can be schema dependent and references the internal IDs of tables rather than their names. Query ID information is also available in log_line_prefix with the new %Q option, which makes it easier to relate auto_explain output to a query:

2021-05-21 08:18:02.949 UTC [7176] [user=postgres,db=postgres,app=pgbench,query=885704527939071629] LOG:  duration: 59.827 ms  plan:
	Query Text: UPDATE pgbench_tellers SET tbalance = tbalance + -1902 WHERE tid = 6;
	Update on pgbench_tellers  (cost=4.14... rows=0 width=0) (actual time=... rows=0 loops=1)
	  ->  Bitmap Heap Scan on pgbench_tellers  (cost=4.14..8.16 rows=1 width=10) (actual time=0.009..0.011 rows=1 loops=1)
	        Recheck Cond: (tid = 6)
	        Heap Blocks: exact=1
	        ->  Bitmap Index Scan on pgbench_tellers_pkey  (cost=0.00... rows=1 width=0) (actual time=...0.004 rows=1 loops=1)
	              Index Cond: (tid = 6)
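
The article does not spell out the prefix used for the log line above, but a configuration along these lines would produce a similar format; treat the exact prefix string as an assumption:

-- Include the query ID (%Q) in every log line, alongside user, database and application
ALTER SYSTEM SET log_line_prefix = '%m [%p] [user=%u,db=%d,app=%a,query=%Q] ';
SELECT pg_reload_conf();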

Want to associate auto_explain and pg_stat_statements, but can’t wait for Postgres 14?

We have built our own open-source query fingerprinting mechanism that uniquely identifies queries based on their text. It is used in pganalyze to match EXPLAIN plans to queries, and you can also use it in your own scripts, for any Postgres version.

There are more than 200 other improvements in Postgres 14!

These are just some of the many improvements in the new Postgres release. You can find many more in the release notes, for example:

  • New predefined roles pg_read_all_data / pg_write_all_data grant global read or write access
  • If the client is disconnected, the long-running query is automatically cancelled
  • Vacuum will now skip index vacuuming when the number of index items to remove is low.
  • The information for each index is now included in the auto-vacuuming log output
  • ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY detaches partitions in a non-blocking manner (see the sketch below)
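
As an illustration of that last item, a minimal sketch with hypothetical table names:

-- Detach a partition without blocking concurrent queries on the parent table
-- (this form cannot be run inside a transaction block)
ALTER TABLE measurements DETACH PARTITION measurements_2020 CONCURRENTLY;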

There’s more. Now it’s time to help test!

Download beta1 from the official package repositories, or build it from source. We can all contribute to making Postgres 14 a stable release in a few months.

Conclusion

At pganalyze, we are excited about Postgres 14, and we hope this article got you interested as well! Once again, Postgres delivers many small improvements that add up to a stable, trustworthy database built by the community, for the community.