PostgreSQL contains a number of important features. Many of them are very well known. Others can be very useful, but not widely appreciated. Here are our preferred PostgreSQL features that you probably haven’t looked at carefully, but should, because they can help you get your code into production faster, make things easier, and often get things done with less code and effort.

Publish/subscribe notifications

PostgreSQL comes with a simple, non-persistent topic-based publish-subscribe notification system. It’s not Kafka, but the functionality does support common use cases.

Messages about a particular topic can be broadcast to subscribers who are listening on all connections to that topic. These messages are pushed by the Postgres server to the listening client. Polling is not required, but your database driver should support asynchronous delivery of notifications to your application.

The notification consists of the topic name and payload (up to approximately 8000 characters). The payload is usually a JSON string, but of course it can be anything. You can use the NOTIFY command to send notifications:

NOTIFY'foo_events', '{userID: 42, 'action' : 'grok'}'Copy the code

Or pg_notify () :

SELECT pg_notify ('foo_events', '{" userID ": 42," action ":" grok "}');Copy the code

Subscription takes place in the LISTEN command, but you usually have to use driver-specific apis. Here’s an example of the go version.


ljwheyxy

Other Translations (1)

Table inheritance

Let’s say we have a sheet called invoices. You now want to support “government invoices”, which add some fields on the original invoices. How do you model it? Do I add several nullable fields on the invoices or a nullable JSON field? Try inheritance:

CREATE TABLE invoices (
    invoice_number   int  NOT NULL PRIMARY KEY,
    issued_on        date NOT NULL DEFAULT now()
);

CREATE TABLE government_invoices (
    department_id    text NOT NULL
) INHERITS (invoices);Copy the code

The above model reflects the situation that government invoice is invoice but has more attributes than invoice. The above “government_invoices” table has 3 columns:

test=# \d invoices Table "public.invoices " Column | Type | Collation | Nullable | Default ----------------+---------+-----------+----------+--------- invoice_number | integer | | not null | issued_on | date | |  not null | now() Indexes: "invoices_pkey " PRIMARY KEY, btree (invoice_number) Number of child tables: 1 (Use \d+ to list them.) test=# \d government_invoices Table "public.government_invoices " Column | Type | Collation | Nullable | Default ----------------+---------+-----------+----------+--------- invoice_number | integer | | not null | issued_on | date | | not null | now() department_id | text | | not null | Inherits: invoicesCopy the code

Add rows to it as if it were a standalone table:

INSERT INTO invoices (invoice_number) VALUES (100);

INSERT INTO government_invoices
    (invoice_number, department_id) VALUES (101, 'DOD');Copy the code

But look at the situation with SELECT:

test=# SELECT * FROM government_invoices;
 invoice_number | issued_on  | department_id
----------------+------------+---------------
            101 | 2018-06-19 | DOD
(1 row)

test=# SELECT * FROM invoices;
 invoice_number | issued_on
----------------+------------
            100 | 2018-06-19
            101 | 2018-06-19
(2 rows)Copy the code

The invoice number 101 added to the child table is also listed in the parent table. The advantage of this is that the various algorithms In the parent table completely ignore the existence of child tables.

Learn more about PostgreSQL inheritance from this document.


The book

Other Translations (1)

External data wrappers

Did you know that you can have a virtual table that points to another PostgreSQL instance? Or another SQLite, MongoDB, Redis or even another database? This feature, called the External Data Wrapper (FDW), provides a standardized way to access and manipulate external data sources connected to the Postgres server. There are various FDW implementations that allow you to connect to different data sources, and they are often packaged as extensions.

The standard Postgres distribution includes a Postgres_FDW extension that allows you to connect to other Postgres servers. For example, you can move a large table to another server while creating a virtual table locally (the correct term is “external table “):

-- install the extension (required only once)
CREATE EXTENSION postgres_fdw;

-- big_server is our big, remote server with the migrated table
CREATE SERVER big_server FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host '10.1.2.3', dbname 'big_db');

-- create a user mapping for the app user
CREATE USER MAPPING FOR app_user SERVER big_server
  OPTIONS (user 'remote_user', password 'remote_pass');

-- and finally create the virtual table invoices -> big_db.public.invoices
CREATE FOREIGN TABLE invoices (
  invoice_num int NOT NULL PRIMARY KEY
  -- other columns omitted for brevity
) SERVER big_server;Copy the code

This Wiki has a good list of many effective implementations of FDW.

In addition to accessing data from other servers, FDW is also used to implement an interactive storage layer, such as cstore_FDW.

There is also a DBLINK extension, which is another implementation for accessing remote PostgreSQL data.


xiaoaiwhc1

Down the table

Starting with version 10, PostgreSQL has native support for splitting a table into subtables based on a calculation of one or more columns of data. This feature allows a large table to be physically stored across multiple tables, improving DML performance and storage management.

Here’s how to create a split table, which adds a table to each month’s data:

-- the parent table CREATE TABLE invoices ( invoice_number int NOT NULL, issued_on date NOT NULL DEFAULT now() ) PARTITION BY RANGE (issued_on); -- table for the month of May 2018 CREATE TABLE invoices_2018_05 PARTITION OF invoices FOR VALUES FROM ('2018-05-01') TO (' 2018-06-01 '); -- table for the month of June 2018 CREATE TABLE invoices_2018_06 PARTITION OF invoices FOR VALUES FROM ('2018-06-01') TO ('2018-07-01');Copy the code

Subtables must be created manually or programmatically, and this creation process does not occur automatically.

You can query or insert data into a parent table. PostgreSQL automatically inserts data into a child table.

Insert two rows of data:

test=# INSERT INTO invoices VALUES (10042, '2018-05-15');
INSERT 0 1
test=# INSERT INTO invoices VALUES (43029, '2018-06-15');
INSERT 0 1Copy the code

You can see that the data is actually inserted into the child table:

test=# SELECT * FROM invoices_2018_05;
 invoice_number | issued_on
----------------+------------
          10042 | 2018-05-15
(1 row)

test=# SELECT * FROM invoices_2018_06;
 invoice_number | issued_on
----------------+------------
          43029 | 2018-06-15
(1 row)Copy the code

But you can also complete the query in the parent table and return the result of the merge:

test=# SELECT * FROM invoices;
 invoice_number | issued_on
----------------+------------
          10042 | 2018-05-15
          43029 | 2018-06-15
(2 rows)Copy the code

The split method is similar to inheritance (querying at the parent table level), but there are some differences (for example, no data is saved in the split parent table). You can read more about this in this document.

PostgreSQL 11, which is already in Beta, will improve this feature, as described in this article.


The book

Other Translations (1)

Interval type

Have you dealt with temperature ranges, schedules, price ranges, or similar numerical ranges before? If so, you have the experience that seemingly simple problems cause you to scratch your head and debug bugs late at night. Here is a table with interval columns and some values:

CREATE TABLE prices ( item text, price int4range -- int4range is a range of regular integers ); INSERT INTO prices VALUES ('mouse', '[10,16)'); INSERT INTO prices VALUES ('mouse', '[10,16)'); INSERT INTO prices VALUES ('joystick', '55 ');Copy the code

The numbers in the wrong formula brackets represent the half-open interval. Here is a query that finds all items in the price range from $15 to $30, using the && operator (interval interleaving):

Test =# SELECT * FROM price&&int4range (15,30); The item | price -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- mouse | [10 dec) the rid_device_info_keyboard | [20, 31) (2 rows)Copy the code

To impress you, try how hard it is to use an interval-free query (just try).

Interval types are very powerful – there are also operators, functions, and you can define your own interval types and even index them.

To learn more about intervals, you can check out this article, and this one.


xiaoaiwhc1

Other Translations (1)

An array type

PostgreSQL has supported array types for a long time. Array types streamline application code and simplify queries. Here is an example of using an array column in a table:

CREATE TABLE posts (
    title text NOT NULL PRIMARY KEY,
    tags  text[]
);Copy the code

Assuming that each line represents a blog and each blog has a set of tags, here’s how we would list all the blogs with the “Postgres” and “go “tags:

test=# SELECT title, tags FROM posts WHERE '{"postgres ", "go "}' <@ tags;
               title               |          tags
-----------------------------------+------------------------
 Writing PostgreSQL Triggers in Go | {postgres,triggers,go}
(1 row)Copy the code

The use of array types here makes our data model more precise and also simplifies the query operation. The Postgres array always comes with operators and functions, including collection functions. You can also create indexes based on array expressions. Here’s an article on how to use arrays in Go.


xiaoaiwhc1

The trigger

When inserting, updating, or deleting rows from a table, you can ask PostgreSQL to execute a special function that can even modify values during insertion. You can learn more about triggers here. Here’s an example: When a user is created, the trigger issues a notification and writes to the audit log.

-- a table of users
CREATE TABLE users (
  username text NOT NULL PRIMARY KEY
);

-- an audit log
CREATE TABLE audit_log (
  at          timestamptz NOT NULL DEFAULT now(),
  description text NOT NULL
);

-- the actual function that is executed per insert
CREATE FUNCTION on_user_added() RETURNS TRIGGER AS ?
BEGIN
  IF (TG_OP = 'INSERT') THEN
    -- add an entry into the audit log
    INSERT INTO audit_log (description)
        VALUES ('new user created, username is ' || NEW.username);
    -- send a notification
    PERFORM pg_notify('usercreated', NEW.username);
  END IF;
  RETURN NULL;
END;
? LANGUAGE plpgsql;

-- set the function as an insert trigger
CREATE TRIGGER on_user_added AFTER INSERT ON users
  FOR EACH ROW EXECUTE PROCEDURE on_user_added();Copy the code

Now, if you try to add a new user, an audit log will be added automatically.

test=# INSERT INTO users VALUES ('alice'); INSERT 0 1 test=# SELECT * FROM audit_log; At | description -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- the 2018-06-19 04:00:30. 672947 + 00 | new user created, username is alice (1 row)Copy the code


xiaoaiwhc1

Other Translations (1)

pg_stat_statements

Pg_stat_statements is an extension that is included in the PostgreSQL distribution by default, but is not enabled by default. This extension records health information for each execution statement, including execution duration, memory usage, disk IO initialization, and so on. It is an indispensable extension for scenarios where you need to understand and debug query performance.

The overhead of installing and enabling this extension is minimal, and it is easy to use, so there is no reason not to use this extension in your production server.


xiaoaiwhc1

Hashes, GIN and BRIN indexes

The default index type in PostgreSQL is B-tree, and there are other types as well. Other index types are useful in very unusual situations. In particular, setting up hashes, GIN and BRIN type indexes may just solve your performance problems:

  • Hash: Unlike b-tree indexes with inherent sorting, hash indexes are unordered and can only perform equality matching (lookups). However, hash indexes take up less space and are faster than equally-matched B-trees. (Also, note that prior to PostgreSQL 10, it was not possible to copy hash indexes; They are not recorded.)

  • GIN: GIN is an inverted index that basically allows multiple values for a single key. GIN indexes are useful for indexed arrays, JSON, ranges, full-text searches, and more.

  • BRIN: If your data has a particular natural order (such as time series data), and your queries typically only apply to a small range of them, then a BRIN index can speed up queries with little overhead. The BRIN index maintains the scope of each data block, allowing the optimizer to skip blocks containing rows that are not selected by the query. Start reading about PostgreSQL index types here.


lnovonl

Other Translations (1)

Full-text search

PostgreSQL also supports full-text search well, even in languages other than English. Here is a step-by-step tutorial on how to create a full-text search query in Go based on PostgreSQL.


xiaoaiwhc1

Other Translations (2)