By Alexey Vasiliev
Shi Yanliang, working in Hangao Basic Software Co., LTD., PostgreSQL database technology enthusiast, PostgreSQL ACE, PGCM, 10G & 11G OCM, OGG certification expert.
The original address: leopard. In. Ua / 2013/09/02 /…
In this article, you’ll learn how to use PostgreSQL’s LTREE module, which allows you to store data in a hierarchical tree structure.
What is LTREE?
Ltree is a PostgreSQL module. It implements a data type, LTREE, used to represent labels for data stored in a hierarchical tree structure. Provides extensive tools for searching tag trees.
Why ltREE?
- Ltree implements a materialized path that is fast for INSERT/UPDATE/DELETE and fast for SELECT operations
- In general, it is faster than using recursive Ctes or recursive functions that often require recalculation of branches
- Such as built-in query syntax and operators dedicated to query and navigation trees
- Index!!
The initial data
First, you should enable extensions in the database. You can perform this operation by running the following command:
CREATE EXTENSION ltree;
Copy the code
Let’s create a table and add some data to it:
CREATE TABLE comments (user_id integer, description text, path ltree); INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001'); INSERT INTO comments (user_id, description, path) VALUES (2, MD5 (random()::text), '0001.0001.0001'); INSERT INTO comments (user_id, description, path) VALUES (2, MD5 (random()::text), '0001.0001.0001.0001'); INSERT INTO comments (user_id, description, path) VALUES (2, MD5 (random()::text), '0001.0001.0001.0001'); INSERT INTO comments (user_id, description, path) VALUES (1, MD5 (random()::text), '0001.0001.0001.0002'); INSERT INTO comments (user_id, description, path) VALUES (1, MD5 (random()::text), '0001.0001.0001.0002'); INSERT INTO comments (user_id, description, path) VALUES (5, MD5 (random()::text), '0001.0001.0001.0001.0003 '); INSERT INTO comments (user_id, description, path) VALUES (5, MD5 (random()::text),' 0001.0001.0001.0001.0003 '); INSERT INTO comments (user_id, description, path) VALUES (6, MD5 (random()::text), '0001.0002'); INSERT INTO comments (user_id, description, path) VALUES (6, MD5 (random()::text), '0001.0002'); INSERT INTO comments (user_id, description, path) VALUES (6, MD5 (random()::text), '0001.0002.0001'); INSERT INTO comments (user_id, description, path) VALUES (6, MD5 (random()::text), '0001.0002.0001'); INSERT INTO comments (user_id, description, path) VALUES (6, MD5 (random()::text), '0001.0003'); INSERT INTO comments (user_id, description, path) VALUES (6, MD5 (random()::text), '0001.0003'); INSERT INTO comments (user_id, description, path) VALUES (8, MD5 (random()::text), '0001.0003.0001'); INSERT INTO comments (user_id, description, path) VALUES (8, MD5 (random()::text), '0001.0003.0001'); INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0002'); INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0002'); INSERT INTO comments (user_id, description, path) VALUES (11, MD5 (random()::text), '0001.0003.0002.0001'); INSERT INTO comments (user_id, description, path) VALUES (11, MD5 (random()::text), '0001.0003.0002.0001'); INSERT INTO comments (user_id, description, path) VALUES (2, MD5 (random()::text), '0001.0003.0002.0002'); INSERT INTO comments (user_id, description, path) VALUES (5, MD5 (random()::text), '0001.0003.0002.0003'); INSERT INTO comments (user_id, description, path) VALUES (7, MD5 (random()::text), '0001.0003.0002.0002.0001'); INSERT INTO comments (user_id, description, path) VALUES (7, MD5 (random()::text), '0001.0003.0002.0002.0001'); INSERT INTO comments (user_id, description, path) VALUES (20, MD5 (random()::text), '0001.0003.0002.0002.0002'); INSERT INTO comments (user_id, description, path) VALUES (31, MD5 (random()::text), '0001.0003.0002.0002.0003'); INSERT INTO comments (user_id, description, path) VALUES (31, MD5 (random()::text), '0001.0003.0002.0002.0003'); INSERT INTO comments (user_id, description, path) VALUES (22, MD5 (random()::text), '0001.0003.0002.0002.0004'); INSERT INTO comments (user_id, description, path) VALUES (22, MD5 (random()::text), '0001.0003.0002.0002.0004'); INSERT INTO comments (user_id, description, path) VALUES (34, MD5 (random()::text), '0001.0003.0002.0002.0005'); INSERT INTO comments (user_id, description, path) VALUES (34, MD5 (random()::text), '0001.0003.0002.0002.0005'); INSERT INTO comments (user_id, description, path) VALUES (22, MD5 (random()::text), '0001.0003.0002.0002.0006'); INSERT INTO comments (user_id, description, path) VALUES (22, MD5 (random()::text), '0001.0003.0002.0002.0006');Copy the code
Also, we should add some indexes:
CREATE INDEX path_gist_comments_idx ON comments USING GIST(path);
CREATE INDEX path_comments_idx ON comments USING btree(path);
Copy the code
As you can see, I set up the Comments table with a PATH field that contains all the tree paths for the table. As you can see, FOR the tree delimiter, I use four numbers and dots.
Let’s find the commenets table with path as’ 0001.0003 ‘:
$SELECT user_id, path FROM comments WHERE path <@ '0001.0003'; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6 0001.0003 8 | | 0001.0003.0001 9 | | 0001.0003.0002 11 7 | 0001.0003.0002.0001 2 | | 0001.0003.0002.0002 5 0001.0003.0002.0003 0001.0003.0002.0002.0001 20 | 0001.0003.0002.0002.0002 31 | | 0001.0003.0002.0002.0003 22 0001.0003.0002.0002.0004 34 | | 0001.0003.0002.0002.0005 22 0001.0003.0002.0002.0006 (12 rows)Copy the code
Let’s examine this SQL with the EXPLAIN command:
$EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003'; QUERY PLAN ---------------------------------------------------------------------------------------------------- Seq Scan On comments (cost = 0.00.. Rows =2 width= 50) (actual time=0.013.. Loops =1) Filter: (path <@ '0001.0003'::ltree) Rows Removed by Filter: 7 Total Runtime: 0.037ms (4 rows)Copy the code
Let’s disable seQ Scan for testing:
$ SET enable_seqscan=false; SET $EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ ----------- Index Scan using path_gist_comments_idx on comments (cost=0.00.. Rows =2 width=38) (actual time=0.023.. Loops =1) Index Cond: (path <@ '0001.0003'::ltree) Total Runtime: 0.0004ms (3 rows)Copy the code
Now SQL is slow, but you can see how SQL uses index. The first SQL statement uses sequence Scan because there is not much data in the table.
Select path <@ ‘0001.0003’;
$SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*'; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6 0001.0003 8 | | 0001.0003.0001 9 | | 0001.0003.0002 11 7 | 0001.0003.0002.0001 2 | | 0001.0003.0002.0002 5 0001.0003.0002.0003 0001.0003.0002.0002.0001 20 | 0001.0003.0002.0002.0002 31 | | 0001.0003.0002.0002.0003 22 0001.0003.0002.0002.0004 34 | | 0001.0003.0002.0002.0005 22 0001.0003.0002.0002.0006 (12 rows)Copy the code
You should never forget the order of the data, as in the following example:
$INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0001.0001'); $INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0001.0001'); $INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0001.0002'); $INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0001.0002'); $INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0001.0003'); $INSERT INTO comments (user_id, description, path) VALUES (9, MD5 (random()::text), '0001.0003.0001.0003'); $SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*'; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6 0001.0003 8 | | 0001.0003.0001 9 | | 0001.0003.0002 11 7 | 0001.0003.0002.0001 2 | | 0001.0003.0002.0002 5 0001.0003.0002.0003 0001.0003.0002.0002.0001 20 | 0001.0003.0002.0002.0002 31 | | 0001.0003.0002.0002.0003 22 0001.0003.0002.0002.0004 34 | | 0001.0003.0002.0002.0005 22 0001.0003.0002.0002.0006 9 | | 0001.0003.0001.0001 9 0001.0003.0001.0002 9 | 0001.0003.0001.0003 (15 rows)Copy the code
Now sort:
$SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6 0001.0003 8 | | 0001.0003.0001 9 | | 0001.0003.0001.0001 9 0001.0003.0001.0002 9 | | 0001.0003.0001.0003 9 0001.0003.0002 11 | 2 | 0001.0003.0002.0001 0001.0003.0002.0002 7 | 0001.0003.0002.0002.0001 20 | 0001.0003.0002.0002.0002 31 | | 0001.0003.0002.0002.0003 22 0001.0003.0002.0002.0004 34 | 0001.0003.0002.0002.0005 22 | | 0001.0003.0002.0002.0006 5 0001.0003.0002.0003 (15 rows)Copy the code
Several modifiers can be added to the end of an lQuery’s non-asterisk tag to make it a better match than a full match:
* “@” – case insensitive matching, for example a @ matches a
“*” – matches any label with that prefix, such as foo * matches foobar
“%” – matches words that begin with an underscore
$SELECT user_id, path FROM comments WHERE the path ~ '0001. * {1, 2} in 0001 | 0002. *' ORDER by path; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - 2 | 2 | 0001.0001.0001 0001.0001.0001.0001 1 | 0001.0001.0001.0002 5 8 | | | 0001.0001.0001.0003 6 0001.0002.0001 0001.0003.0001 9 | | 0001.0003.0001.0001 9 0001.0003.0001.0002 | 9 0001.0003.0001.0003 9 | 0001.0003.0002 11 | 2 | 0001.0003.0002.0001 0001.0003.0002.0002 7 | 0001.0003.0002.0002.0001 20 34 | | 0001.0003.0002.0002.0002 31 | | 0001.0003.0002.0002.0003 22 0001.0003.0002.0002.0004 0001.0003.0002.0002.0005 22 5 | | 0001.0003.0002.0002.0006 0001.0003.0002.0003 rows (19)Copy the code
Find all direct childrens for parent ‘0001.0003’, see below:
$SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*{1}' ORDER by path; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - 8 | 0001.0003.0001 9 | 0001.0003.0002 (2 rows)Copy the code
Find all childrens for parent ‘0001.0003’ as follows:
$SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6 0001.0003 8 | | 0001.0003.0001 9 | | 0001.0003.0001.0001 9 0001.0003.0001.0002 9 | | 0001.0003.0001.0003 9 0001.0003.0002 11 | 2 | 0001.0003.0002.0001 0001.0003.0002.0002 7 | 0001.0003.0002.0002.0001 20 | 0001.0003.0002.0002.0002 31 | | 0001.0003.0002.0002.0003 22 0001.0003.0002.0002.0004 34 | 0001.0003.0002.0002.0005 22 | | 0001.0003.0002.0002.0006 5 0001.0003.0002.0003 (15 rows)Copy the code
Find parent for children ‘0001.0003.0002.0002.0005’ :
$SELECT user_id, path FROM comments WHERE path = subpath('0001.0003.0002.0002.0005', 0, -1) ORDER by path; User_id | path -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - 2 | 0001.0003.0002.0002 row (1)Copy the code
If your path is not unique, you will get multiple records.
An overview of the
As you can see, the materialization path using LTREE is very simple. In this article, I have not listed all possible uses of LTREE. It is not considered a full-text search problem ltXtQuery. But you can be in PostgreSQL official documentation (www.postgresql.org/docs/curren)… Find it in.
For more PostgreSQL news and events, please visit the Official PostgreSQL China website:
www.postgresqlchina.com
For more information about PostgreSQL, please visit the Official Chinese PostgreSQL Q&A community:
www.pgfans.cn
For more information about PostgreSQL, please visit the Official PostgreSQL download website in China:
www.postgreshub.cn