Some suggestions for using PostgreSQL

The original address: wiki.postgresql.org/wiki/Don%27…

Translation address: github.com/watermelo/d…

Translator: Khaki Khaki

Translator level is limited, if there is translation or understanding fallacy, please help to point out

1. Database coding

1.1 Do not use SQL_ASCII encoding

1.1.1 why

Although the name may seem to have something to do with ASCII, it’s not. Instead, it simply disallows the use of null bytes.

More importantly, SQL_ASCII means “no conversion” for all encoding conversion functions. That is, whatever the raw byte encoding is. Unless special care is taken, sql_ASCIi-encoded databases may end up storing mixed data with many different encoditions and may not be able to reliably recover the original characters.

1.1.2 When will it be available

If your input data is already encoded mixed data, such as IRC logs or non-MIME-compatible emails, then SQL_ASCII may be the last resort – byTEA encoding should be considered first, or you can detect whether it is encoded UTF8, if not UTF8, For example, WIN1252 encoded data can be assumed to be UTF8 encoded.

2. The tool

2.1 Do not use PSQL -w or PSQL –password

Do not use PSQL -w or PSQL –password

2.1.1 why

If you use the –password or -w flag to connect to the service, PSQL will prompt you for a password – so it will prompt you for a password even if the server doesn’t need one.

This option is unnecessary, as it can give the impression that the server needs a password. If the user you are logging in to does not have a password, or if you enter the wrong password when prompted, you will still successfully log in and assume that this is the correct password – but you cannot use this password to log in from another client (via localhost connection) or as another user.

2.1.2 When can it be used

Do not use.

2.2 Do not use RULE

Do not use a RULE. CREATE RULE defines a new RULE that applies to a particular table or view. If you do, use triggers instead.

2.2.1 why

RULE is powerful, but not easy to understand. It looks like some conditional logic, but it actually rewrites the query or adds other queries to the query.

This implies that all the rules of non-trivial are incorrect. (For a non-trivial definition)

Depesz has more to say on this.

2.2.2 When can it be used

Do not use.

2.3 Do not use table inheritance

Do not use table inheritance; if you do, use foreign keys instead.

2.3.1 why

Table inheritance, in which databases are tightly coupled to object-oriented code, is a fashionable concept. As it turns out, these coupling things don’t actually produce the desired results.

2.3.2 When can it be used

Hardly ever use… About the same. Table partitioning is now done locally, and the common scenario of table inheritance has been replaced by features.

3. The SQL statement

3.1 Do NOT use NOT IN

Do NOT use NOT IN, or any combination of NOT and IN, such as NOT(x IN (select…). ).

(If you think you want NOT IN (select…) Then you should use NOT EXISTS instead.

3.1.1 why

Two reasons:

If NULL is present, NOT IN runs IN unexpected ways:

select * from foo where col not in (1,null);
  -- always returns 0 rows

select * from foo where col not in (select x from bar);
  -- returns 0 rows if any value of bar.x is null
Copy the code

This happens because col IN(1, null) returns TRUE if col = 1, null otherwise (that is, it never returns FALSE). NOT(col IN(1, NULL)) (same as col NOT IN(1, NULL)) cannot return TRUE, That is, the form NOT IN (1, NULL) never returns data.

NOT IN (SELECT…) It doesn’t optimize very well. In particular, the planner (which generates the query plan) cannot convert it to anti-join, so it becomes a hash subplan or plain subplan. Hash subprogramming is fast, but the planner only allows the plan to be used for small result sets; Ordinary subplans are very slow (O(N²) time complexity, in fact). This means that performance looks good in small tests, but slows down by five or more orders of magnitude once the data is large; We don’t want that to happen.

3.1.2 When will it be available

NOT IN (list, of, values, …) This is only a problem if there are null values (parameters or otherwise) in the list. So it can be used to exclude null values.

3.2 Do not use uppercase names

Don’t use NamesLikeThis, use names_like_this.

3.2.1 why

PostgreSQL converts the names of tables, columns, functions, etc., to lowercase unless they are enclosed in “double quotes”.

So create table Foo() will create a table named Foo, and create table “Bar”() will create a table named Bar.

These queries will execute normally: select * from Foo, select * from Foo, select * from “Bar”

These query statements will error “no such table” : select * from “Foo”, select * from Bar, select * from Bar

This means that if you use uppercase letters in table or column names, you must use double quotation marks when querying. This is annoying, but it can be confusing when you access databases using other tools that use double quotes for some names and don’t.

Stick to a-z, 0-9, and underscores for names and don’t worry anymore.

3.2.2 When can it be used

If it’s important to have nice names in your output, you might want to use uppercase letters. But you can also use column aliases, or you can print nice names in the query: select CHARACTER_name as “Character Name” from foo.

3.3 Don’t use BETWEEN (especially for timestamps)

3.3.1 why

BETWEEN uses closed interval comparisons: values at both ends of the range are included in the result.

This is a query question

SELECT * FROM blah WHERE timestampcol BETWEEN '2018-06-01' AND '2018-06-08'
Copy the code

This will include results with a timestamp of exactly 2018-06-08 00:00:00.000000. The query works, but because it is a closed interval, it is possible that the next query will contain the value at that moment (for example, ‘2018-06-08’ AND ‘2018-06-09’ will contain the value at that moment).

Replace with the following statement

SELECT * FROM blah WHERE timestampcol >= '2018-06-01' AND timestampcol < '2018-06-08'
Copy the code

3.3.2 When can it be used

BETWEEN is safe for discrete data such as integers or dates. Remember that BETWEEN is a closed interval. But using BETWEEN can be a bad habit.

4. Date/time storage

(Translator’s Note: Chinese version of date/time)

4.1 Do not use timestamp (without time zone)

Instead of using the timestamp type to store timestamps, use timestamptz (also known as timezone timestamps).

4.4.1 why

Timestamptz records microseconds of UTC. You can insert values for any time zone. By default, it will display the time in the current time zone, but you can switch to another time zone.

Because it stores timestamp information, algorithms can be used to convert timestamps to different time zones.

Timestamp (also known as a timezone free timestamp) does nothing, it just stores the date and time you provide. You can think of it as a picture of a calendar and clock, not a point in time, with no time zone information. Therefore, timestamps without time zones cannot be converted.

So if you want to store a point in time instead of a clock image, use timestamptz.

More about Timestamptz

4.1.2 When will it be available

Timestamps may be appropriate if you treat them in an abstract way, or just for app saving and retrieval without timing them.

4.2 Do not use timestamp (without time zone) to store UTC time

Storing UTC values in a timestamp without a time zone is often the practice of inheriting data from other databases that lack available time zone support.

Instead, use timestamp with time zone (timestamptz).

2 why

Because the database does not know if it is a UTC time zone.

This complicates time calculations. For example, “the last midnight of a given timezone u.timezone” would be calculated as follows:

date_trunc('day', now() AT TIME ZONE u.timezone) AT TIME ZONE u.timezone AT TIME ZONE 'UTC'
Copy the code

And the calculation statement of “midnight before x.datecol date in u.domain” is:

date_trunc('day', x.datecol AT TIME ZONE 'UTC' AT TIME ZONE u.timezone)
  AT TIME ZONE u.timezone AT TIME ZONE 'UTC'
Copy the code

4.2.2 When will it be available

If compatibility with non-time zone supported databases prevails over all other considerations.

4.3 Do not use Timetz

Do not use the timetz type; use timestamptz instead.

4.3.1 why

Even the manual tells you that it is implemented only to comply with the SQL standard.

A time type with a time zone is defined by the SQL standard, but the attributes displayed by the definition raise questions about its availability. In most cases, the combination of date, time, timestamps without time zone, and timestamps with time zone should provide the date/time functionality required by any application.

4.3.2 When can it be used

Never used.

4.4 Do not use CURRENT_TIME

Do not use the CURRENT_TIME function. Using the following is appropriate:

CURRENT_TIMESTAMP or now() if you wanttimestamp with time zone
LOCALTIMESTAMP if you wanttimestamp without time zone
CURRENT_DATE if you wantdate
LOCALTIME if you wanttime

4.4.1 why

It returns a value of type Timetz, as explained above.

4.4.2 When will it be available

Never used.

4.5 Do not use timestamp(0) or timestamptz(0)

Do not use timestamp() or timestamptz() for timestamp conversions (especially 0).

Use date_trunc(‘second’, blah) instead.

4.5.1 why

Because it’s going to round off the decimal part. This can cause unexpected problems; Consider that when you store now() in such a column, you might store a fractional second value in the future.

4.5.2 When can IT be used

Never used.

5. Text storage

5.1 Do not use char(n)

Instead of using char(n), text might be more appropriate.

5.1.1 why

Fields of type char(n) are used, padding the declared length with Spaces if the length is insufficient. This may not be what you want.

The name	describe
character varying(n), varchar(n)	It gets longer. There’s a length constraint
character(n), char(n)	Fixed length, insufficient fill blank
text	Variable length, unlimited length

The manual says:

Values of type CHAR are physically filled with whitespace up to the specified length n, and stored and displayed this way. However, when comparing two values of type CHAR, the trailing whitespace is insignificant and should not be ignored. In collations where whitespace is important, this behavior can result in unexpected results, such as SELECT ‘a ‘::CHAR(2) collate “C” < ‘a\n’::CHAR(2) returns true. When a char value is converted to another string type, the whitespace behind it is removed. Note that in vARCHar and text values, the trailing whitespace is semantic. And when using pattern matching, such as LIKE, use regular expressions.

Padding does waste space and doesn’t make things faster; In fact, in a lot of cases we have to get rid of whitespace.

Tip: There is no performance difference between the three types, except for increased storage space when using the fill blank type, and some extra CPU cycles to check the hold-in length when storing length constrained columns. While char(n) has some performance advantages on some other database systems, it does not in PostgreSQL. In fact, char(n) is usually the slowest of the three because of the extra storage cost. In most cases, text or VARCHar should be used.

5.1.2 When can it be used

When you port very, very old software that uses fixed-width fields. Or when you read the snippet from the manual above and think “Yes, this is perfectly reasonable and meets my requirements”.

5.2 Do not use char(n), even for Fixed-length Identifiers

Sometimes people respond to “why not use char(N)” with “my value is exactly N characters” (such as a country code, hash, or identifier from another system). In fact, using char(n) even in these scenarios is not a good idea.

5.2.1 why

For values that are too short, char(n) fills them with Spaces. Therefore, a char(n) with a definite length has no real benefit over text.

5.2.2 When can IT be used

Never used.

5.3 Do not Use VARCHAR (N) by Default

Do not use the vARCHar (n) type by default. Consider using vARCHar (with no length restrictions) or text instead.

5.3.1 why

Varchar (n) is a text field with length, and will raise an error if you try to insert a string that is longer than n characters (instead of bytes).

Varchar (without (n)) or text is similar, with no length limitation. If you insert the same string between the three field types, they take up exactly the same space, and there is little difference in performance.

Varchar (n) is fine if you want a text field of limited length, but if you define the last name field as varchar(20), So when Hubert Blaine Wolfeschlegelsteinhausenbergerdorff registered to your service, will be an error.

Some databases do not have a long text type, or they are not as well supported as vARCHar (n). Users of these databases will usually use a representation like vARCHar (255), but what they really want is text.

If you need to constrain a value in a field, such as a maximum length constraint – or a minimum length, or a limited set of strings – checking constraints can do that.

5.3.2 When can IT be used

Varchar (n) is a great type if you want a text field, you need to throw an error to insert a string that is too long, and you don’t want to use explicit checking constraints. It just needs to be used with more consideration.

6. Other data types

6.1 Do not use money

The Money data type is actually not very good for storing monetary values. Numbers or whole numbers are probably better.

6.1.1 why

A lot of reasons

The money type stores monetary amounts with fixed decimal precision. Lc_monetary is used to set formatting numbers. But its rounding behavior may not be what you want.

The name	Storage capacity	describe	The scope of
money	8 bytes	Monetary amount	92233720368547758.08 to + 92233720368547758.07

If you change the LC_monetary setting, all the money columns will contain the wrong value. This means that if you insert ‘$10.00’ and lc_monetary is set to en_us.utf-8, you might retrieve a value of ‘10,00 Lei’ or ‘¥1,000’.

6.1.2 When will it be available

If you only work with a single currency, don’t deal with fractional cents and only do addition and subtraction, then the money type is probably correct.

6.2 Do not Use`serial`

For new applications, use Identity.

6.2.1 why

The Serial type has some strange behavior that makes structure, dependencies, and permission management more cumbersome.

6.2.2 When can IT be used

If you need to support PostgreSQL before version 10.0.
In some combinations of table inheritance
More generally, if you use the same sequence from multiple tables in some way, although in these cases explicit declarations may be superiorserialType.

Some suggestions for using PostgreSQL

1. Database coding

1.1 Do not use SQL_ASCII encoding

1.1.1 why

1.1.2 When will it be available

2. The tool

2.1 Do not use PSQL -w or PSQL –password

2.1.1 why

2.1.2 When can it be used

2.2 Do not use RULE

2.2.1 why

2.2.2 When can it be used

2.3 Do not use table inheritance

2.3.1 why

2.3.2 When can it be used

3. The SQL statement

3.1 Do NOT use NOT IN

3.1.1 why

3.1.2 When will it be available

3.2 Do not use uppercase names

3.2.1 why

3.2.2 When can it be used

3.3 Don’t use BETWEEN (especially for timestamps)

3.3.1 why

3.3.2 When can it be used

4. Date/time storage

4.1 Do not use timestamp (without time zone)

4.4.1 why

4.1.2 When will it be available

4.2 Do not use timestamp (without time zone) to store UTC time

2 why

4.2.2 When will it be available

4.3 Do not use Timetz

4.3.1 why

4.3.2 When can it be used

4.4 Do not use CURRENT_TIME

4.4.1 why

4.4.2 When will it be available

4.5 Do not use timestamp(0) or timestamptz(0)

4.5.1 why

4.5.2 When can IT be used

5. Text storage

5.1 Do not use char(n)

5.1.1 why

5.1.2 When can it be used

5.2 Do not use char(n), even for Fixed-length Identifiers

5.2.1 why

5.2.2 When can IT be used

5.3 Do not Use VARCHAR (N) by Default

5.3.1 why

5.3.2 When can IT be used

6. Other data types

6.1 Do not use money

6.1.1 why

6.1.2 When will it be available

6.2 Do not Useserial

6.2.1 why

6.2.2 When can IT be used

Related Posts

Is it hard for Java programmers to jump ship? Mastering the five core elements of the interview, job-hopping ali is not difficult

Use Netty to build a simple HTTP service

Concurrent programming of timed task & timed thread pool principle analysis

6.2 Do not Use`serial`