Exciting new features in Django 3.2

Django 3.2 is right around the corner, and it’s full of new features. Django releases aren’t usually that exciting (and that’s a good thing!). But this time a lot of functionality was added to the ORM, so I find this release particularly interesting.

This is a list of my favorite features in Django 3.2.

Image from Django welcome page

A lot of great people worked on this version, but none of them was me. I’ve included ticket links for each new feature to show my appreciation to the people behind it.

Table of contents

Covering indexes
Providing a time zone for TruncDate
Building a JSON object
A loud signal receiver
QuerySet alias
New admin decorators
Value expressions detect types
More features worth mentioning
Wish list

⚙ Set up your local environment with the latest version of Django

To set up an environment with the latest version of Django, first create a new directory and a virtual environment.

To install the latest version of Django, you can use pip or, if it hasn’t been released yet, install it directly from Git.

Start a new project and application.

Add the new application to INSTALLED_APPS and configure a PostgreSQL database.
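
As a rough sketch, the relevant part of settings.py might look like the fragment below. The app name "app", the database name and the credentials are placeholders I made up for illustration, so replace them with your own values.

```python
# settings.py (fragment). "app" and every value under DATABASES are
# hypothetical placeholders, not taken from a real project.
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "app",  # the new application
]

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "django32",
        "USER": "postgres",
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```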

To try out some new functionality, create a Customer model.

Finally, create the database, then generate and apply the migrations.

Very good! Now add some random customer data.
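
One way to produce such data is sketched below in plain Python. It assumes the Customer model has a name and a joined_at field. The helper random_customers is a name I made up, and the resulting rows would still have to be turned into Customer instances and saved, for example with bulk_create().

```python
import random
from datetime import datetime, timedelta, timezone


def random_customers(n, seed=0):
    """Return n dicts with a random name and a random joined_at during 2020."""
    rng = random.Random(seed)  # seeded so the data is reproducible
    start = datetime(2020, 1, 1, tzinfo=timezone.utc)
    rows = []
    for _ in range(n):
        name = "".join(rng.choices("abcdefghijklmnopqrstuvwxyz", k=8)).title()
        joined_at = start + timedelta(seconds=rng.randrange(365 * 24 * 60 * 60))
        rows.append({"name": name, "joined_at": joined_at})
    return rows


customers = random_customers(10_000)
```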

Congratulations! You now have 10K new customers, and you’re ready to go.

Covering indexes

Ticket #30913

A covering index lets you store additional columns in the index. The main benefit is that the database can use an index-only scan when the query only uses fields that are present in the index, meaning that the actual table is not accessed at all. This can make queries faster.

Django 3.2 adds support for PostgreSQL covering indexes.

The new Index.include and UniqueConstraint.include attributes allow creating covering indexes and covering unique constraints on PostgreSQL 11+.

For example, if you want to search for the names of customers that joined during a certain period of time, you can create an index on joined_at and include the field name in the index.

The include parameter makes it a covering index.

For queries that use only the joined_at and name fields, the database can satisfy the query using only the index.

The query above found the names of customers who joined before February 2021. Depending on the execution plan, the database can just use the index to satisfy the query without even accessing the table. This is called an “index only scan”.
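
If you don’t have PostgreSQL at hand, the same idea can be observed with Python’s built-in sqlite3 module. This is only an analogy: SQLite has no INCLUDE clause, so the sketch appends name to the index key instead, and the table and index names are made up. The point is that the plan reports a covering index, meaning the table itself is never read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, joined_at TEXT)")
conn.executemany(
    "INSERT INTO customer (name, joined_at) VALUES (?, ?)",
    [("Ada", "2021-01-15"), ("Bob", "2021-01-20"), ("Cy", "2021-02-10")],
)
# SQLite has no INCLUDE, so the extra column is simply appended to the index key.
conn.execute("CREATE INDEX customer_joined_at_name_ix ON customer (joined_at, name)")

query = "SELECT name FROM customer WHERE joined_at < '2021-02-01'"
# The last column of each EXPLAIN QUERY PLAN row holds the readable plan step.
plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
names = sorted(row[0] for row in conn.execute(query))
print(plan)
```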

Index-only scans can be a little confusing at first. As described in the official PostgreSQL documentation, it may take some time before PostgreSQL can actually use an index-only scan.

But there is an additional requirement for any table scan in PostgreSQL: it must verify that each retrieved row be “visible” to the query’s MVCC snapshot […]. Visibility information is not stored in index entries, only in heap entries; so at first glance it would seem that every row retrieval would require a heap access anyway.

Another way to check whether a table page is visible to the current transaction is to check the table’s visibility map, which is much smaller and much faster to access than the table itself. PostgreSQL might take some time to update the visibility map, so until then, you might see an execution plan like this.

To check whether your index can really be used for an index-only scan, you can speed up the process by manually issuing VACUUM ANALYZE on the table.

VACUUM also reclaims some unused space and makes it available for reuse.

Update 2020-03-04: I originally suggested using VACUUM FULL instead of a plain VACUUM. A commenter on Twitter pointed out that a plain VACUUM does the trick and is much less disruptive, so use that instead.

It’s also important to remember that covering indexes are not free. Additional fields in the index make the index larger.

Providing a time zone for TruncDate

Ticket #31948

I write a lot about mistakes in SQL, and time zones are usually at the top of the list. One of the most dangerous mistakes you can make when dealing with timestamps is truncating without explicitly specifying a time zone, which can lead to incorrect and inconsistent results.

In Django 3.2, it’s easier to avoid this kind of mistake.

The new tzinfo parameter of the TruncDate and TruncTime database functions allows truncating datetimes in a specific time zone.

In previous versions of Django, the time zone was set internally based on the current time zone.

Starting with Django 3.2, you can explicitly provide a time zone for the TruncDate function family.
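
To see why the time zone matters, here is a small sketch in plain Python, with no ORM involved. The timestamp and the UTC-5 offset are made-up values for illustration. The same instant falls on two different dates depending on the time zone used for truncation, which is exactly the choice the tzinfo argument makes explicit.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical example: a customer joined at 03:00 UTC on 1 February 2021.
joined_at = datetime(2021, 2, 1, 3, 0, tzinfo=timezone.utc)

utc = timezone.utc
utc_minus_5 = timezone(timedelta(hours=-5))  # fixed offset, assumed for illustration

# Truncating to a date only makes sense after picking a time zone.
date_in_utc = joined_at.astimezone(utc).date()
date_in_utc_minus_5 = joined_at.astimezone(utc_minus_5).date()

print(date_in_utc)          # 2021-02-01
print(date_in_utc_minus_5)  # 2021-01-31
```

A report on “customers who joined in January” would include this customer in one time zone and exclude it in the other.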

This is a step in the right direction.

Building a JSON object

Ticket #32179

Building JSON objects in PostgreSQL is handy, especially if you’re dealing with unstructured data.

Starting with Django 3.2, the PostgreSQL function json_build_object, which accepts arbitrary key-value pairs, is available in the ORM.

Added the JSONObject database function.

An interesting use case is to serialize objects directly in the database, bypassing the need to create ORM objects.
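
The idea can be tried without PostgreSQL or the ORM. The sketch below uses SQLite’s json_object function, which plays a similar role to json_build_object, and assumes your SQLite build ships the JSON functions (most recent builds do). The table and its rows are invented for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer (name) VALUES (?)", [("Ada",), ("Bob",)])

# The database assembles one JSON document per row; Python never builds
# intermediate objects, it only receives ready-made JSON strings.
rows = conn.execute(
    "SELECT json_object('id', id, 'name', name) FROM customer ORDER BY id"
).fetchall()

# Parsed here only to show what the strings contain.
serialized = [json.loads(row[0]) for row in rows]
```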

I’ve previously shown how much serialization performance matters, so this is worth considering.

A loud signal receiver

Ticket #32261

Not long ago, I tweeted about a mysterious bug that went unnoticed for a long time because it happened inside a receiver.

When you broadcast a signal using send_robust and a receiver fails, Django catches the error and moves on to the next receiver. After all the receivers have processed the signal, Django returns a list of the receivers’ return values and exceptions. To check whether any receiver failed, you need to go over the list and look for instances of Exception. Signals are often used to decouple modules, and passing exceptions back from the receivers in this way defeats that purpose.
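
The behavior described above can be modeled in a few lines of plain Python. This is a simplified model I wrote for illustration, not Django’s implementation: the real method also passes the signal and sender and manages receiver references. The receiver functions are hypothetical.

```python
def send_robust(receivers, **named):
    """Call every receiver; collect (receiver, return value or exception) pairs."""
    responses = []
    for receiver in receivers:
        try:
            response = receiver(**named)
        except Exception as err:
            # The error is kept in the result list instead of propagating.
            responses.append((receiver, err))
        else:
            responses.append((receiver, response))
    return responses


def ok_receiver(**kwargs):
    return "ok"


def failing_receiver(**kwargs):
    raise ValueError("boom")


responses = send_robust([failing_receiver, ok_receiver], customer_id=1)

# The caller has to remember to look for exceptions in the results.
failures = [(r, resp) for r, resp in responses if isinstance(resp, Exception)]
```

Nothing forces the caller to compute failures, which is how a bug inside a receiver can go unnoticed.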

To make sure I don’t miss any more exceptions in receivers, I created a “loud receiver” that logs exceptions.
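
A minimal sketch of such a wrapper, using only the standard library, could look like this. The decorator name loud_receiver is my own choice for the example, and the wrapped function is a stand-in for a real signal receiver.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def loud_receiver(receiver):
    """Log any exception raised by the receiver, then re-raise it."""
    @functools.wraps(receiver)
    def wrapper(*args, **kwargs):
        try:
            return receiver(*args, **kwargs)
        except Exception:
            # logger.exception logs at ERROR level and includes the traceback.
            logger.exception("Receiver %s failed", receiver.__name__)
            raise
    return wrapper


@loud_receiver
def update_customer_stats(**kwargs):  # hypothetical receiver
    raise ValueError("something went wrong")
```

Re-raising keeps the behavior of the sender unchanged, so the exception still ends up in the list of results, but now it also leaves a trace in the logs.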

Starting with Django 3.2, this is no longer necessary.

Signal.send_robust() now logs exceptions.

Very good!

QuerySet alias

Ticket #27719

The alias() method is a new feature in Django 3.2.

The new QuerySet.alias() method allows you to create reusable aliases for expressions that do not need to be selected, but can be used for filtering, ordering, or as part of complex expressions.

I often use Subquery and OuterRef to write complex queries, and there is a small problem when they are combined with annotate().

The query above is a convoluted way to find the first customer who joined. The queryset uses a Subquery to find, for each customer, the previous customer by joined_at, and then keeps only the customer that has no previous customer. It’s very inefficient, but I use it to illustrate my point.

To understand this, examine the queries generated by this QuerySet.

The annotated subquery appears in both the SELECT and WHERE clauses. This affects the execution plan.

The subquery is executed twice!

To solve this problem, in Django versions prior to 3.2, you could provide a values_list to exclude annotated subqueries from the SELECT clause.

As an aside: you might think that in this case you could use .defer(‘id_of_previous_customer’) instead of values_list to omit the annotated field. You can’t. Django will raise KeyError: ‘id_of_previous_customer’.

Starting with Django 3.2, you can replace annotate() with alias(), and the field will not be added to the SELECT clause.

The SQL generated now uses only one subquery.
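
The difference between the two query shapes can be reproduced with hand-written SQL in Python’s sqlite3 module. The statements below only approximate what the ORM generates, and the table and data are invented. The first query selects the subquery and filters on it, like annotate(). The second only filters on it, like alias().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, joined_at TEXT)")
conn.executemany(
    "INSERT INTO customer (name, joined_at) VALUES (?, ?)",
    [("Bob", "2021-01-20"), ("Ada", "2021-01-15"), ("Cy", "2021-02-10")],
)

# Correlated subquery: id of the customer who joined just before customer c.
PREVIOUS = (
    "(SELECT p.id FROM customer p WHERE p.joined_at < c.joined_at "
    "ORDER BY p.joined_at DESC LIMIT 1)"
)

# annotate()-like shape: the subquery is both selected and filtered on.
annotated = conn.execute(
    f"SELECT c.id, c.name, c.joined_at, {PREVIOUS} AS id_of_previous_customer "
    f"FROM customer c WHERE {PREVIOUS} IS NULL"
)
# alias()-like shape: the subquery appears only in the WHERE clause.
aliased = conn.execute(
    f"SELECT c.id, c.name, c.joined_at FROM customer c WHERE {PREVIOUS} IS NULL"
)

annotated_columns = [d[0] for d in annotated.description]
aliased_columns = [d[0] for d in aliased.description]
annotated_rows = annotated.fetchall()
aliased_rows = aliased.fetchall()
```

Both queries find the same customer, but only the first one carries the extra column and the second copy of the subquery.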

The execution plan is simpler.

One less way to trip yourself up!

New admin decorator

Ticket #16117

Prior to Django 3.2, to customize a calculated field in Django Admin, you first added a function and then assigned some attributes to it.

This is one of those weird APIs that are mostly only possible in dynamic languages like Python.

If you’re using Mypy (and you should), this code will trigger an annoying warning, and the only way to silence it is to add a type: ignore comment.

If you use Django Admin and Mypy as much as I do, this can be quite annoying.

The new display decorator solves this problem.

The new display() decorator makes it easier to add options to custom display functions that can be used with list_display or readonly_fields. Likewise, the new action() decorator makes it easier to add options to action functions that can be used with actions.
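
To show the mechanics, here is a simplified stand-in for such a decorator in plain Python. It is not Django’s code, and CustomerAdmin is an ordinary class rather than a ModelAdmin subclass. The idea is that the options are passed as arguments, and the decorator is the only place that sets attributes on the function.

```python
def display(*, description=None, ordering=None, boolean=None):
    """Simplified stand-in for a display-style decorator (not Django's own code)."""
    def decorator(func):
        # The attribute assignments live in one place instead of being
        # repeated after every display function.
        if description is not None:
            func.short_description = description
        if ordering is not None:
            func.admin_order_field = ordering
        if boolean is not None:
            func.boolean = boolean
        return func
    return decorator


class CustomerAdmin:  # plain class standing in for a ModelAdmin subclass
    @display(description="Joined this week?", boolean=True, ordering="joined_at")
    def is_new(self, obj):
        return obj["days_since_joined"] <= 7
```

In a real project you would import the decorator from django.contrib.admin instead of defining your own.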

Adjust the code to use the new display decorator.

No type errors!

Another useful decorator is action, which takes a similar approach to customizing admin actions.

Value expressions detect types

Ticket #30446

This is a small feature that solves a small problem in the ORM.

The Value() expression now automatically resolves its output_field to the appropriate Field subclass based on the type of the value provided, for bool, bytes, float, int, str, datetime.date, datetime.datetime, datetime.time, datetime.timedelta, decimal.Decimal, and uuid.UUID instances. As a consequence, resolving an output_field for database functions and combined expressions may now crash with mixed types when using Value(). In such cases, you need to explicitly set the output_field.

In previous Django versions, if you wanted to use constant values in a query, you had to explicitly set an output_field or it would fail.

In Django 3.2, the ORM solves this problem on its own.
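
A rough model of this kind of detection is sketched below in plain Python. It maps a constant to the name of a field class. This is my own illustration rather than the ORM’s code, and guess_output_field is a made-up helper. One detail worth noticing is the order of the checks: bool is a subclass of int, and datetime.datetime is a subclass of datetime.date.

```python
import datetime
import uuid
from decimal import Decimal

# Order matters: bool must come before int, and datetime before date,
# because isinstance() also matches subclasses.
FIELD_BY_TYPE = [
    (str, "CharField"),
    (bool, "BooleanField"),
    (int, "IntegerField"),
    (float, "FloatField"),
    (datetime.datetime, "DateTimeField"),
    (datetime.date, "DateField"),
    (datetime.time, "TimeField"),
    (datetime.timedelta, "DurationField"),
    (Decimal, "DecimalField"),
    (bytes, "BinaryField"),
    (uuid.UUID, "UUIDField"),
]


def guess_output_field(value):
    """Return the name of the field class a constant maps to, or None."""
    for python_type, field_name in FIELD_BY_TYPE:
        if isinstance(value, python_type):
            return field_name
    return None  # unknown type: the caller must set output_field explicitly
```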

Very cool!

More features worth mentioning

There are many more features in Django 3.2, and the release notes document them better than I can. To name just a few:

Navigable links in the admin (Ticket #31181). If the target model is registered in the admin, read-only related fields are now rendered as navigable links. I still use my own decorator to add links in Django Admin, but I’ll probably use it less now.
The durable argument of atomic() (Ticket #32220). When you execute code in a database transaction and the transaction completes without errors, you expect it to be committed to the database. However, if the caller executes your code inside their own database transaction, your changes will be rolled back if the parent transaction is rolled back. To prevent this from happening, you can now mark your transaction as durable. A RuntimeError is raised when someone tries to open a durable transaction within another transaction.
Cached templates are reloaded on Django’s development server (Ticket #25791). If you use Django’s runserver command for local development, you’re probably used to it reloading when a Python file changes. However, if you use the django.template.loaders.cached.Loader loader, the development server will not reload when an HTML file changes, and you must restart it to see the changes. This is annoying, and so far I’ve had to disable the cached loader in dev. Starting with Django 3.2 this isn’t necessary, because cached templates are reloaded correctly during development.
Support for function-based indexes (Ticket #26167). Function-based indexes are useful when you frequently query an expression and you want to index it. A typical example is indexing lower-cased text.
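
The last item, indexing lower-cased text, is easy to try with Python’s sqlite3 module, since SQLite also supports indexes on expressions. This is a database-level sketch rather than Django code, and the table and index names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer (name) VALUES (?)", [("Ada",), ("ADA",), ("Bob",)]
)
# The index stores lower(name), not name itself.
conn.execute("CREATE INDEX customer_lower_name_ix ON customer (lower(name))")

# The query must use the same expression for the index to be considered.
query = "SELECT id FROM customer WHERE lower(name) = ?"
plan = " ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("ada",))
)
ids = sorted(row[0] for row in conn.execute(query, ("ada",)))
print(plan)
```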

Wish list

The Django ORM is pretty comprehensive and feature-rich, but there are still a few things on my wish list for future versions.

Custom joins. Django can currently join only tables that are related by a ForeignKey. In some cases, you want to join tables that are not necessarily related by a foreign key, or to use more complex conditions. A common example is a slowly changing dimension, where the join condition requires a BETWEEN operator.
Update returning. When updating many rows, it is sometimes useful to get them back right away. This is a well-known (and very useful) feature in SQL. Django doesn’t currently support it, but I hear it will soon.
Database views. There are many hacks for making database views work with the ORM. They typically involve creating a view directly in the database or in a manual migration, and then setting managed=False on the model. These tricks get the job done, but not in a very elegant way. I want a way to define database views so that migrations can detect changes, ideally with the option to use a Django QuerySet to create the view.
Database partitions. Database partitioning is very useful in data modeling. When used properly, partitions can make queries faster and maintenance easier. Some database engines, such as Oracle, already provide a very mature implementation of partitioning, and others, such as PostgreSQL, are gradually getting there. Currently, there is no native support for database partitioning in Django, and most implementations I’ve seen manage the tables manually. As a result, I often avoid partitions, which is unfortunate.
Require authentication by default. Django currently allows access to any view unless explicitly marked otherwise, usually with the login_required decorator. This makes Django easier to get started with, but it can lead to security issues if you’re not careful. I know of several solutions, usually involving custom middleware and decorators. I really wish Django had an option to reverse this condition, so that access is restricted by default unless marked otherwise.
Typing. If you follow this blog, you know I’m a big fan of type hints in Python. Currently, Django doesn’t provide type hints or official stubs. Shiny new frameworks like Starlette and FastAPI tout themselves as 100 percent type-annotated, but Django still lags behind. A project called django-stubs has made some progress in this area.
Database connection pooling. Django currently supports two modes for managing database connections: creating a new connection per request, or keeping one connection per thread (persistent connections). In a common deployment, creating a database connection is a relatively heavy operation. It requires setting up a TCP connection, usually a TLS connection as well, and initializing the connection, which adds a lot of latency. In PostgreSQL especially, it also consumes a lot of database server resources, so creating a new connection for every request is a bad idea.

Persistent connections are much better. They work well with the usual Django deployment, which is a small number of worker processes and/or threads. But such deployments tend to break down in the real world. Every time your database or one of your upstream services starts taking longer to process requests for some reason, workers get tied up, requests back up and the whole system suffocates. This can still happen even with strict timeouts.

To improve on this catastrophic failure mode, a common solution is to use asynchronous workers, such as gevent greenlets or, in the future, asyncio tasks. But then each request has its own lightweight thread and therefore its own connection, rendering Django’s persistent connections useless.

It would be nice if Django included a high-quality connection pool that could maintain a certain number of connections and assign them to requests as needed. External solutions like PgBouncer exist, but they add operational overhead. A built-in solution would usually be sufficient.
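
To make the idea concrete, here is a bare-bones sketch of such a pool using only the standard library, with SQLite connections standing in for real database connections. The ConnectionPool class is my own illustration and nothing like it exists in Django 3.2. A production-grade pool would also need health checks, timeouts and connection recycling.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are created once and handed out on demand."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is in use; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


pool = ConnectionPool(
    size=2,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

with pool.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```

The number of open connections stays at two no matter how many lightweight threads ask for one, and callers simply wait their turn.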
