Solving concurrency problems
The problem arises when we allow more than one person to rename files or directories at the same time. Imagine that you are renaming the directory /clinton, which contains hundreds of files. At the same time, another user renames the single file /clinton/projects/elasticsearch/README.txt. That user's change, although it started after yours, may finish more quickly.
There are two possible scenarios:
- You use version numbers, in which case your bulk rename fails with a version conflict when it collides with the other user's change to README.txt.
- You don't use versioning, and your change overwrites the other user's change.
The problem is that Elasticsearch does not support ACID transactions. Changes to a single document are ACIDic, but changes involving multiple documents are not.
If your primary data store is a relational database and Elasticsearch is simply being used as a search engine or as a way to improve performance, you can make your changes in the database first and replicate them to Elasticsearch once they have succeeded. This way you benefit from the database's ACID transactions, and changes appear in Elasticsearch in the correct order. Concurrency is handled in the relational database.
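For illustration, here is a minimal Python sketch of this database-first pattern, assuming a SQLite table and the requests HTTP client; the index name and document layout are invented for the example:

import sqlite3
import requests

def rename_file(conn, file_id, new_path):
    # 1. Apply the change inside an ACID transaction in the primary store.
    with conn:
        conn.execute("UPDATE files SET path = ? WHERE id = ?",
                     (new_path, file_id))
    # 2. Only after the transaction commits, replicate the change to
    #    Elasticsearch, which acts purely as a search index here.
    requests.put("http://localhost:9200/fs/file/%s" % file_id,
                 json={"path": new_path})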
If you don't use relational storage, these concurrency issues need to be handled at the Elasticsearch level. Here are three practical solutions using Elasticsearch, all of which involve some form of locking:
- Global lock
- Document lock
- Tree lock
Global lock
By allowing only one process to make changes at any time, we can avoid concurrency problems entirely. Most changes involve only a few files and complete quickly. A rename of a top-level directory may block all other changes for longer, but such changes are likely to be rare.
Because document-level changes in Elasticsearch are ACIDic, we can use the existence or absence of a document as a global lock. To acquire the lock, we try to create the global-lock document:
PUT /fs/lock/global/_create
{}
If the create request fails with a conflict exception, another process has already been granted the global lock and we will have to try again later. If the request succeeds, we are the proud owners of the global lock and can proceed with our changes. Once done, we must release the lock by deleting the global-lock document:
DELETE /fs/lock/global
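To make the acquire/retry/release cycle concrete, here is a minimal Python sketch using the requests HTTP client against the endpoints above; the retry interval and function names are illustrative assumptions:

import time
import requests

ES = "http://localhost:9200"

def with_global_lock(do_changes, retry_seconds=1):
    # Keep trying to create the lock document; a 409 Conflict means that
    # another process currently holds the global lock.
    while True:
        r = requests.put("%s/fs/lock/global/_create" % ES, json={})
        if r.status_code in (200, 201):
            break
        time.sleep(retry_seconds)
    try:
        do_changes()
    finally:
        # Release the lock even if our changes raised an exception.
        requests.delete("%s/fs/lock/global" % ES)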
A global lock can impose significant performance constraints on a system, depending on the frequency and time consumption of changes. We can increase parallelism by making our locks more fine-grained.
Document lock
Instead of locking the whole filesystem, we could lock individual documents using the same technique described earlier. We can use a scrolled search to retrieve all of the documents that would be affected by the change and create a lock document for each of them:
PUT /fs/lock/_bulk
{ "create": { "_id": 1}}
{ "process_id": 123 }
{ "create": { "_id": 2}}
{ "process_id": 123 }
The ID of the lock document is the same as the ID of the file that should be locked.
process_id is the unique ID of the process that will perform the changes.
If some files are already locked, parts of the bulk request will fail and we will have to try again.
Of course, if we try to lock all of the files again, the create statements we used before will fail because all of the files are now locked by us! Instead of a simple create statement, we need an update request with an upsert parameter and this script:
if ( ctx._source.process_id != process_id ) { assert false }; ctx.op = 'noop';
process_id is a parameter that we pass into the script.
assert false will throw an exception, causing the update to fail.
Changing the op from update to noop prevents the update request from making any changes, but still returns success.
The complete update request looks like this:
POST /fs/lock/1/_update
{
  "upsert": { "process_id": 123 },
  "script": "if ( ctx._source.process_id != process_id ) { assert false }; ctx.op = 'noop';",
  "params": {
    "process_id": 123
  }
}
If the document does not exist, the upsert document is inserted, exactly like the previous create request. However, if the document does exist, the script looks at the process_id stored in the document. If it matches ours, the update does not execute (noop) but the request returns success. If the two do not match, assert false throws an exception and we know that the attempt to acquire the lock has failed.
Once all locks have been successfully created, you can proceed with your changes.
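A rough Python sketch of this acquisition loop, assuming the document IDs have already been collected by the scrolled search; the helper name and the use of the requests client are my own choices, not from the text:

import requests

ES = "http://localhost:9200"
LOCK_SCRIPT = ("if ( ctx._source.process_id != process_id ) "
               "{ assert false }; ctx.op = 'noop';")

def acquire_document_locks(doc_ids, process_id):
    # Returns the IDs we managed to lock and whether we got them all.
    acquired = []
    for doc_id in doc_ids:
        r = requests.post(
            "%s/fs/lock/%s/_update" % (ES, doc_id),
            json={
                "upsert": {"process_id": process_id},
                "script": LOCK_SCRIPT,
                "params": {"process_id": process_id},
            })
        if r.status_code >= 400:
            # The assert fired: another process holds this lock.
            return acquired, False
        acquired.append(doc_id)
    return acquired, True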
After that, you must release all of the locks, which you can do by retrieving all of the lock documents and deleting them in bulk:
POST /fs/_refresh

GET /fs/lock/_search?scroll=1m
{
    "sort" : ["_doc"],
    "query": {
        "match" : {
            "process_id" : 123
        }
    }
}

PUT /fs/lock/_bulk
{ "delete": { "_id": 1}}
{ "delete": { "_id": 2}}
The refresh call ensures that all lock documents are visible to the search request.
You can use a scroll query when you need to retrieve large numbers of results with a single search request.
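The release step might look like the following Python sketch; note that the exact scroll-continuation endpoint varies between Elasticsearch versions, so treat this as an outline rather than a recipe:

import json
import requests

ES = "http://localhost:9200"

def release_document_locks(process_id):
    # Make every lock document visible to the search request.
    requests.post("%s/fs/_refresh" % ES)

    r = requests.get(
        "%s/fs/lock/_search?scroll=1m" % ES,
        json={"sort": ["_doc"],
              "query": {"match": {"process_id": process_id}}})
    while True:
        body = r.json()
        hits = body["hits"]["hits"]
        if not hits:
            break
        # One delete action per lock document, in newline-delimited bulk format.
        bulk = "".join(json.dumps({"delete": {"_id": h["_id"]}}) + "\n"
                       for h in hits)
        requests.put("%s/fs/lock/_bulk" % ES, data=bulk,
                     headers={"Content-Type": "application/x-ndjson"})
        # Fetch the next page (this endpoint differs across versions).
        r = requests.get("%s/_search/scroll" % ES,
                         json={"scroll": "1m",
                               "scroll_id": body["_scroll_id"]})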
Document-level locking enables fine-grained access control, but creating lock documents for millions of files can be expensive. In some cases you can achieve fine-grained locking with far less work, as the following directory-tree scenario shows.
Tree lock
In the previous example, we could have locked just part of the directory tree instead of locking every involved document. We need exclusive access to the file or directory that we want to rename, which we can get by creating an exclusive-lock document:
{ "lock_type": "exclusive" }
We also need shared locks on all of the parent directories, via shared-lock documents:
{
"lock_type": "shared",
"lock_count": 1
}
lock_count records the number of processes that hold a shared lock.
A process that wants to rename /clinton/projects/elasticsearch/README.txt needs an exclusive lock on that file, and shared locks on /clinton, /clinton/projects, and /clinton/projects/elasticsearch.
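To make the path decomposition concrete, here is a small Python helper (my own illustration, not from the text) that derives the exclusive-lock target and the parent directories that need shared locks; it assumes absolute paths like the examples above:

import posixpath

def lock_paths(path):
    # "/clinton/projects/elasticsearch/README.txt" needs an exclusive lock,
    # while "/clinton", "/clinton/projects" and
    # "/clinton/projects/elasticsearch" need shared locks.
    shared = []
    parent = posixpath.dirname(path)
    while parent != "/":
        shared.append(parent)
        parent = posixpath.dirname(parent)
    return path, shared[::-1]  # shared-lock paths, shortest first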
A simple create request will suffice for the exclusive lock, but the shared locks need a scripted update to implement some extra logic:
if (ctx._source.lock_type == 'exclusive') {
assert false;
}
ctx._source.lock_count++
If lock_type is exclusive, the assert statement will throw an exception, causing the update request to fail.
Otherwise, we increment lock_count.
This script handles the case where the lock document already exists, but we also need an upsert document to handle the case where it doesn't exist yet. The full update request looks like this:
POST /fs/lock/%2Fclinton/_update
{
  "upsert": {
    "lock_type": "shared",
    "lock_count": 1
  },
  "script": "if (ctx._source.lock_type == 'exclusive') { assert false }; ctx._source.lock_count++"
}
The ID of the document is /clinton, which becomes %2Fclinton after URL encoding.
The upsert document will be inserted if the document does not already exist.
Once we have successfully gained shared locks on all of the parent directories, we try to create an exclusive lock on the file itself:
PUT /fs/lock/%2Fclinton%2fprojects%2felasticsearch%2fREADME.txt/_create
{ "lock_type": "exclusive" }
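Putting the two requests together, a Python sketch of the whole acquisition sequence might look like this; on failure, a real implementation would also need to release the shared locks already acquired, which is omitted here:

import urllib.parse
import requests

ES = "http://localhost:9200"
SHARED_SCRIPT = ("if (ctx._source.lock_type == 'exclusive') "
                 "{ assert false }; ctx._source.lock_count++")

def acquire_tree_locks(file_path, parent_paths):
    enc = lambda p: urllib.parse.quote(p, safe="")
    # Shared locks on every parent directory, shortest path first.
    for p in parent_paths:
        r = requests.post("%s/fs/lock/%s/_update" % (ES, enc(p)),
                          json={"upsert": {"lock_type": "shared",
                                           "lock_count": 1},
                                "script": SHARED_SCRIPT})
        if r.status_code >= 400:
            return False  # blocked by an exclusive lock; back off and retry
    # Finally, the exclusive lock on the file itself.
    r = requests.put("%s/fs/lock/%s/_create" % (ES, enc(file_path)),
                     json={"lock_type": "exclusive"})
    return r.status_code in (200, 201)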
Now, if somebody else wants to rename the /clinton directory, they would have to gain an exclusive lock on that path:
PUT /fs/lock/%2Fclinton/_create
{ "lock_type": "exclusive" }
This request will fail because a lock document with the same ID already exists. The other user will have to wait until our operation is complete and we have released our locks. The exclusive lock is simply deleted:
DELETE /fs/lock/%2Fclinton%2fprojects%2felasticsearch%2fREADME.txt
Shared locks require another script to decrement lock_count, and if the count drops to zero, delete the lock document:
if (--ctx._source.lock_count == 0) {
ctx.op = 'delete'
}
Once lock_count reaches zero, ctx.op is changed from update to delete.
This update request is executed for each level of parent directory, from the longest path to the shortest:
POST /fs/lock/%2Fclinton%2fprojects%2felasticsearch/_update
{
"script": "if (--ctx._source.lock_count == 0) { ctx.op = 'delete' } "
}
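The release sequence, sketched in the same illustrative style: delete the exclusive lock first, then decrement the shared locks bottom up, longest path first:

import urllib.parse
import requests

ES = "http://localhost:9200"
RELEASE_SCRIPT = "if (--ctx._source.lock_count == 0) { ctx.op = 'delete' }"

def release_tree_locks(file_path, parent_paths):
    enc = lambda p: urllib.parse.quote(p, safe="")
    requests.delete("%s/fs/lock/%s" % (ES, enc(file_path)))
    for p in reversed(parent_paths):  # longest path first, bottom up
        requests.post("%s/fs/lock/%s/_update" % (ES, enc(p)),
                      json={"script": RELEASE_SCRIPT})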
Tree locks provide fine-grained concurrency control at minimal cost. Of course, it doesn’t work in all cases — the data model must have a sequential access path similar to a directory tree.
None of these schemes – global, document, or tree locking – addresses the toughest problem of locks: what if the process holding the lock dies?
The unexpected death of a process leaves us with two questions:
- How do we know that we can release the locks held by the dead process?
- How do we clean up unfinished changes from dead processes?
These topics are outside the scope of this book, but if you decide to use locks, you’ll need to give them some thought.
While denormalization is a good choice for many projects, the need for locking schemes can introduce complex implementation logic. As an alternative, Elasticsearch provides two models that help us deal with related entities: nested objects and parent-child relationships.