The problem

In business application development, repeated user submissions are a common problem.

For example, take a registration form. If the user accidentally clicks the submit button several times in a row, multiple registration records may be created in the database. Or, after a normal submission, the front end may not receive the result in time because of network or server problems; the user, believing the submission failed, submits again, perhaps several times, and the database again ends up with multiple registration records for that user.

In this example the impact on the business is small, but things get more troublesome when the scenario involves adding or subtracting resources, such as operations on accounts or inventory.

You might say the solution is to lock the button on the front end after submitting. That does solve part of the problem. Why only part? Because a caller may bypass the front-end interface and access the back-end service directly. Well, you might say, just add a check on the server.

OK, let's see how such a check works:

In the example above, assume the user is identified by mobile phone number, so the server can determine whether a submission is a duplicate with a SQL query:

select count(*) from table where mobile = 'xxx'

If the count is greater than 0, we consider the number already registered, interrupt execution, and return an error.

Let’s look at two more examples:

  • When an employee submits an expense report and accidentally submits it twice, the server may be able to detect the duplicate by "user Id + submission time".

  • When issuing credits to a user, the first request times out and the user retries. Here the server may detect the duplicate by "user Id + transaction Id".
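The "user Id + transaction Id" check in the second example amounts to making the credit operation idempotent: a retried request carries the same transaction Id and is recognized and ignored. A minimal Python sketch (the article's stack is C#/ASP.NET Core; `issue_credits` and the in-memory `processed` set are illustrative assumptions, not the article's actual code):

```python
processed = set()  # transaction ids already credited; use a durable store in production

def issue_credits(user_id, txn_id, amount, balances):
    # "user Id + transaction Id" identifies the request, so a request
    # retried after a timeout is recognized and ignored
    key = (user_id, txn_id)
    if key in processed:
        return "duplicate"
    processed.add(key)
    balances[user_id] = balances.get(user_id, 0) + amount
    return "ok"

balances = {}
print(issue_credits("u1", "txn-1001", 50, balances))  # ok
print(issue_credits("u1", "txn-1001", 50, balances))  # duplicate (client retry)
print(balances["u1"])                                 # 50: credited exactly once
```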

Many other duplicate-submission scenarios can be resolved by adding similar checks on the server. But writing these broadly similar pieces of judgment logic every time violates a principle software developers often repeat: Don't Repeat Yourself.

And adding checks isn't a perfect solution either. Why? Back-end services are typically multi-threaded, and may even be distributed, so it isn't enough to write the check itself; you also have to deal with data consistency, which raises the technical bar.

Is there an easy universal solution?

The plan

Summarizing the problem

Here are the scenarios in which a server may receive duplicate requests, summarized as follows:

  • Front-end controls are lax: the submit button is not disabled when the user clicks it, so multiple clicks send multiple requests to the server.
  • The caller accesses the service directly and issues multiple requests for the same business operation, due to a programming error (such as an infinite loop) or an attack (such as a replay attack).
  • Program retries: front-end or back-end code may automatically retry on certain exceptions or timeouts, producing multiple requests to the service.
  • In a multi-threaded or distributed environment, a duplicate check exists but fails because of data-consistency problems, so the business is processed repeatedly.

The first two scenarios are easy to understand and need no further explanation; let's focus on how the latter two happen.

In the retry case, the client does not receive a response to its first request, then initiates a second request after the timeout. The server ends up performing the same business processing twice.

In a multi-threaded environment, thread 1 queries the database and concludes the data has not been committed. Before thread 1 writes to the database, thread 2 also queries, likewise concludes the data has not been committed, and so both thread 1 and thread 2 write to the database.
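This check-then-write race can be replayed step by step. A minimal Python sketch (the article's stack is C#/ASP.NET Core; the in-memory `registrations` list and lock stand in for the database and its synchronization):

```python
import threading

registrations = []  # stands in for the database table
lock = threading.Lock()

def exists(mobile):
    return mobile in registrations

def register_atomic(mobile):
    # the check and the write happen under one lock, so only one caller wins
    with lock:
        if not exists(mobile):
            registrations.append(mobile)
            return True
        return False

# 1) The broken interleaving from the text, replayed step by step:
#    both threads pass the check before either one writes.
mobile = "13800000000"
t1_sees_absent = not exists(mobile)   # thread 1 queries: not registered
t2_sees_absent = not exists(mobile)   # thread 2 queries: not registered
if t1_sees_absent:
    registrations.append(mobile)      # thread 1 writes
if t2_sees_absent:
    registrations.append(mobile)      # thread 2 writes
print(registrations.count(mobile))    # 2 -> duplicate rows

# 2) With the atomic version, 8 concurrent threads produce exactly one row.
registrations.clear()
threads = [threading.Thread(target=register_atomic, args=(mobile,)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(registrations.count(mobile))    # 1
```

In a distributed system a single process lock is not enough; the same atomicity has to come from something like a unique database index or an atomic store operation.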

The rate-limiting approach

Now to the point: how can rate limiting be applied to solve the duplicate-submission problem?

Duplicate submissions fit the basic elements of rate limiting

A rate limit can be described by the following basic elements (my own summary):

  • A rate limit has a specific target, such as an IP address or a user.

  • A rate limit has a time window, such as 1 second or 1 minute.

  • A rate limit has a threshold within that window, such as 10 times per second or 100 times per minute.

Returning to duplicate submissions, the analysis shows:

  • A duplicate submission can be identified by some piece of data, and that data can serve as the rate-limiting target.

  • A duplicate submission naturally has a time dimension, which corresponds to the rate-limiting window.

  • A duplicate submission means submitting again after a submission, which can be controlled by the rate-limiting threshold, fixed at 1.
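These three elements can be captured in a small data structure: a target extractor, a time window, and a threshold of 1. A minimal in-process Python sketch of a fixed-window rule (illustrative only; the library used later in the article implements the same idea in C#):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Rule:
    extract_target: Callable[[dict], str]  # element 1: picks the limiting target from a request
    window_seconds: int                    # element 2: the time window
    threshold: int = 1                     # element 3: for duplicate submission, fixed at 1
    counts: dict = field(default_factory=dict)

    def allow(self, request, now):
        # fixed-window counter: count per (target, window index)
        key = (self.extract_target(request), int(now // self.window_seconds))
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.threshold

rule = Rule(extract_target=lambda r: r["mobile"], window_seconds=5)
req = {"mobile": "13800000000"}
print(rule.allow(req, now=100.0))  # True: first submission in the window
print(rule.allow(req, now=102.0))  # False: repeat within the same window
print(rule.allow(req, now=106.0))  # True: a new window has begun
```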

Let’s look at two examples:

Duplicate registration submissions

  • Rate-limiting target: the mobile phone number
  • Rate-limiting window: from the first submission until registration closes

The mobile phone number can be extracted from the registration data. On the user's first submission, a rate-limit count record keyed by the phone number is created. If the user submits again before registration closes, the count exceeds 1, which triggers the rate-limiting logic and returns an error to the caller. After registration closes, the service may be taken offline, the front end may have removed the entry point, and the back end also checks the registration deadline; even if a submission still gets through, it is meaningless and has no impact on the business.

Duplicate expense-report submissions

  • Rate-limiting target: employee Id + submission minute
  • Rate-limiting window: 1 minute from the user's submission

The employee Id can be extracted from the session, and the submission minute can be expressed as yyyyMMddHHmm. When the user first submits the expense report, a rate-limit count record keyed by "employee Id + submission minute" is created. If the user submits again within the same minute, the count exceeds 1, which triggers the rate-limiting logic and returns an error to the caller. When the user submits again a minute later, a new count record is created, the limit is not triggered, and the data is submitted normally.
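A minimal Python sketch of this minute-window key (illustrative only; `try_submit` and the in-process `counters` dict are assumptions, not the article's actual implementation; multiple servers would need a shared counter store):

```python
from datetime import datetime

counters = {}  # in-process counter; a shared store would be needed across servers

def limit_key(employee_id, now):
    # target = employee Id + submission minute, formatted yyyyMMddHHmm
    return f"{employee_id}:{now.strftime('%Y%m%d%H%M')}"

def try_submit(employee_id, now):
    key = limit_key(employee_id, now)
    counters[key] = counters.get(key, 0) + 1
    return counters[key] <= 1  # threshold fixed at 1

t0 = datetime(2021, 5, 1, 10, 30, 12)
print(try_submit("emp-7", t0))                               # True: first submission
print(try_submit("emp-7", t0.replace(second=40)))            # False: same minute, rejected
print(try_submit("emp-7", datetime(2021, 5, 1, 10, 31, 5)))  # True: new minute, new key
```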

From the analysis above it is not hard to see that duplicate submissions fit the basic elements of rate limiting. But can rate limiting solve all duplicate-submission problems?

Using rate limiting to restrict duplicate submissions

However, as you may have noticed, there is an implicit assumption here: every first submission is processed correctly, so a limit count of 1 means "already submitted once". In practice this is hard to guarantee, because the rate-limit count and the business processing are usually not in one transaction, and the count is generally incremented first. The count may therefore succeed while the business processing fails, whether from infrastructure problems such as timeouts, network failures, or crashes, or simply because some business condition is not met. So is rate limiting dead on arrival?

When faced with a hard problem, I often ask whether it has been encountered before.

Back when web forums were popular, after posting or replying you would first land on a countdown page for a few seconds, and only when the countdown ended would you be redirected to the normal page.

That was itself a form of rate limiting used to prevent duplicate submissions. The insight it offers: for a very short time, and at very low cost, a system can use rate limiting to restrict repeat operations. A normal user barely notices, or feels only a slight delay, yet duplicate-data problems are largely avoided and some malicious behavior is blocked as well.

With this in mind, let's revisit the duplicate-request scenarios listed above:

  • Lax front-end controls lead to multiple clicks and multiple requests to the server.

    The server can apply a short rate limit to each user's submissions, say 5 seconds; a normal user takes longer than 5 seconds to fill out a form. If the user does submit again within 5 seconds, the front end can show a prompt based on the error code returned by the server and jump to a page that queries the submission result, so users can see the outcome for themselves. If the first submission did not succeed, the user can fill out and submit the form again; by then more than 5 seconds will have passed since the first submission, so the limit will not block them. Since most submissions succeed, this is a low-probability path, and it gives the user a chance to recover at the cost of a minor annoyance.

    There is also the possibility that the server processes too slowly and the result page finds nothing yet. To mitigate this, the server can use as short a processing timeout as possible, and the front end can poll the result a few times. This case is generally rare and can be further reduced by technical means.

  • A caller accesses the interface directly and issues multiple requests for the same business operation because of program errors or attacks.

    The server can apply a short rate limit to each visitor's submissions, such as once every 5 seconds. If the limit is triggered, a penalty can be imposed, for example blocking submissions for 30 seconds, and the penalty can grow exponentially. This minimizes the impact of an external program's abnormal behavior on the service, while a caller that returns to normal behavior recovers automatically.
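    The exponential penalty can be sketched in a few lines of Python (the 30-second base and 1-hour cap are illustrative assumptions, not values from the article):

```python
def penalty_seconds(violations, base=30, cap=3600):
    """Lock-out duration after the Nth limit violation: 30s, 60s, 120s, ...
    capped so a caller that recovers is not locked out forever."""
    return min(base * 2 ** (violations - 1), cap)

print([penalty_seconds(n) for n in range(1, 6)])  # [30, 60, 120, 240, 480]
```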

    Some interfaces carry timestamps, captchas, session Ids, and the like; these can also be added to the rate-limiting target to identify duplicate submissions more precisely.

  • Program retries cause duplicate submissions: on certain exceptions or timeouts the system retries automatically, producing multiple requests for one operation.

    This may be a design problem. Intermediate services should avoid retrying operations that initiate commits, because many business processes are not idempotent, and retries inside an intermediate service are easy to overlook since they are invisible to the caller, which leads to data problems. If retries are required, they should be performed only at the originating point of the request, where the initiator is aware of the risks of retrying and can minimize the impact.

    Rate limiting can be introduced at the uppermost service, choosing an appropriate target, window, and threshold. Internal services are generally considered reliable enough that they do not need rate limiting.

  • In a multi-threaded or distributed environment, a duplicate check exists but fails because of data-consistency problems, so the business is processed repeatedly.

    By choosing an appropriate rate-limiting target and a distributed, consistent rate-limiting implementation, such as one based on Redis, the submission can be allowed only once within a given time range. This keeps the duplicate check effective and avoids repeated business processing.
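    A Redis-backed fixed window typically relies on two atomic commands: INCR the counter for the target's key, and set a TTL on the first increment. A Python sketch (illustrative; `MemoryStore` is an in-memory stand-in so the example runs without a Redis server — with a real client you would call its `incr` and `expire` instead):

```python
import time

class MemoryStore:
    """In-memory stand-in for the two Redis commands used here (INCR, EXPIRE).
    On a real Redis server both commands are atomic, which is what makes the
    check safe across threads and machines."""
    def __init__(self):
        self._data = {}  # key -> [count, expires_at]
    def incr(self, key):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is None or (entry[1] is not None and entry[1] <= now):
            entry = [0, None]
            self._data[key] = entry
        entry[0] += 1
        return entry[0]
    def expire(self, key, seconds):
        entry = self._data.get(key)
        if entry is not None:
            entry[1] = time.monotonic() + seconds

def allow_submit(store, target, window_seconds=5, threshold=1):
    # Fixed window: the first INCR in a window sets the TTL; anything
    # above the threshold is rejected until the key expires.
    count = store.incr(f"submit:{target}")
    if count == 1:
        store.expire(f"submit:{target}", window_seconds)
    return count <= threshold

store = MemoryStore()
print(allow_submit(store, "user-42"))  # True: first submission passes
print(allow_submit(store, "user-42"))  # False: duplicate in the window is rejected
```

    One caveat of the INCR-then-EXPIRE pattern: if the process dies between the two calls, the key never expires; production implementations usually wrap the two steps in a Lua script so they run atomically.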

From the analysis of these duplicate-submission scenarios, you can see that rate limiting does not prevent duplicate submissions perfectly, but it provides a general mechanism that greatly reduces them, and at very low cost, far lower than hard-coding a duplicate check into every method, querying the database, or using distributed locks. Of course, handling it well may still require some cooperation between the front end and the back end.

In fact, a duplicate-data check can still be added to the business processing itself. Because rate limiting has already blocked duplicates upstream, the chance of repeated execution is greatly reduced, and the cost of the duplicate-check logic stays low. For some critical services, where a duplicate submission would cause serious trouble, you may still have to solve database query performance and distributed data-consistency problems; choose the lesser of two evils.

The implementation

Having analyzed how rate limiting can restrict duplicate submissions, let's apply it in a suitable scenario.

Suppose we have a system with separate front end and back end. Users handle business through a front-end interface, and a user normally operates from only one interface at a time. To avoid duplicate submissions as much as possible, we rate-limit the submission actions in the back-end API: each user may submit only once every 5 seconds.

Here FireflySoft.RateLimit provides the rate limiting, and the back-end API is based on ASP.NET Core WebAPI.

Install the NuGet package

Using the Package Manager console:

Install-Package FireflySoft.RateLimit.AspNetCore

Or use the .NET CLI:

dotnet add package FireflySoft.RateLimit.AspNetCore

Or add it directly to the project file:

<ItemGroup>
  <PackageReference Include="FireflySoft.RateLimit.AspNetCore" Version="2.*" />
</ItemGroup>

Write the rate-limiting rules

Register the rate-limiting service in Startup.cs and add the rate-limiting middleware.

public void ConfigureServices(IServiceCollection services)
{
    ...
    services.AddRateLimit(new InProcessFixedWindowAlgorithm(
        new[] {
            new FixedWindowRule()
            {
                Id = "1",
                ExtractTarget = context =>
                {
                    // The rate-limiting target: the user Id, assumed here to be passed in an HTTP header
                    return (context as HttpContext).Request.GetTypedHeaders().Get<string>("userId");
                },
                CheckRuleMatching = context =>
                {
                    // Only apply the rule when the current request is a "commit action"
                    var path = (context as HttpContext).Request.Path.Value;
                    if (path == "/Company/Add"
                        || path == "/Company/Update"
                        || path == "/Goods/Purchase"
                        || path == "/Goods/ChangePrice"
                        || path == "/Order/Pay"
                        || path == "/Order/Cancel")
                    {
                        return true;
                    }
                    return false;
                },
                Name = "User submission behavior limiting",
                LimitNumber = 1, // the rate-limiting threshold
                StatWindow = TimeSpan.FromSeconds(5), // the rate-limiting time window: 5 seconds
                StartTimeType = StartTimeType.FromNaturalPeriodBeign
            }
        })
    );
    ...
}

public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    ...
    app.UseRateLimit();
    ...
}

That small amount of code is enough to limit duplicate submissions. Try running it.

To use it in a distributed environment, you also need a Redis instance: change InProcessFixedWindowAlgorithm to RedisFixedWindowAlgorithm and pass in a Redis connection object; most of the rest of the code stays the same.

FireflySoft.RateLimit is an open source .NET Standard library that is simple and flexible to use. You can find the latest code on GitHub or Gitee.


OK, that's the main content of this article. What are your thoughts on using rate limiting to solve the duplicate-submission problem?

This article is published by OpenWrite!