This is Why’s 76th original article.

Start with an interview question

Some time ago, a friend in Shenzhen with two years’ experience went out interviewing and collected several offers from big tech companies. Along the way, he also wrote down the interview questions he ran into. There are quite a few of them, and I will share them with you at the end of this article.

This article focuses on one scenario question he met during the interviews:


He said he had no idea about this scenario during the interview.

To be honest, I do know request merging, and request merging under high concurrency is nothing more than request merging done quickly.

However, as far as my limited knowledge goes, handling a high-concurrency inventory deduction scenario such as a seckill (flash sale) with request merging feels a little odd to me, and not very conventional.

In the conventional, industry-standard seckill solutions, you won’t find request merging anywhere from the front end to the back end.

As I understand it, request merging fits better with queries, or with requirements that only increase a value. With inventory deduction, you can oversell if you are not careful.

Of course, it is also possible that I misread the question and jumped to the seckill scenario as soon as I saw “high-concurrency inventory deduction”.

But that doesn’t matter, and we can’t confront the interviewer directly anyway.


So I’m going to go back to a scenario that I think is reasonable, and show you how I understand request merging and high-concurrency request merging.

Request merging

Let’s set the seckill scenario aside for now.

Let’s talk about request merging in a more suitable and perhaps easier-to-understand context.

That context is the hot account.

What is a hot account?

In a third-party payment system, or at a financial institution such as a bank, every time a transaction or transfer takes place, it has to be booked against the accounts involved.

Bookkeeping generally involves two parts.

  • The trading system records information about this transaction.
  • The account system needs to increase or decrease the corresponding account balance.

If the operation on an account is very frequent, then when we operate on the account balance, there will be a problem of concurrent processing.

So how do we handle the concurrency?

Right, we can lock the account. As a result, the account goes through frequent lock and unlock operations.

This keeps the data consistent, but the problem is that as concurrency increases, the performance of the account system degrades.

Such an account is a hot account, and it becomes a performance bottleneck.

Hot accounts are a very common problem in the industry.

The general solutions I’ve seen fall into three categories:

  • Asynchronous buffered bookkeeping.
  • Setting up shadow accounts.
  • Merging multiple entries into one.

This section mainly covers the “multiple entries in one” solution, which leads us to the concept of request merging.

For the other two solutions, just a quick word.

First, asynchronous buffered bookkeeping.

Before I explain anything: look at the name, think about the scenario, and what comes to mind?

Asynchronous: does a message queue (MQ) come to mind?

And why do we introduce MQ into a system?

Asynchronous processing, system decoupling, peak shaving and valley filling.

Which of these applies to our scenario?

Peak shaving and valley filling, of course.

Assume the accounting system can handle 200 transactions per second (TPS). When requests stay below 200 per second, the accounting service can basically process each one and return immediately.

From the user’s point of view: snap, quick. I was notified that the bookkeeping succeeded, and I saw that the balance had changed.

However, at the business peak, the traffic doubles and 400 requests come in every second. For the accounting system this is a traffic peak, and the peak needs shaving. Requests start to pile up in the queue and wait their turn.

During the traffic trough, this backlog can be consumed.

Once the data has been put into the queue, the user can be told that the bookkeeping succeeded and the money is on its way.

But the problem with this solution is also obvious. What if the traffic really explodes and there is no valley to fill all day, so the queue piles up with requests that never get processed in time?

From the user’s side it looks like this: you clearly told me the bookkeeping succeeded, so why hasn’t my account balance changed? Are you trying to pocket my money? My next move is a round of complaints.

Another risk: if debit requests are held back by the peak shaving, we tell the user in advance that the operation succeeded while the actual balance change is delayed, so the account may end up overdrawn.
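The peak-shaving arithmetic above can be made concrete with a small simulation. This is only a toy sketch: the class name `PeakShavingDemo`, the method `backlogPerSecond`, and the arrival numbers are made up for illustration, and a real system would use an MQ rather than a counter.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of peak shaving: requests beyond capacity wait in a queue. */
class PeakShavingDemo {

    /**
     * Returns the backlog left in the queue at the end of each second.
     *
     * @param arrivalsPerSecond number of requests arriving in each second
     * @param capacityPerSecond how many requests the accounting system processes per second
     */
    static List<Integer> backlogPerSecond(int[] arrivalsPerSecond, int capacityPerSecond) {
        List<Integer> backlog = new ArrayList<>();
        int queued = 0;
        for (int arrivals : arrivalsPerSecond) {
            queued += arrivals;                            // new requests join the queue
            queued -= Math.min(queued, capacityPerSecond); // the system drains what it can
            backlog.add(queued);
        }
        return backlog;
    }

    public static void main(String[] args) {
        // Two peak seconds at 400 requests, followed by a trough at 100 requests.
        int[] arrivals = {100, 400, 400, 100, 100};
        System.out.println(backlogPerSecond(arrivals, 200)); // prints [0, 200, 400, 300, 200]
    }
}
```

With a capacity of 200, the backlog only shrinks once arrivals drop below 200 per second. If every second brings 400 requests, the backlog grows by 200 each second and never drains, which is exactly the “no valley to fill” problem.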

The second solution, setting up shadow accounts, actually goes in the opposite direction from this article’s topic of request merging.

The idea is to split.

A hot account is, in the end, a single-point problem. If we tackle a single-point problem with microservice thinking, what is the solution?

Split it up.

Let’s say I have 100 dollars in this hot account, and I set up 10 shadow accounts with 10 dollars each. Doesn’t that spread out our traffic? It goes from one account to 10.

The pressure is shared.

This solution is somewhat similar to inventory in the seckill scenario, where we can also split the inventory.

But the problems are also obvious.

First, getting the account balance now requires summing across the shadow accounts.

Second, suppose the user wants to deduct 11 dollars? Our total balance is enough, but no single shadow account holds enough money.

Third, the algorithm for selecting a shadow account matters a lot. Random? Round-robin? Weighted? The choice has a considerable impact on the bookkeeping success rate.
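The three problems are easy to see in a toy model. The sketch below is an illustration only: `ShadowAccounts` and its methods are hypothetical names, balances live in memory instead of a database, and random selection is just one of the possible strategies.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

/** Toy model of a hot account split into shadow accounts. */
class ShadowAccounts {
    private final AtomicLongArray balances;

    /** Splits totalBalance evenly across shardCount shadow accounts (assumes it divides evenly). */
    ShadowAccounts(long totalBalance, int shardCount) {
        balances = new AtomicLongArray(shardCount);
        for (int i = 0; i < shardCount; i++) {
            balances.set(i, totalBalance / shardCount);
        }
    }

    /** Problem 1: reading the balance means summing every shadow account (not an atomic snapshot). */
    long totalBalance() {
        long sum = 0;
        for (int i = 0; i < balances.length(); i++) {
            sum += balances.get(i);
        }
        return sum;
    }

    /** Problem 2: a deduction fails if the chosen shadow account alone cannot cover it. */
    boolean deductFrom(int shard, long amount) {
        while (true) {
            long current = balances.get(shard);
            if (current < amount) {
                return false;
            }
            if (balances.compareAndSet(shard, current, current - amount)) {
                return true;
            }
        }
    }

    /** Problem 3: the selection strategy; here it is simply random. */
    boolean deduct(long amount) {
        return deductFrom(ThreadLocalRandom.current().nextInt(balances.length()), amount);
    }
}
```

With `new ShadowAccounts(100, 10)`, a call to `deduct(11)` fails no matter which shadow account is picked, even though `totalBalance()` returns 100.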

I covered this splitting idea in a previous post, which looks at how it is applied in the JDK source code: “I looked at the high-concurrency secret in LongAdder, and it just says two words…”

OK, back to this article’s topic: multiple entries in one.

Say there is an internet-famous shop. Business is great and the shop is packed every day.

When a user scans the code to pay, the request is sent to the third-party payment company the shop is connected to.

When the payment company receives the request and completes the bookkeeping, it notifies the merchant that the user has paid successfully, and the merchant can hand over the goods.


As the shop’s business keeps growing, the pressure on the third-party payment company’s system grows too, until it can no longer handle that many concurrent transactions. As a result, the payment success rate drops, or the merchant is notified long after the user has paid.

For this merchant’s account, then, we can merge multiple entries into one.

Once a record has been written to the buffered transaction-record table, we can tell the merchant that the user has paid successfully. As for the money, rest assured: I have a scheduled task, and it will reach the account soon:


So when a user places an order, we only record the data first and don’t actually touch the account. When the scheduled task fires, it does the bookkeeping and merges multiple transactions into one.

Take this diagram below:


The merchant actually has 5 user payment records, but these 5 records correspond to a single account ledger entry. From that ledger entry, we can trace back to the five transactions.

The benefits are higher throughput, timely notification, and a better user experience. The downside is that the balance is not an exact, up-to-the-moment value.

Assuming our scheduled task summarizes once an hour, the transaction amount the merchant sees in the back office may be an hour old.

In addition, this solution suits accounts that receive money. For deductions, the balance may go negative.

I wonder if you’ve spotted the trick behind the multiple-entries-in-one solution.

Think of the buffered transaction-record table as a queue. Then the solution abstracts into a queue plus a scheduled task.

Therefore, the key to request merging is also a queue plus a scheduled task.
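A minimal in-memory sketch of this “queue plus scheduled task” idea follows. It is only an illustration under assumptions: `MerchantLedger`, `recordPayment`, and `mergeBufferedPayments` are hypothetical names, and a real payment system would keep the buffer and the ledger in database tables.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Toy model of "multiple entries in one": buffer payments, then book them as one entry. */
class MerchantLedger {

    /** One merged ledger entry that can be traced back to the payments it covers. */
    static final class LedgerEntry {
        final long amount;
        final List<String> paymentIds;

        LedgerEntry(long amount, List<String> paymentIds) {
            this.amount = amount;
            this.paymentIds = paymentIds;
        }
    }

    private static final class Payment {
        final String id;
        final long amount;

        Payment(String id, long amount) {
            this.id = id;
            this.amount = amount;
        }
    }

    private final ConcurrentLinkedQueue<Payment> buffer = new ConcurrentLinkedQueue<>();
    private final List<LedgerEntry> ledger = new ArrayList<>();
    private long balance;

    /** Called per payment: only records it, so the merchant can be notified right away. */
    void recordPayment(String paymentId, long amount) {
        buffer.add(new Payment(paymentId, amount));
    }

    /** Called by the scheduled task: drains the buffer and touches the account once. */
    synchronized void mergeBufferedPayments() {
        long sum = 0;
        List<String> ids = new ArrayList<>();
        Payment p;
        while ((p = buffer.poll()) != null) {
            sum += p.amount;
            ids.add(p.id);
        }
        if (ids.isEmpty()) {
            return;
        }
        balance += sum;
        ledger.add(new LedgerEntry(sum, ids));
    }

    synchronized long balance() {
        return balance;
    }

    synchronized List<LedgerEntry> ledger() {
        return new ArrayList<>(ledger);
    }
}
```

In a running service, `mergeBufferedPayments` would be triggered periodically, for example through `ScheduledExecutorService.scheduleAtFixedRate`. Until it runs, `balance()` lags behind the recorded payments, which is the inexact-balance downside mentioned above.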

By now, we should have a rough idea of what a request merge is, and it does have a real application scenario.

Besides my example above, there is MGET in Redis and batch insert in databases. Aren’t those request merging in real scenarios too?

For example, with Redis you can merge multiple GET calls into a single MGET call. Multiple requests are combined into one, saving network round trips.

Another real case is transfers. Some transfer channels charge per transaction, so as a third-party payment company we can first record the users’ requests in a table and settle them together after an hour. Suppose ten transfers happened in that hour: ten fees become one. The customer waits a little longer, but that is still within an acceptable range, and the operation saves real money.

High concurrency request merge

Now that we understand request merging, let’s talk about what changes when the words “high concurrency” are put in front of it.

First of all, no matter how many fancy adjectives are attached to request merging, it is still request merging.

The basic structure of a queue plus a scheduled task certainly stays the same.

Under high concurrency the volume of requests is very large, so we can simply run the scheduled task more frequently, right?

Previously, I might get 50 requests within 100ms, and I processed each one immediately as it arrived.

Now we cache the requests in a queue and run a scheduled task every 100ms.

Every 100ms, the scheduled task takes all the requests that arrived within that window and processes them together.

At the same time, we can also cap the length of the queue. For example, if the queue length reaches 50 after only 50ms, I merge and process right then, with no need to wait for the 100ms mark.

In fact, at this point the answer to high-concurrency request merging is already there. There are three key points:

First, implement it with a queue plus a scheduled task.

Second, control how often the scheduled task runs.

Third, control the length of the buffer queue.
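The three key points can be put together in a small sketch. This is a minimal illustration rather than a production implementation: the class `SimpleRequestMerger` and its methods are hypothetical names, and the batch handler is assumed to return exactly one result per key, in the same order as the keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Minimal request merger: a queue, a scheduled task, and a maximum batch size. */
class SimpleRequestMerger<K, V> {

    private static final class Pending<K, V> {
        final K key;
        final CompletableFuture<V> result = new CompletableFuture<>();

        Pending(K key) {
            this.key = key;
        }
    }

    private final LinkedBlockingQueue<Pending<K, V>> queue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "merge-timer");
        t.setDaemon(true); // do not keep the JVM alive just for the timer
        return t;
    });
    private final Function<List<K>, List<V>> batchHandler; // assumed: one result per key, in order
    private final int maxBatchSize;

    SimpleRequestMerger(Function<List<K>, List<V>> batchHandler, long windowMillis, int maxBatchSize) {
        this.batchHandler = batchHandler;
        this.maxBatchSize = maxBatchSize;
        // Key points 1 and 2: a scheduled task flushes the queue once per time window.
        timer.scheduleAtFixedRate(this::flush, windowMillis, windowMillis, TimeUnit.MILLISECONDS);
    }

    /** Queues a single request and returns a future completed when its batch is processed. */
    CompletableFuture<V> submit(K key) {
        Pending<K, V> pending = new Pending<>(key);
        queue.add(pending);
        // Key point 3: if the queue is already long enough, do not wait for the timer.
        if (queue.size() >= maxBatchSize) {
            flush();
        }
        return pending.result;
    }

    /** Drains up to maxBatchSize requests and handles them with a single batch call. */
    synchronized void flush() {
        List<Pending<K, V>> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatchSize);
        if (batch.isEmpty()) {
            return;
        }
        try {
            List<K> keys = new ArrayList<>();
            for (Pending<K, V> p : batch) {
                keys.add(p.key);
            }
            List<V> values = batchHandler.apply(keys);
            for (int i = 0; i < batch.size(); i++) {
                batch.get(i).result.complete(values.get(i));
            }
        } catch (RuntimeException e) {
            // Fail the callers instead of letting the exception cancel the scheduled task.
            for (Pending<K, V> p : batch) {
                p.result.completeExceptionally(e);
            }
        }
    }

    void shutdown() {
        timer.shutdownNow();
    }
}
```

With the numbers from the text, `new SimpleRequestMerger<>(handler, 100, 50)` would flush every 100ms, or earlier as soon as 50 requests are waiting. Callers block on, or chain from, the returned `CompletableFuture` to get their individual result.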

Once you have come up with the scheme, writing the code is the easy part. Besides, scenario questions in this kind of interview tend to be about technical solutions rather than specific code.

If the discussion does get to specific code, either the interviewer has doubts about your solution and wants to go over the feasibility of the implementation in detail, or you got it right and he moves on to a follow-up question at the code level.

In short, most of the time, once you’ve given the interviewer a solution he thinks is wrong, he won’t bother discussing code details with you. You’re not on the same wavelength anymore, so he’ll just change the subject.

If it comes to code implementation, chances are he’s waiting for you to name a framework: Hystrix.

Hystrix framework

If you know Hystrix, this is a question you can easily give a perfect answer to.

Because Hystrix supports request merging (it calls this request collapsing). I’ll show you.

Suppose we have a student information query interface that is called very frequently. We need to merge requests for this interface.

To merge requests we need at least two interfaces: one that receives a single request, and one that processes the batch formed by merging the single requests.

So we need to provide two services first:


The Controller corresponding to the interface queried according to the specified ID is as follows:


Once the service started, we used the thread pool and CountDownLatch to simulate 20 concurrent requests:
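A minimal sketch of such a simulation is shown below, under assumptions: `ConcurrentCaller` and `fire` are hypothetical names, and the request is passed in as a lambda, which in a real test would make an HTTP call to the query interface. This variant uses one latch as a start gate and a second one to wait for completion.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

/** Fires requests at (almost) the same moment using a thread pool and CountDownLatch. */
class ConcurrentCaller {

    /**
     * Runs request once for each id in 1..count, releasing all worker threads together.
     * Returns true if every request finished within the timeout.
     */
    static boolean fire(int count, IntConsumer request) {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        CountDownLatch startGate = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(count);
        for (int i = 1; i <= count; i++) {
            final int id = i;
            pool.execute(() -> {
                try {
                    startGate.await();   // every worker blocks here until the gate opens
                    request.accept(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        startGate.countDown();           // open the gate: all requests start together
        try {
            return done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ConcurrentCaller.fire(20, id -> queryStudent(id))`, where `queryStudent` stands in for whatever client call you use, sends the 20 requests together.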


As you can see from the console, 20 requests were received and 20 queries were executed:


Obviously, at this point we can merge requests: every 10 requests received are merged into one batch. This is what the Hystrix code looks like. To keep the code short, I use the annotation style:


In the image above, there are two methods. One is called getUserById, and it simply returns null, because its method body does not matter and will never be executed.

As you can see, @HystrixCollapser has a batchMethod property with the value getUserBatchById.

That is, the corresponding batch method for this method is getUserBatchById. When we request the getUserById method, Hystrix, through some logic, forwards it to getUserBatchById for us.

So we call the getUserById method again:


Similarly, we simulated 20 concurrent requests using a thread pool in conjunction with CountDownLatch, but changed the request address:


After the call, something magical happens. Let’s look at the log:


The same 20 requests were received, but they were processed in batches of 10, so only two SQL statements were executed.

From 20 SQL statements to 2: this is the power of request merging. The merged requests are even processed faster than the individual ones, which is another performance gain.

What if we only have 5 requests, instead of 10?

Don’t forget, we have a scheduled task.

In Hystrix, the scheduled task runs every 10ms by default:


We can also see that if maxRequestsInBatch is not set, it defaults to Integer.MAX_VALUE.

In other words, by default, request merging in Hystrix is driven mainly by time.

The demo really is that simple and there isn’t much code. If you’re interested, build a demo yourself and run it, then look at the Hystrix source code.

I’m just going to give you a few key points here.

First, we need to find the entry point.

As you can see, the body of our getUserById method is simply return null, which means the body doesn’t matter, because the code in it is never executed. Hystrix only intercepts the call at the method entry, caches the arguments, and forwards them to the batch method.

Then there is the @HystrixCollapser annotation on the method.

What implementation technique does that suggest?

It must be AOP.

So, we take the full path of the annotation, search for it, and poof, pretty soon, we find the method entry:

com.netflix.hystrix.contrib.javanica.aop.aspectj.HystrixCommandAspect#methodsAnnotatedWithHystrixCommand


To start debugging, set a breakpoint at the entry point:


Second, let’s look at where scheduled tasks are registered.

This is easy to find. We already know that the default value is 10ms, so we just need to follow the call chain to see where the code calls the corresponding getter:


We can also see that the timer is built on java.util.concurrent.ScheduledThreadPoolExecutor#scheduleAtFixedRate.

Third, when the specified batch size is reached, the request does not wait for the scheduled task; the merge is triggered directly:


As you can see, in the com.netflix.hystrix.collapser.RequestBatch#offer method, null is returned once the size of argumentMap reaches the maxBatchSize we specified.

If null is returned, the current batch cannot accept the request and needs to be processed immediately. The comments in the code make it clear:
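The hand-off described here can be modelled in a few lines. The sketch below is a simplified illustration and not Hystrix’s actual code: `BoundedBatch` and `BatchingSubmitter` are hypothetical names, and only the “offer returns null, so start a new batch and run the full one” idea is taken from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Simplified model of a size-limited batch (hypothetical, not Hystrix's class). */
class BoundedBatch {
    private final List<String> args = new ArrayList<>();
    private final int maxBatchSize;
    private boolean started;

    BoundedBatch(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
    }

    /** Returns the argument's position in the batch, or null if the batch cannot accept it. */
    synchronized Integer offer(String arg) {
        if (started || args.size() >= maxBatchSize) {
            return null;
        }
        args.add(arg);
        return args.size() - 1;
    }

    /** Marks the batch as started and returns its contents for one merged call. */
    synchronized List<String> start() {
        started = true;
        return new ArrayList<>(args);
    }
}

/** Submits requests to the current batch and rolls over to a new batch when it is full. */
class BatchingSubmitter {
    private final int maxBatchSize;
    private final AtomicReference<BoundedBatch> current;
    /** Each inner list is one merged call that was executed. */
    final List<List<String>> executed = new ArrayList<>();

    BatchingSubmitter(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
        this.current = new AtomicReference<>(new BoundedBatch(maxBatchSize));
    }

    void submit(String arg) {
        while (true) {
            BoundedBatch batch = current.get();
            if (batch.offer(arg) != null) {
                return; // the current batch accepted the request
            }
            // Rejected: swap in a fresh batch; whoever wins the swap executes the full one.
            if (current.compareAndSet(batch, new BoundedBatch(maxBatchSize))) {
                synchronized (executed) {
                    executed.add(batch.start());
                }
            }
        }
    }
}
```

With a maximum batch size of 2, submitting "a", "b", and "c" executes the batch [a, b] as soon as "c" is rejected, and "c" becomes the first element of the next batch, which then waits for either more requests or the timer.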


Those are the three key points. Reading the Hystrix source code takes real effort, so be prepared when you dig into it yourself.

Finally, here is the official flowchart of the request merging workflow:


Call it a day.

The interview questions

The friend in Shenzhen with two years of experience, mentioned at the beginning, put together a summary of his interview questions for me, and I’m sharing it with you too.

Java basics

  • Underlying principles of the volatile keyword
  • Description of thread pool parameters
  • Lock vs. synchronized
  • ReentrantLock: fair and unfair lock implementation, reentrancy principle
  • HashMap: when resizing is triggered (whether it is triggered when the capacity is initialized to 1000 or 10000), the mechanism, and differences between 1.7 and 1.8
  • ConcurrentHashMap in 1.7 and 1.8: optimizations and differences, and how the size method implementation differs
  • ThreadLocal: why it can leak memory
  • The purposes of the blocking queues and their differences
  • LinkedBlockingQueue: differences between add and put, and how they are used in practice
  • Pessimistic locks, optimistic locks, spin locks: use cases, implementation, pros and cons
  • Class.forName vs. loadClass
  • Thread life cycle, deadlock conditions and deadlock avoidance, state transitions (source code level)
  • The String intern method
  • Pros and cons of CAS and the solutions, the ABA problem

JVM related

  • Fragmentation solutions for CMS garbage collection
  • Common garbage collector
  • Advantages and disadvantages of JVM garbage collector CMS, differences from G1, timing into the old age
  • JVM memory model
  • JVM tuning ideas
  • GC Roots, ModUnionTable
  • Biased locks, lightweight locks, heavyweight locks: underlying principles and the upgrade process
  • jmap, jstat, top, MAT
  • Differences between CMS and G1
  • GC Roots, ModUnionTable

Redis related

  • Redis high performance reasons
  • Deployment mode of Redis
  • Underlying principles of RedisCluster
  • Redis persistence mechanism
  • Cache flushing mechanism
  • Scenarios and solutions of cache penetration, cache avalanche, and cache breakdown

SQL related

  • MyBatis interceptor uses
  • MyBatis dynamic SQL principle
  • Sub-database sub-table scheme design
  • MySQL: how phantom reads are solved (source level)
  • Gap lock range and principle
  • RR and RC
  • Default transaction isolation levels of MySQL and Oracle
  • Why MySQL uses B+ tree indexes
  • What properties of ACID are guaranteed by the redo log, binlog, and undo log write sequences
  • Optimistic database locking
  • MySQL optimization
  • Basic principles of MySQL

Spring related

  • Differences between the @Bean and @Component annotations
  • Spring AOP principle
  • How @Aspect differs from ordinary AOP
  • Which executes first: a custom interceptor or AOP
  • Web interceptors
  • DispatcherServlet principle

Dubbo related

  • Dubbo load balancing and cluster fault tolerance
  • Dubbo SPI mechanism and Route rewriting application scenarios
  • Underlying principles of Dubbo RPC
  • Implementation principle of full-link monitoring

Distributed systems related

  • Implementing a distributed lock
  • Leaky bucket algorithm, token bucket algorithm
  • Eventual consistency solutions for transactions
  • SLA
  • Distributed transaction implementation methods and differences
  • What if TCC Confirm fails?
  • Distributed lock implementations and their comparison
  • Distributed ID implementations and their comparison
  • The clock rollback problem in the Snowflake algorithm and its solutions
  • Redlock algorithm

Design patterns

  • Common design patterns
  • The state pattern
  • What problems does the chain of responsibility model solve
  • Eager and lazy singletons: pros, cons, and use cases
  • Template method pattern, strategy pattern, singleton pattern, chain of responsibility pattern

Zookeeper

  • Basic architecture design of Zookeeper
  • Zk consistency

MQ

  • Kafka sequential messages
  • MQ message idempotence
  • Kafka high performance secrets
  • Kafka high throughput principle
  • RocketMQ transactional messages, delay queues

Computer network

  • What happens when the browser enters a URL
  • Http 1.0, 1.1, 2.0 differences
  • IO multiplexing
  • TCP four-way connection termination and state transitions
  • XSS and CSRF attacks and their prevention
  • 301, 302 difference

Tomcat

  • General principles of Tomcat

Coding

  • Handwritten publish and subscribe model
  • Add large numbers (two strings)

Scenario questions

  • Implementing a reward (tipping) leaderboard
  • Request merging under high concurrency
  • Experience handling 100% CPU usage
  • Short URL system design
  • Implementing a “people nearby” feature
  • Sending 100,000 red packets within seconds
  • Delayed task implementation options and their pros and cons

Ashamed to say, I can’t answer some questions, so let’s check and fill in the gaps together.

Oh, by the way, that guy ended up with several offers from big tech companies and came to ask me which one was better.

Isn’t that question way out of my league? I’ve never worked at a big company myself. So I suspect he came to show off and ambush me, an honest small-time blogger. Still, I hope he does well at the Goose Factory (Tencent):


A rambling aside


When I woke up on Saturday morning, I saw the news that it was snowing in Beijing. Chengdu has also cooled recently. The picture above was taken when I was in Beijing.

I just missed the distinctive autumn in Beijing, and ushered in my favorite winter.

I miss the northern winter: the feeling of coming indoors, a thin mist forming on my glasses, and then being wrapped in warm air.

In Beijing, the first thing you do when you come indoors in winter is take off your thick padded coat. In Chengdu, the first thing you do is instinctively wrap your clothes tighter.

Of course, as a southerner, what I like most is when it snows. In urban Chengdu, snowy days are very rare, and when a few snowflakes do fall, they never settle on the ground.

In Beijing, when the forecast said it would snow in the evening, I would open the curtains the next morning full of anticipation, eager to see the city covered in snow.

I like putting on big boots and stepping on the soft snow with a crunch, crunch. That is the sound of the north.

It is a pity that every time it snowed in Beijing, the timing was wrong and I had no time to go to the Forbidden City. Visiting the Forbidden City in the snow is probably something countless people hope for but can only come across by chance.

I was going to describe how much I miss the snow in Beijing when I wrote this, but the guy who installed the projector called to say he was already waiting for the elevator.

Then I have to wrap it up quickly.

I have been busy with furniture and furnishings this weekend, and I am a little tired. And it occurs to me that the year will soon be over.

I haven’t started on the “My Year” piece that I write every year. I have kept it up for 7 years; will the streak really break this year?

A wave of anxiety followed. So what can you do? Take a break, then get back to work. That’s okay. Anxiety mostly comes from looking too far ahead.

That’s okay. Just walk the road you’re on.

Oh, by the way, the new house doesn’t have Internet yet, and I spent two afternoons there over the weekend, so I wrote this article on my phone’s hotspot. Keeping up the updates is my last bit of stubbornness.

All right, the installer is knocking.

That’s it.

If you find something wrong, you can mention it and I will correct it.

Thank you for reading. I stick to original content, and I welcome and appreciate your follows.

I am Why, a literary creator delayed by code. I am not a big shot, but I like sharing. I am a nice man in Sichuan who is warm and interesting.

And welcome to follow me.