Can you answer questions about distributed systems in interviews?

How do you handle a cache avalanche?

Before the failure (before Redis goes down): make Redis highly available. Master/replica replication with Sentinel, or Redis Cluster, avoids a total outage.

During the failure: the system should also keep a local cache, for example Ehcache in front of the database. A ConcurrentHashMap is enough for a small amount of map-structured data.

Then add Hystrix rate limiting and degradation so that MySQL is not overwhelmed.

After the failure: Redis persistence lets the cached data be restored quickly once Redis restarts.
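
To make the read path above concrete, here is a minimal sketch in plain Java. It is only an illustration under assumptions: the class name LayeredCache is hypothetical, the remote cache and the database are passed in as plain functions instead of real Redis or MySQL clients, and a Semaphore stands in for Hystrix rate limiting.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical read path: local cache -> remote cache -> rate-limited database.
class LayeredCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Function<String, Optional<String>> remoteCache; // stand-in for Redis; throws when it is down
    private final Function<String, String> database;              // stand-in for MySQL; null means "no row"
    private final Semaphore dbPermits;                            // crude limiter, stand-in for Hystrix
    private final String fallback;                                // degraded response

    LayeredCache(Function<String, Optional<String>> remoteCache,
                 Function<String, String> database,
                 int maxConcurrentDbCalls,
                 String fallback) {
        this.remoteCache = remoteCache;
        this.database = database;
        this.dbPermits = new Semaphore(maxConcurrentDbCalls);
        this.fallback = fallback;
    }

    String get(String key) {
        String value = local.get(key);
        if (value != null) {
            return value; // served locally, no network call
        }
        try {
            Optional<String> cached = remoteCache.apply(key);
            if (cached.isPresent()) {
                local.put(key, cached.get());
                return cached.get();
            }
        } catch (RuntimeException remoteCacheDown) {
            // Remote cache unavailable: fall through to the limited database path.
        }
        if (!dbPermits.tryAcquire()) {
            return fallback; // over the limit: degrade instead of piling onto the database
        }
        try {
            String loaded = database.apply(key);
            if (loaded == null) {
                return fallback;
            }
            local.put(key, loaded);
            return loaded;
        } finally {
            dbPermits.release();
        }
    }
}
```

The local map here never evicts anything, which a real local cache such as Ehcache would handle with size and time limits.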

How do you handle cache penetration?

Suppose there are many queries for data that does not exist in the database.

Cache the query condition together with a placeholder result such as UNKNOWN, so that repeated queries with the same condition are answered from the cache and never reach the database.
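
A minimal sketch of the placeholder idea, assuming an in-memory map in place of Redis and a function in place of the database. The class name NullCachingLookup and the placeholder string are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class NullCachingLookup {
    static final String UNKNOWN = "__UNKNOWN__"; // hypothetical placeholder for "no such row"

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> database; // returns null when the row is missing
    final AtomicInteger databaseHits = new AtomicInteger();

    NullCachingLookup(Function<String, String> database) {
        this.database = database;
    }

    /** Returns the value, or null if the key is known not to exist. */
    String find(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            // A cached placeholder answers the query without touching the database.
            return UNKNOWN.equals(cached) ? null : cached;
        }
        databaseHits.incrementAndGet();
        String loaded = database.apply(key);
        cache.put(key, loaded == null ? UNKNOWN : loaded);
        return loaded;
    }
}
```

In a real cache the placeholder would normally get a short expiration time, so that a row created later becomes visible.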

How does Dubbo work?

Service registration, the registry, the consumer, proxy-based communication, and load balancing

P.S. Dubbo is divided into 10 layers:

1. Service layer, the interface layer: the interfaces that we define and implement

2. Config layer, the configuration layer: every framework needs to provide configuration

3. Proxy layer, the service proxy layer: Dubbo generates a proxy for both consumers and providers, and network communication happens between the proxies

4. Registry layer, the service registration layer: the provider registers itself as a service, and the consumer looks it up in the registry in order to call it

5. Cluster layer, the cluster layer: a provider can be deployed on multiple machines to form a cluster

6. Monitor layer, the monitoring layer: collects statistics on consumer calls to providers

7. Protocol layer, the remote invocation layer: responsible for the network communication between a specific provider and the consumer calling its interface

8. Exchange layer, the information exchange layer: encapsulates the request-response pattern and converts synchronous calls to asynchronous ones

9. Transport layer, the network transport layer: abstracts Mina and Netty behind a unified interface

10. Serialize layer, the data serialization layer

Can communication continue when the registry is down?

Yes. At initialization the consumer pulls the providers' information into a local cache, so it can keep calling them.
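
The local cache idea can be sketched as follows. This is a simplified illustration, not Dubbo's actual implementation: the class ProviderDirectory is hypothetical, and the registry is modelled as a function that throws when it is unreachable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class ProviderDirectory {
    private final Map<String, List<String>> cachedProviders = new ConcurrentHashMap<>();
    private final Function<String, List<String>> registry; // hypothetical lookup; throws when the registry is down

    ProviderDirectory(Function<String, List<String>> registry) {
        this.registry = registry;
    }

    List<String> providersOf(String service) {
        try {
            // Registry reachable: refresh the local copy.
            List<String> fresh = new ArrayList<>(registry.apply(service));
            cachedProviders.put(service, fresh);
            return Collections.unmodifiableList(fresh);
        } catch (RuntimeException registryDown) {
            // Registry unreachable: keep using the last known provider list.
            List<String> cached = cachedProviders.get(service);
            return cached == null
                    ? Collections.<String>emptyList()
                    : Collections.unmodifiableList(cached);
        }
    }
}
```

A service that was never looked up before the outage has no cached providers, so only services that were already in use keep working.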

What communication and serialization protocols does Dubbo support?

Protocol + serialization:

1. dubbo protocol (the default): a single long-lived connection, NIO asynchronous communication, Hessian as the serialization protocol. It suits high concurrency with small payloads (under 100 KB per request; large payloads on the single long-lived connection reduce concurrency)

2. RMI protocol: Java binary serialization, multiple short-lived connections, suitable for file transfer

3. Hessian protocol: cross-language support but slow serialization, multiple short-lived connections, suitable for file transfer

4. HTTP protocol: JSON serialization

5. WebService protocol: SOAP text serialization

What load-balancing, high-availability, and dynamic proxy strategies does Dubbo support?

Load balancing strategies:

1. Random (the default): calls are distributed randomly, and a weight can be configured for each provider

2. Round robin: calls are distributed evenly across providers

3. Least active: adaptive; slower machines are detected automatically and receive fewer requests

4. Consistent hash: requests with the same parameters always go to the same machine
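
As an illustration of the default strategy, here is a minimal sketch of weighted random selection. It is not Dubbo's own code: the names WeightedRandomBalancer and Provider are hypothetical, and the Random is injected so the behaviour can be tested.

```java
import java.util.List;
import java.util.Random;

class WeightedRandomBalancer {
    static final class Provider {
        final String address;
        final int weight;

        Provider(String address, int weight) {
            this.address = address;
            this.weight = weight;
        }
    }

    private final Random random;

    WeightedRandomBalancer(Random random) {
        this.random = random;
    }

    /** Picks a provider with probability proportional to its weight. */
    Provider select(List<Provider> providers) {
        int total = 0;
        for (Provider p : providers) {
            total += p.weight;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("no provider with positive weight");
        }
        int offset = random.nextInt(total); // uniform in [0, total)
        for (Provider p : providers) {
            offset -= p.weight;
            if (offset < 0) {
                return p; // the offset landed inside this provider's weight range
            }
        }
        throw new IllegalStateException("unreachable when weights are non-negative");
    }
}
```

With weights 3 and 1, the first provider owns offsets 0 to 2 and the second owns offset 3, so the first is chosen about three times as often.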

Cluster fault tolerance strategies:

1. Failover (the default): when a call fails, automatically retry on another machine

2. Failfast: a single call; if it fails, the failure is returned immediately

3. Failsafe: exceptions are ignored

4. Failback: a failed request is recorded in the background and resent periodically, which suits writes to a message queue

5. Forking: invoke multiple providers in parallel and return as soon as one succeeds

6. Broadcast: invoke all providers one by one
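
The first three strategies can be sketched in a few lines. This is an illustration only, not Dubbo's implementation: ClusterInvoker is a hypothetical class, providers are plain address strings, and the remote call is a function that throws on failure.

```java
import java.util.List;
import java.util.function.Function;

class ClusterInvoker {
    /** Failover: try providers in order until one succeeds, at most 1 + retries attempts. */
    static <T> T failover(List<String> providers, Function<String, T> call, int retries) {
        RuntimeException last = null;
        int attempts = Math.min(providers.size(), retries + 1);
        for (int i = 0; i < attempts; i++) {
            try {
                return call.apply(providers.get(i));
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next provider
            }
        }
        throw last != null ? last : new IllegalStateException("no providers available");
    }

    /** Failfast: a single attempt; any failure goes straight back to the caller. */
    static <T> T failfast(List<String> providers, Function<String, T> call) {
        if (providers.isEmpty()) {
            throw new IllegalStateException("no providers available");
        }
        return call.apply(providers.get(0));
    }

    /** Failsafe: swallow the failure and return a default value instead. */
    static <T> T failsafe(List<String> providers, Function<String, T> call, T defaultValue) {
        try {
            return failfast(providers, call);
        } catch (RuntimeException ignored) {
            return defaultValue;
        }
    }
}
```

Failover is the usual choice for reads, while failfast is safer for non-idempotent writes, because a retried write might be applied twice.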

Dynamic proxy policy:

By default, proxy classes are created with Javassist dynamic bytecode generation. You can plug in your own dynamic proxy strategy through the SPI extension mechanism

What is SPI and how does Dubbo use SPI?

SPI stands for Service Provider Interface.

For many of its components, Dubbo defines one interface with multiple implementations, and at runtime it looks up the implementation class named in the configuration. If none is configured, the default implementation is used
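
The idea of one interface, several named implementations, and a default can be sketched like this. It is a toy registry, not Dubbo's extension loader: ExtensionRegistry is a hypothetical class, and implementations are registered in code rather than discovered from configuration files.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class ExtensionRegistry<T> {
    private final Map<String, Supplier<? extends T>> factories = new ConcurrentHashMap<>();
    private final String defaultName;

    ExtensionRegistry(String defaultName) {
        this.defaultName = defaultName;
    }

    void register(String name, Supplier<? extends T> factory) {
        factories.put(name, factory);
    }

    /** Returns the implementation named in the configuration, or the default if none is configured. */
    T get(String configuredName) {
        String name = (configuredName == null || configuredName.isEmpty())
                ? defaultName
                : configuredName;
        Supplier<? extends T> factory = factories.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("no extension named " + name);
        }
        return factory.get();
    }
}
```

The caller only depends on the interface type T and a configured name, which is what lets an implementation be swapped without changing calling code.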

How do you do service governance, service degradation, and retries with Dubbo?

1. Service governance (how do you manage so many services?):

A. Automatic call-chain generation: calls between Dubbo services are recorded automatically, and the dependencies and call chains between services are generated from those records

B. Access pressure and latency statistics: the load on the system can be judged from how many times each service is called and how long each complete call chain takes

C. Service layering to avoid circular dependencies, call-chain monitoring and alerting, and service availability monitoring (success rate)

2. Service degradation

After a service is degraded, calls go to a mock implementation of the corresponding interface

3. Retry

Retry on failure and retry on timeout

How do you design idempotent distributed service interfaces (for example, how do you make sure an amount is not deducted twice)?

In essence, there are three main points to ensure idempotence:

1. There must be a unique identifier for each request;

2. After a request is processed, there must be a record showing that it has been processed, for example a transaction record inserted into the database;

3. Before processing each incoming request, check whether it has already been processed.

This can be implemented with a database or with a distributed lock.
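
The three points can be sketched in a few lines. This is an in-memory illustration under assumptions: IdempotentDeduction is a hypothetical class, and the map of processed request ids stands in for a database table with a unique key on the request id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class IdempotentDeduction {
    // requestId -> amount; stand-in for a "processed requests" table with a unique key
    private final Map<String, Long> processed = new ConcurrentHashMap<>();
    private final AtomicLong balance;

    IdempotentDeduction(long initialBalance) {
        this.balance = new AtomicLong(initialBalance);
    }

    /** Returns true if this call deducted the amount, false if the request id was already processed. */
    boolean deduct(String requestId, long amount) {
        // putIfAbsent both checks for and records the request id in one atomic step.
        if (processed.putIfAbsent(requestId, amount) != null) {
            return false;
        }
        balance.addAndGet(-amount);
        return true;
    }

    long balance() {
        return balance.get();
    }
}
```

With a real database, the record insert and the balance update would need to be in the same transaction, so that a crash between them cannot leave a request marked as processed without the deduction.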

How can the order of interface calls be guaranteed in a distributed system?

1. Use a message queue, as described in the MQ section: put all requests into a queue so that they are processed in order

2. Use a distributed lock: this guarantees strict ordering, but it reduces system throughput and concurrency
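
One way to picture the queue approach inside a single process: route every request with the same key to the same single-threaded queue, so that requests for one key run in submission order. This is a sketch under assumptions, with a hypothetical class KeyOrderedExecutor; a real system would use MQ partitions or similar rather than in-process executors.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class KeyOrderedExecutor {
    private final ExecutorService[] lanes;

    KeyOrderedExecutor(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            // Each lane is one thread draining one FIFO queue.
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Tasks submitted with the same key run one at a time, in submission order. */
    Future<?> submit(String key, Runnable task) {
        int lane = Math.floorMod(key.hashCode(), lanes.length);
        return lanes[lane].submit(task);
    }

    void shutdown() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
    }
}
```

Different keys may land on different lanes and run in parallel, which is how this keeps some concurrency while still ordering the requests that belong together.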

How would you design an RPC framework like Dubbo?

Follow Dubbo's layering:

1. Services have to be registered, which requires a registry; ZooKeeper can serve as one

2. The consumer has to get the service information from the registry, and the service may exist on multiple machines. Calls go through a dynamic proxy, a local proxy for the interface, which obtains the service address

3. A load balancing algorithm decides which machine the request is sent to

4. How is it sent? With Netty, over NIO. In what format? Serialized with Hessian

5. The provider side is similar: a dynamic proxy listens on a network port and invokes the corresponding interface implementation when a request arrives

That is the simplest outline of an RPC framework. Rate limiting, degradation, and similar features can be added on top.
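
The dynamic proxy part can be sketched with the JDK's own java.lang.reflect.Proxy. This is a deliberately tiny illustration: MiniRpc and GreetingService are hypothetical names, the "registry" is an in-memory map, and the call is dispatched inside the same process, where a real framework would serialize the request and send it over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service interface shared by consumer and provider.
interface GreetingService {
    String greet(String name);
}

class MiniRpc {
    // Interface name -> implementation; stand-in for a registry plus a network server.
    private final Map<String, Object> exported = new ConcurrentHashMap<>();

    /** Provider side: publish an implementation under its interface name. */
    <T> void export(Class<T> api, T implementation) {
        exported.put(api.getName(), implementation);
    }

    /** Consumer side: get a local proxy that forwards every call to the provider. */
    @SuppressWarnings("unchecked")
    <T> T refer(Class<T> api) {
        InvocationHandler handler =
                (proxy, method, args) -> dispatch(api.getName(), method, args);
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api}, handler);
    }

    // In a real framework this is where load balancing, serialization, and the network call happen.
    private Object dispatch(String service, Method method, Object[] args) throws Exception {
        Object target = exported.get(service);
        if (target == null) {
            throw new IllegalStateException("no provider for " + service);
        }
        return method.invoke(target, args);
    }
}
```

The consumer code only sees the GreetingService interface, which is the point of the proxy layer: the remote call looks like a local method call.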

What are the application scenarios of ZooKeeper?

Most commonly used scenarios:

1. Distributed coordination

System A sends a request to system B through MQ and registers a listener on a ZK node. Once system B has consumed and processed the message, it updates that node, and system A receives a notification

2. Distributed lock

3. Configuration management, for example as the Dubbo registry

ZK vs. Redis distributed locks: how do they compare?

1. Performance:

With a Redis distributed lock, a waiting client has to keep retrying to acquire the lock, which costs performance

With a ZK distributed lock, a client that cannot acquire the lock registers a listener and is notified when the lock is released, so the overhead is small

2. Releasing the lock when the client crashes:

Redis: if the client holding the lock crashes, the lock is only released when the key times out

ZK: the lock is an ephemeral node (creating the node acquires the lock), so when the client crashes the node disappears and the lock is released automatically

Therefore, I think the ZK distributed lock is more reliable than the Redis distributed lock, and its model is simple and easy to use

In my actual projects I use Redis distributed locks.

The Redis distributed lock algorithm is officially called RedLock. It has three key properties: mutual exclusion, no deadlock, and fault tolerance (as long as a majority of nodes are up, the lock can be acquired and released)

The most basic Redis distributed lock implementation:

Create a key in Redis to act as the lock
SET key random_value NX PX 30000 // NX: the set succeeds only if the key does not already exist. PX 30000: the key expires automatically after 30 seconds
To release the lock, delete the key with a Lua script that deletes it only if the stored value still matches

Why use a random value with SET NX?

Suppose the operation takes longer than 30 seconds. By the time you go to delete the lock, it has already expired and may have been acquired by another client. Without the random value check, you could delete a lock that someone else now holds
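
The scenario can be replayed with a small in-memory simulation. This is not a Redis client: FakeRedisNode is a hypothetical stand-in that mimics SET with NX and PX and a compare-then-delete release, and the current time is passed in explicitly so the expiry can be simulated.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a single Redis node; it only models what the lock needs.
class FakeRedisNode {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();

    /** Mimics SET key value NX PX ttl: succeeds only if the key is absent or already expired. */
    synchronized boolean setIfAbsent(String key, String value, long ttlMillis, long nowMillis) {
        Entry current = data.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false; // someone else holds an unexpired lock
        }
        data.put(key, new Entry(value, nowMillis + ttlMillis));
        return true;
    }

    /** Mimics the release script: delete the key only if it still holds the caller's value. */
    synchronized boolean deleteIfValueMatches(String key, String expectedValue, long nowMillis) {
        Entry current = data.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis) {
            data.remove(key); // nothing to release, or the lock already expired
            return false;
        }
        if (!current.value.equals(expectedValue)) {
            return false; // the lock now belongs to another client; leave it alone
        }
        data.remove(key);
        return true;
    }
}
```

If client A locks at time 0, overruns the 30 seconds, and client B locks at 31 seconds, then A's late release with its own random value returns false and B's lock survives. A release that ignored the value would have deleted B's lock.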

RedLock algorithm

Suppose there are five master instances in the Redis cluster

1. Get the current timestamp, in milliseconds

2. As in the basic implementation, try to create the lock on each master node in turn, with a short timeout for each attempt, usually tens of milliseconds

3. The lock must be created on a majority of nodes, for example 3 out of 5 (why a majority? to tolerate node failures: with 5 nodes, 2 can be down and locking still succeeds)

4. The client calculates how long it took to create the lock. If that is less than the lock's expiration time, the lock is acquired

5. If the lock could not be created, delete it from the nodes in turn

6. While someone else holds the lock, keep polling to try to acquire it
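
Steps 1 to 5 can be sketched against simulated nodes. This is a simplified illustration, not a production RedLock: RedLockSketch and LockNode are hypothetical, nodes live in memory, a flag simulates a crashed node, and the clock drift allowance of the real algorithm is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RedLockSketch {
    /** In-memory stand-in for one Redis master; up == false simulates a crashed node. */
    static final class LockNode {
        private final Map<String, String> owner = new HashMap<>();
        private final Map<String, Long> expiresAt = new HashMap<>();
        volatile boolean up = true;

        synchronized boolean tryLock(String key, String token, long ttlMillis, long now) {
            if (!up) {
                return false;
            }
            Long expiry = expiresAt.get(key);
            if (expiry != null && expiry > now) {
                return false; // held by someone and not yet expired
            }
            owner.put(key, token);
            expiresAt.put(key, now + ttlMillis);
            return true;
        }

        synchronized void unlock(String key, String token) {
            // Only the client whose token is stored may remove the lock.
            if (up && token.equals(owner.get(key))) {
                owner.remove(key);
                expiresAt.remove(key);
            }
        }
    }

    /** Returns true if the lock was created on a majority of nodes before it could expire. */
    static boolean acquire(List<LockNode> nodes, String key, String token, long ttlMillis) {
        long start = System.currentTimeMillis();                      // step 1
        int acquired = 0;
        for (LockNode node : nodes) {                                 // step 2
            if (node.tryLock(key, token, ttlMillis, System.currentTimeMillis())) {
                acquired++;
            }
        }
        boolean majority = acquired >= nodes.size() / 2 + 1;          // step 3
        long elapsed = System.currentTimeMillis() - start;            // step 4
        if (majority && elapsed < ttlMillis) {
            return true;
        }
        release(nodes, key, token);                                   // step 5
        return false;
    }

    static void release(List<LockNode> nodes, String key, String token) {
        for (LockNode node : nodes) {
            node.unlock(key, token);
        }
    }
}
```

Step 6 would be a loop around acquire with a short sleep between attempts. With five nodes and two of them down, acquire still succeeds because three nodes form a majority.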

How is a distributed session implemented?

Common schemes:

1. Tomcat + Redis (Redis can also run with Sentinel)

This builds on Tomcat's native session support: configure TomcatRedisSessionManager so that every deployed Tomcat stores its sessions in Redis.

The downside is the coupling to the web container. What if you want to migrate to Jetty? Everything has to be configured all over again, which is a hassle.

2. Spring Session + Redis

Spring Session reads and writes sessions in Redis directly and does not depend on the web container.

