preface

Hello, everyone. I am the little boy who picks up snails. Recently, a friend from our programming discussion group interviewed at Ant Financial. Below are the interview questions he was asked; let's discuss how to answer them.

  • Public account: the little boy who picks up snails
  • GitHub address

1. Do you use distributed transactions? Why did you choose that scheme, and what other schemes are there?

What is a distributed transaction

When we think of transactions, we first think of database transactions and their ACID properties: atomicity, consistency, isolation, and durability.

A distributed transaction differs from a database transaction in that its participants, the servers supporting the transaction, the resource servers, and the transaction manager are located on different nodes of a distributed system. Simply put, a distributed transaction is a transaction in a distributed system, and it exists to ensure data consistency across different database nodes.

Distributed Transaction Fundamentals

Distributed transactions require knowledge of CAP theory and BASE theory.

CAP theory

  • C (Consistency): data is consistent across multiple copies. For example, after one partition node updates a piece of data, other partition nodes read the updated data.
  • A (Availability): the services provided by the system must always be available, and every user request receives a result within a bounded time. The emphasis is on "within a bounded time" and "returns a result".
  • P (Partition tolerance): when the distributed system encounters any network partition failure, it must still provide services that satisfy consistency and availability.

CAP theory states that a distributed system can satisfy at most two of the three properties (consistency, availability, partition tolerance) simultaneously.

BASE theory

BASE theory is an extension of the AP choice in CAP: for business systems, it trades strong consistency for availability and partition tolerance. BASE is an abbreviation for Basically Available, Soft state, and Eventually consistent.

  • Basically Available: the system tolerates partial failures (such as degraded responses or loss of some functions) rather than failing globally;
  • Soft state: intermediate states are allowed, so replicas may be out of sync for a period of time;
  • Eventually consistent: the data only needs to become consistent eventually, not strongly consistent in real time.

Several solutions for distributed transactions

  • 2PC (two-phase commit): the transaction commit is split into two phases, a prepare phase and a commit phase.
  • TCC (Try, Confirm, Cancel): a compensation mechanism whose core idea is that every operation registers a corresponding confirmation (Confirm) and compensation (Cancel) operation.
  • Local message table: the core idea is to split a distributed transaction into local transactions for processing.
  • Best-effort notification, which can be implemented with the ACK mechanism of MQ.
  • Saga transactions: split a long transaction into multiple short local transactions coordinated by a Saga coordinator. If every step succeeds, the transaction completes normally; if any step fails, compensation operations are invoked once each, in reverse order.

At present, the local message table is the scheme most widely used in industry. Its core idea is to split a distributed transaction into local transactions. Take a look at the basic implementation process diagram:

For the message sender:

  • First, you need a message table that records information about message status.
  • The business data and the message table are in the same database, that is, ensure that they are both in the same local transaction.
  • After processing the business data and writing the message table in the local transaction, it is written to the MQ message queue.
  • The message is sent to the message consumer, and if it fails, it is retried.

For the message consumer:

  • Process the messages in the message queue and complete its own business logic.
  • At this point, if the local transaction was successful, the processing was successful.
  • If the local transaction fails, execution is retried.
  • If the failure is a business failure rather than a transient one, send a compensation message to the message producer to notify it to roll back.

Both producer and consumer scan the local message table periodically and resend messages that are incomplete or failed. With sound automatic reconciliation and compensation logic in place, this scheme is very practical.
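The sender-side flow above can be sketched in a few lines of Java. This is a minimal in-memory sketch under stated assumptions: the class and method names are invented for illustration, the two HashMaps stand in for tables living in the same database (so one synchronized method approximates the shared local transaction), and an ArrayDeque stands in for the MQ.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class LocalMessageTableDemo {
    enum Status { PENDING, SENT }

    static class Msg {
        final long id;
        final String payload;
        Status status = Status.PENDING;
        Msg(long id, String payload) { this.id = id; this.payload = payload; }
    }

    private final Map<Long, String> orders = new HashMap<>();    // business table
    private final Map<Long, Msg> messageTable = new HashMap<>(); // message table in the same "DB"
    private final Queue<Msg> mq = new ArrayDeque<>();            // stand-in for the MQ
    private long nextId = 1;

    /** Steps 1-2: write the business data AND the message record in one local transaction. */
    synchronized long submitOrder(String order) {
        long id = nextId++;
        orders.put(id, order);
        messageTable.put(id, new Msg(id, order));
        return id;
    }

    /** Steps 3-4: a background job scans the table and (re)sends pending messages. */
    synchronized void relayPendingMessages() {
        for (Msg m : messageTable.values()) {
            if (m.status == Status.PENDING) {
                mq.offer(m);           // retried on the next scan if sending fails
                m.status = Status.SENT;
            }
        }
    }

    synchronized Status messageStatus(long id) { return messageTable.get(id).status; }
    synchronized int mqDepth() { return mq.size(); }
}
```

Because the message record is created in the same transaction as the business data, a crash between commit and send only leaves a PENDING row, which the periodic scan picks up again.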

2. What new features are provided by JDK6, 7, and 8 respectively

New features in JDK 6

  • Desktop class (which allows a Java application to start another application locally to process URI or file requests)
  • Use JAXB2 to map objects to XML
  • Lightweight Http Server API
  • Plug-in annotation processing API (The Lombok framework is based on this feature)
  • StAX (the Streaming API for XML, added in JDK 6)

New features in JDK 7

  • switch supports String
  • Try-with-resources: resources are closed automatically
  • Integral types (byte, short, int, long) can be written as binary literals
  • Numeric literals support underscores
  • Type inference for generic instance creation, i.e. the diamond operator "<>"
  • A single catch block can catch multiple exception types, separated by "|"
  • Enhanced file system APIs (NIO.2)
  • Fork/Join framework
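Several of the JDK 7 features above can be shown in one small sketch (the class and method names are illustrative):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class Jdk7FeaturesDemo {

    // switch on String (new in JDK 7)
    static String color(String fruit) {
        switch (fruit) {
            case "apple":  return "red";
            case "banana": return "yellow";
            default:       return "unknown";
        }
    }

    // try-with-resources + multi-catch: the reader is closed automatically
    static String firstLine(String text) {
        try (BufferedReader r = new BufferedReader(new StringReader(text))) {
            return r.readLine();
        } catch (IOException | RuntimeException e) {  // multi-catch with |
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int mask = 0b1010_1010;           // binary literal with an underscore
        long billion = 1_000_000_000L;    // underscores in numeric literals
        List<String> list = new ArrayList<>(); // diamond operator
        list.add(color("apple"));
        System.out.println(firstLine("hello") + " " + mask + " " + billion + " " + list);
    }
}
```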

New features in JDK 8

  • Lambda expressions
  • Functional interfaces
  • Method references
  • Default methods
  • Stream API
  • Optional
  • Date-Time API (e.g. LocalDate)
  • Repeating annotations
  • Base64
  • New JVM features (such as Metaspace replacing the permanent generation)
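A small sketch of a few of the JDK 8 features above (class and method names are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class Jdk8FeaturesDemo {

    // Stream API + lambdas: keep even numbers, square them
    static List<Integer> evensSquared(List<Integer> nums) {
        return nums.stream()
                   .filter(n -> n % 2 == 0)
                   .map(n -> n * n)
                   .collect(Collectors.toList());
    }

    // Optional replaces explicit null checks
    static String firstOrDefault(List<String> xs) {
        Optional<String> first = xs.stream().findFirst();
        return first.orElse("none");
    }

    // Base64 finally has a standard API
    static String toBase64(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    // The new Date-Time API is immutable and much safer than java.util.Date
    static String thirtyDaysAfter(String isoDate) {
        return LocalDate.parse(isoDate).plusDays(30).toString();
    }

    public static void main(String[] args) {
        System.out.println(evensSquared(Arrays.asList(1, 2, 3, 4))); // [4, 16]
    }
}
```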

3. HTTPS principle and workflow

  • HTTPS = HTTP + SSL/TLS, that is, SSL/TLS for encryption and decryption of data, HTTP for transmission.
  • SSL, or Secure Sockets Layer protocol, is a security protocol that provides security and data integrity for network communication.
  • TLS, or Transport Layer Security, is the successor to SSL 3.0.

  1. The client initiates an HTTPS request and connects to port 443 of the server.
  2. The server must have a set of digital certificates (including the public key, certificate authority, and expiration date).
  3. The server sends its own digital certificate to the client (the public key is in the certificate, and the private key is held by the server).
  4. After receiving the digital certificate, the client verifies the validity of the certificate. If the certificate is authenticated, a random symmetric key is generated and encrypted using the certificate’s public key.
  5. The client sends the encrypted key to the server.
  6. After the server receives the ciphertext key sent by the client, it uses its private key to decrypt it asymmetrically. After decrypting, it gets the client’s key, and then uses the client’s key to symmetric encrypt the returned data. The transmitted data is ciphertext.
  7. The server returns the encrypted ciphertext to the client.
  8. After receiving the data, the client uses its own key to decrypt it symmetrically and get the data returned by the server.
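The key exchange in steps 4 to 6 can be sketched with the JDK's own crypto classes. This is a simplified sketch, not real TLS: certificate validation is skipped, a freshly generated RSA key pair stands in for the certificate's keys, and the class and method names are invented for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;

class HybridEncryptionDemo {

    /** Round-trips a message through the HTTPS-style hybrid scheme. */
    static String roundTrip(String message) {
        try {
            // Server side: key pair standing in for the certificate's keys
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            KeyPair serverKeys = kpg.generateKeyPair();

            // Client side: generate a random symmetric session key (step 4)
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            SecretKey sessionKey = kg.generateKey();

            // Client encrypts the session key with the server's public key (step 5)
            Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            rsa.init(Cipher.ENCRYPT_MODE, serverKeys.getPublic());
            byte[] wrappedKey = rsa.doFinal(sessionKey.getEncoded());

            // Server decrypts it with its private key (step 6)
            rsa.init(Cipher.DECRYPT_MODE, serverKeys.getPrivate());
            SecretKey recovered = new SecretKeySpec(rsa.doFinal(wrappedKey), "AES");

            // Server encrypts the response symmetrically; the client decrypts it
            Cipher aes = Cipher.getInstance("AES");
            aes.init(Cipher.ENCRYPT_MODE, recovered);
            byte[] ciphertext = aes.doFinal(message.getBytes(StandardCharsets.UTF_8));
            aes.init(Cipher.DECRYPT_MODE, sessionKey);
            return new String(aes.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The expensive asymmetric step is used only once, to agree on the cheap symmetric key that encrypts the actual traffic.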

4. How does volatile work in the Java Memory Model (JMM)?

The volatile keyword is the lightest synchronization mechanism provided by the Java Virtual Machine. It is used as a modifier to modify variables. It guarantees visibility of variables to all threads and disallows instruction reordering, but does not guarantee atomicity.

How does volatile guarantee visibility? Let’s take a look at the Java Memory Model (JMM)

  • The Java Virtual Machine specification seeks to define a Java memory model to mask memory access differences across hardware and operating systems in order to achieve consistent memory access for Java programs across platforms.
  • For better performance, the Java memory model does not restrict the execution engine from using specific registers or caches of the processor to work with main memory, nor does it restrict the compiler from tuning code sequence optimizations. So the Java memory model has cache consistency problems and instruction reorder problems.
  • The Java memory model states that all variables are stored in main memory, and that each thread has its own working memory. Variables include instance variables and static variables, but not local variables, which are thread private.
  • A thread’s working memory holds a main memory copy of the variables used by the thread. All operations on variables must be performed in the working memory, not directly in main memory. And each thread cannot access the working memory of another thread.

Volatile variables guarantee that new values are synchronized back to main memory immediately and are flushed from main memory immediately before each use, so we say that volatile guarantees the visibility of variables used by multiple threads.

Instruction reordering is when the compiler and CPU may reorder instructions during program execution in order to improve performance. How does volatile prevent reordering? In the Java language, there is a happens-before principle.

  • Program order rule: within a thread, actions earlier in program order happen-before actions later in program order.
  • Monitor lock rule: an unlock on a lock happens-before a subsequent lock on the same lock.
  • Volatile variable rule: a write to a volatile variable happens-before a subsequent read of that variable.
  • Thread start rule: a call to Thread.start() happens-before every action of the started thread.
  • Thread termination rule: all actions in a thread happen-before the thread terminates; we can detect termination via Thread.join() returning or Thread.isAlive() returning false.
  • Thread interruption rule: a call to Thread.interrupt() happens-before the interrupted thread's code detects that the interrupt occurred.
  • Finalizer rule: the completion of an object's initialization happens-before the start of its finalize() method.
  • Transitivity: if action A happens-before action B, and B happens-before C, then A happens-before C.
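The volatile variable rule and the visibility guarantee can be seen in a tiny two-thread demo (class and method names are illustrative). Because ready is volatile, the ordinary write to payload that precedes the volatile write is guaranteed visible to the reader after its volatile read; without volatile, the reader could spin forever or observe a stale payload.

```java
class VolatileFlagDemo {
    private static volatile boolean ready = false;
    private static int payload = 0;

    /** Starts a reader thread, publishes a value through the volatile flag,
     *  and returns what the reader observed. */
    static int runOnce() {
        ready = false;
        payload = 0;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) { /* spin until the volatile write becomes visible */ }
            seen[0] = payload; // guaranteed to see 42 by happens-before
        });
        reader.start();
        payload = 42;  // ordinary write...
        ready = true;  // ...published by the volatile write (cannot be reordered past it)
        try {
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```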

In fact, volatile's visibility guarantee and its prohibition of instruction reordering are both implemented with memory barriers. Let's look at some demo code for volatile:

public class Singleton {
    private volatile static Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}

If you inspect the JIT-compiled assembly of this code, you will find that the write to the volatile field instance is accompanied by a lock-prefixed instruction.

The lock directive acts as a memory barrier and guarantees the following:

  1. Instructions that follow the barrier cannot be reordered to a position before it.
  2. The processor's cached writes are flushed to main memory.
  3. When the write is flushed, the corresponding cache lines in other processors are invalidated.

Points 2 and 3 are an indication that volatile guarantees visibility, and point 1 is an indication that reordering instructions is prohibited. What is the memory barrier?

Memory barriers fall into four categories (Load denotes a read instruction, Store a write instruction):

  • LoadLoad barrier (Load1; LoadLoad; Load2): ensures the data read by Load1 is loaded before the data to be read by Load2 is accessed.
  • StoreStore barrier (Store1; StoreStore; Store2): ensures Store1's write is visible to other processors before Store2 executes.
  • LoadStore barrier (Load1; LoadStore; Store2): ensures the data read by Load1 is loaded before Store2's write executes.
  • StoreLoad barrier (Store1; StoreLoad; Load2): ensures Store1's write is visible to all processors before Load2's read executes.

To implement the memory semantics of volatile, the Java memory model takes the following conservative approach

  • Insert a StoreStore barrier before each volatile write.
  • Insert a StoreLoad barrier after each volatile write.
  • Insert a LoadLoad barrier after each volatile read.
  • Insert a LoadStore barrier after each volatile read operation.

If memory barriers still feel abstract, remember the key point: a memory barrier forces the instructions before it to complete first, which prevents instruction reordering; it also flushes cached writes to main memory and invalidates stale copies in other processors' caches, which guarantees visibility.

5. Let’s talk about the 7-layer network model. Why does TCP require a three-way handshake

Computer network architecture is commonly described with three reference models: the OSI seven-layer model, the TCP/IP four-layer model, and the five-layer model, as shown in the figure below:

The seven-layer model, known as the Open Systems Interconnection (OSI) model, is a standard established by the International Organization for Standardization (ISO) for interconnecting computers and communication systems.

  • Application layer: an interface between network services and end users. Common protocols are HTTP, FTP, SMTP, SNMP, DNS.
  • Presentation layer: data representation, security, compression. To ensure that information sent by the application layer of one system can be read by the application layer of another system.
  • Session layer: Establishes, manages, and terminates sessions that correspond to host processes and refer to ongoing sessions between a local host and a remote host.
  • Transport layer: defines the port number of the protocol that transmits data, as well as flow control and error verification. Protocols include TCP and UDP.
  • Network layer: logical addressing and path selection between different networks; protocols include IP, ICMP, and IGMP.
  • Data link layer: Establishes data links between adjacent nodes on the basis of providing bitstream services at the physical layer.
  • Physical layer: establishes, maintains, and disconnects physical connections.

6. How thread pools work

If the interviewer asks us to explain how thread pools work, the following flow chart will do the job:

To visualize thread pool execution and help you understand it better, let me use a metaphor:

  • Core threads are likened to company employees
  • Non-core threads are likened to outsourced employees
  • Blocking queues are compared to demand pools
  • Submitting a task is compared to making a request

  • When the product manager raises a requirement, the regular employees (core threads) pick it up first (execute the task).
  • If all the regular employees are busy (i.e., the number of core threads is full), the product manager puts the requirement into the requirements pool (the blocking queue).
  • What if the requirements pool (blocking queue) is also full, but requirements keep coming? Outsourced employees (non-core threads) are brought in to handle them.
  • If even the outsourced employees are all busy (the maximum number of threads is reached), the rejection policy is executed.
  • When an outsourced employee finishes and stays idle for keepAliveTime, they leave the company.
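In code, the metaphor maps directly onto the arguments of ThreadPoolExecutor's constructor; the numbers below are illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolDemo {
    /** 2 "regular employees", up to 4 workers in total, a "requirements pool"
     *  of 2, and idle "outsourced employees" released after 60 seconds. */
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                2,                                     // corePoolSize: regular employees
                4,                                     // maximumPoolSize: regular + outsourced
                60, TimeUnit.SECONDS,                  // keepAliveTime for non-core threads
                new ArrayBlockingQueue<>(2),           // blocking queue: the requirements pool
                new ThreadPoolExecutor.AbortPolicy()); // rejection policy when everything is full
    }
}
```

With these settings, tasks 1-2 run on core threads, tasks 3-4 wait in the queue, tasks 5-6 spawn non-core threads, and a 7th concurrent task is rejected with RejectedExecutionException.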

7. How do you achieve high availability of your database?

High availability (HA) is one of the factors that must be considered when designing a distributed system architecture. It usually means reducing, through design, the time during which the system cannot provide service. Single-node deployment cannot be highly available because of the single point of failure; high availability requires multiple nodes. When designing a highly available MySQL architecture, we need to consider the following:

  • If the database node breaks down, ensure that services are not affected by the breakdown.
  • The data of the secondary database node should be as consistent as possible with the data of the master node in real time, at least to ensure the final consistency.
  • Data cannot be lost during database node switchover.

7.1 Primary/Secondary or Semi-synchronous Replication

Use a dual-node database and set up one-way or two-way semi-synchronous replication. The structure is as follows:

It is often used together with third-party software such as a proxy (e.g. HAProxy) and Keepalived, which monitor the health of the database and execute a series of management commands. If the primary database fails, the standby is promoted and service continues.

The advantage of this solution is that the architecture and deployment are relatively simple, and failover can be performed directly when the primary goes down. The disadvantage is that it depends entirely on semi-synchronous replication, which can degrade to asynchronous replication and then no longer guarantees data consistency. In addition, the high availability of HAProxy and Keepalived themselves also has to be considered.

7.2 Optimization of Semi-synchronous Replication

The semi-synchronous replication mechanism is reliable and can ensure data consistency. However, if the network fluctuates, semi-synchronous replication times out and switches to asynchronous replication, under which data consistency cannot be guaranteed. You can therefore optimize on top of semi-synchronous replication to keep it semi-synchronous as much as possible, for example with a dual-channel replication scheme.

  • Advantages: the architecture and deployment are relatively simple, and failover can happen directly when the host goes down. Compared with the plain semi-synchronous replication of scheme 1, data consistency is better guaranteed.
  • Disadvantages: it requires modifying the MySQL kernel source code or exploiting the MySQL communication protocol, and it does not fundamentally solve the data consistency problem.

7.3 Ha Architecture Optimization

To ensure high availability, the primary/secondary pair can be extended to a database cluster, with ZooKeeper managing the cluster. ZooKeeper uses a distributed consensus algorithm to keep cluster state consistent and to avoid split-brain caused by network partitions.

  • Advantages: High availability and scalability of the entire system are guaranteed. The system can be expanded to a large-scale cluster.
  • Disadvantages: Data consistency still relies on native mysql semi-synchronous replication; Introducing Zookeeper makes the system logic more complex.

7.4 Shared Storage

Shared storage decouples database servers from storage devices. Data synchronization between different databases does not depend on the native replication function of MySQL, but ensures data consistency through disk data synchronization.

DRBD Disk replication

DRBD is a software-implemented, shared-nothing, storage replication solution that mirrors the contents of block devices between servers. Data is mirrored on disks, partitions, and logical volumes between servers. When data is written to the local disk, the data is also sent to the disk of another host on the network. In this way, data on the local host (active node) and the remote host (standby node) can be synchronized in real time. The common architecture is as follows:

When the local host is faulty, the remote host still retains the same data, which ensures data security.

  • Advantages: Simple deployment, reasonable price, and strong data consistency
  • Disadvantages: Greatly affects I/O performance. The secondary library does not provide read operations

7.5 Distributed Protocol

Distributed protocol can solve the data consistency problem well. A common deployment scheme is MySQL Cluster, which is an official cluster deployment scheme. It uses the NDB storage engine to back up redundant data in real time to achieve high availability and data consistency of the database. As follows:

  • Advantages: do not rely on third-party software, can achieve strong data consistency;
  • Disadvantages: Complex configuration; Need to use NDB storage engine; At least three nodes;

8. How to ensure that the latest data can be read from the database when read and write are separated?

Database read and write separation, mainly to solve high concurrency, improve system throughput. Look at the read-write split database model:

  • A write request writes data directly to the primary and then synchronizes data to the secondary
  • Generally, a read request is directly read from the secondary database, except for a forced read request from the primary database

In scenarios with high concurrency or a poor network, if primary-to-secondary synchronization lags significantly, read requests sent to the secondary will return stale data. The simplest fix is to force such reads to the primary. A more flexible approach is a cache tag:

  • A initiates a write request, updates the primary database, and sets a tag in the cache indicating that the data has been updated. The tag format is userId + businessId.
  • The tag is set with an expiration time (an estimate of the primary-to-secondary synchronization delay).
  • B initiates a read request and first checks whether the update tag exists in the cache.
  • If the tag exists, the read goes to the primary; if not, it goes to the secondary.

This solution solves the data inconsistency problem, but each request has to deal with the cache first, which will affect the system throughput.
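The routing rule above can be sketched with an in-memory map standing in for the cache; in practice the tag would live in Redis, set with SET key value EX delay. All names here are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ReadWriteRouter {
    private final Map<String, Long> tags = new ConcurrentHashMap<>(); // key -> expiry in millis
    private final long replicationLagMs; // estimated primary-to-secondary sync delay

    ReadWriteRouter(long replicationLagMs) {
        this.replicationLagMs = replicationLagMs;
    }

    /** Called on every write: mark "this key was just updated on the primary". */
    void onWrite(String userId, String bizId, long nowMs) {
        tags.put(userId + ":" + bizId, nowMs + replicationLagMs);
    }

    /** Called on every read: route to the primary while the tag is still live. */
    String routeRead(String userId, String bizId, long nowMs) {
        Long expiry = tags.get(userId + ":" + bizId);
        if (expiry != null && nowMs < expiry) {
            return "master"; // recent write: the secondary may still be stale
        }
        return "slave";      // no recent write: the secondary is safe to read
    }
}
```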

9. How to ensure that MySQL data is not lost?

MySQL is a relational database that uses a Write-Ahead Logging (WAL) strategy. As long as the binlog and redo log are persisted to disk, data will not be lost even if MySQL restarts abnormally.

Binlog

A binlog, also known as a binary log, records all operations in which the database performs changes, except for operations such as querying select. Generally used for recovery and replication. It comes in three formats: Statement, Mixed, and Row.

  • Statement: every SQL statement that modifies data is recorded in the binlog. Not recommended.
  • Row: records the content of each row before and after the change. Recommended.
  • Mixed: a mixture of statement and row modes. Not recommended.

What is the writing mechanism of binlog?

During transaction execution, logs are first written to the binlog cache; when the transaction commits, the binlog cache is written to the binlog file.

The system allocates a binlog cache to each client thread, sized by the parameter binlog_cache_size. If a transaction's binlog exceeds this threshold, it is temporarily spilled to disk. When the transaction commits, the complete transaction in the binlog cache is persisted to the binlog file and the cache is cleared.

Binlog writes files in two ways: write and fsync:

  • Write: Indicates that logs are written to the page cache of the file system. Data is not persisted to disk and therefore is faster.
  • Fsync, the actual write to the disk, that is, persisting data to the disk.

The timing of write and fsync is controlled by the sync_binlog parameter: 0 means only write on commit and let the operating system decide when to fsync; 1 means fsync on every transaction commit; N (N > 1) means fsync once every N transactions.

If I/O performance bottlenecks occur, you can set sync_binlog to a large value. For example, set it to 100 to 1000. However, there is a risk of data loss. When the host restarts unexpectedly, N recently committed transaction binlogs will be lost.

Redo log

The redo log, also known as the redo log file, records only the changes a transaction makes to data pages; it records the values after modification. A redo log record can be in three states:

  • in the redo log buffer, i.e., in the MySQL process's memory;
  • written to the file system's page cache, but not yet persisted to disk (no fsync);
  • persisted to disk (hard disk).

Writing to the redo log buffer is fast; writing to the page cache is also fast; persisting to disk is much slower.

In order to control the redo log write policy, Innodb uses a different policy based on the innodb_flush_log_at_trx_commit parameter, which has three different values:

    1. 0: on each transaction commit, the redo log is left only in the redo log buffer;
    2. 1: on each transaction commit, the redo log is persisted (fsync) directly to disk;
    3. 2: on each transaction commit, the redo log is written only to the page cache.

Of the three modes, 0 performs best but is the least safe: if the MySQL process crashes, up to one second of data is lost. 1 is the safest but has the greatest performance impact. With 2, flushing to disk is mostly controlled by the operating system: if only the MySQL process goes down, the data is unaffected, but if the host crashes abnormally, data can still be lost.

10. How to design a seckill system under high concurrency?

When designing a seckill system, consider problems such as instantaneous traffic spikes, overselling, and malicious or duplicate requests:

How to solve these problems?

  • Page statics
  • Button to gray control
  • Single duty of service
  • Add salt to seckill link
  • Current limiting
  • A distributed lock
  • MQ asynchronous processing
  • Current limit & downgrade & circuit breaker

Page statics

Most of the content of a seckill activity page is fixed, such as the product name and product images, so the page can be made static to reduce the requests hitting the server. Seckill users are spread across the country, some in Shanghai, some in Shenzhen, with very different network conditions. A Content Delivery Network (CDN) lets users fetch the activity page from a nearby node as fast as possible.

Button to gray control

Before the seckill activity starts, the button should generally be disabled and become clickable only when the time is up. This prevents users from frantically hitting the server in the last few seconds before the start, which could bring the server down before the activity even begins.

Single duty of service

We all know the idea behind microservice design: break functionality into modules, group related functions together, and deploy them in a distributed way.

For example, user login belongs in a user service, orders in an order service, gifts in a gift service, and so on. Likewise, the business logic related to seckill can be grouped into its own seckill service, with a separate seckill database created for it.

One advantage of this single responsibility is that if the seckill service cannot withstand the high concurrency and its database or service crashes, the other services in the system are unaffected.

Add salt to seckill link

If the link is exposed in plain text, someone can obtain the request URL and fire requests before the activity starts. Therefore, the seckill link needs to be salted. You can make the URL dynamic, for example by MD5-hashing a random string to build the URL.
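A sketch of such a salted, dynamic seckill URL. The path layout, method names, and salt handling are illustrative assumptions; a real system would issue the token from the server only after the activity starts and usually tie it to the user's session.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SeckillUrlDemo {

    static String md5Hex(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The server issues a one-time token only after the activity starts. */
    static String buildUrl(long itemId, String userId, String serverSalt) {
        String token = md5Hex(itemId + ":" + userId + ":" + serverSalt);
        return "/seckill/" + itemId + "/" + token + "/execute";
    }

    /** The execute endpoint recomputes the token and rejects mismatches. */
    static boolean verify(long itemId, String userId, String serverSalt, String token) {
        return md5Hex(itemId + ":" + userId + ":" + serverSalt).equals(token);
    }
}
```

Because the salt never leaves the server, a client cannot precompute the token and fire requests ahead of time.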

Current limiting

There are two traffic limiting methods: nginx traffic limiting and Redis traffic limiting.

  • In order to prevent a user from requesting too much, we can limit the flow to the same user.
  • To prevent scalpers from simulating several user requests, we can limit the flow of one IP.
  • To prevent someone from using the proxy and changing the IP request with each request, we can limit the traffic on the interface.
  • To prevent the system from being overwhelmed by an instantaneous traffic spike, components such as Alibaba's Sentinel or Netflix's Hystrix can also be used for rate limiting.
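A minimal fixed-window counter, keyed by user id or IP, shows the idea. This is an in-process sketch with invented names; in production the same logic would live in Nginx (limit_req), a Redis INCR + EXPIRE script, or Sentinel.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FixedWindowLimiter {
    private final int limit;        // max requests per window per key
    private final long windowMs;    // window length in milliseconds
    // key -> {windowStart, countInWindow}
    private final Map<String, long[]> windows = new ConcurrentHashMap<>();

    FixedWindowLimiter(int limit, long windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
    }

    /** Returns true if the request identified by key is allowed at time nowMs. */
    synchronized boolean allow(String key, long nowMs) {
        long[] w = windows.computeIfAbsent(key, k -> new long[]{nowMs, 0});
        if (nowMs - w[0] >= windowMs) { // window expired: start a new one
            w[0] = nowMs;
            w[1] = 0;
        }
        if (w[1] < limit) {
            w[1]++;
            return true;
        }
        return false; // over the limit: reject
    }
}
```

The same structure covers all three bullets above: key by userId to limit a single user, by IP to limit scalpers, or by interface name to limit the endpoint as a whole.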

A distributed lock

Redis distributed locks can be used to solve the oversold problem.

Use Redis's SET key value NX EX command with a unique random value, then verify that value before deleting the lock:

if ("OK".equals(jedis.set(key_resource_id, uni_request_id, "NX", "EX", 100))) { // lock acquired
    try {
        // do something
    } catch (Exception e) {
        // handle the business exception
    } finally {
        // release the lock only if it was added by the current client
        if (uni_request_id.equals(jedis.get(key_resource_id))) {
            jedis.del(key_resource_id);
        }
    }
}

In this snippet, checking whether the current client holds the lock and releasing the lock are not one atomic operation. Between jedis.get() and jedis.del(), the lock may expire and be acquired by another client, in which case we would delete a lock that no longer belongs to us.

For more rigor, lua scripts are often used instead. The lua script is as follows:

if redis.call('get',KEYS[1]) == ARGV[1] then 
   return redis.call('del',KEYS[1]) 
else
   return 0
end;

MQ asynchronous processing

If the instantaneous traffic is particularly heavy, message queues can be used to shave the peak and process requests asynchronously: when a request comes in, it is put on the message queue, and consumers take requests off the queue and process them at their own pace.
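The peak-shaving idea in miniature, with an in-process bounded queue standing in for the MQ (names illustrative): the web tier only enqueues and returns immediately, and requests beyond the queue's capacity are rejected instead of crushing the backend.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class AsyncOrderDemo {
    private final BlockingQueue<String> queue;

    AsyncOrderDemo(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Web tier: enqueue and return immediately; reject when the queue is full. */
    boolean accept(String request) {
        return queue.offer(request);
    }

    /** Consumer side: drain one request at its own pace (null if none pending). */
    String consumeOne() {
        return queue.poll();
    }

    int backlog() {
        return queue.size();
    }
}
```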

Current limit & downgrade & circuit breaker

  • Rate limiting: restrict requests to prevent an excessive volume from crushing the server;
  • Degradation: if the seckill service has problems, degrade it so other services are not affected;
  • Circuit breaking: when a service keeps failing, break the circuit; circuit breaking is usually combined with degradation.

References and thanks

  • Five common high-availability solutions for MySQL
  • How to keep data consistent when reads and writes are separated
  • The "Let's Go to a Big Company" series: seckill system design
  • Geek Time: 45 Lectures on MySQL in Action
  • How does MySQL ensure data is not lost?