Talk about how SpringMVC works and common annotations
1. The user sends a request to the server, where it is intercepted by SpringMVC's front controller, the DispatcherServlet.
2. The DispatcherServlet parses the request URL (Uniform Resource Locator) to obtain the URI (request resource identifier). Then, based on the URI, it calls HandlerMapping (configured via XML or annotations) to find everything configured for the Handler: the Handler object itself and its corresponding interceptors, all wrapped into a HandlerExecutionChain object that is returned to the DispatcherServlet.
3. Based on the obtained Handler, the DispatcherServlet asks a suitable HandlerAdapter (which can deal with many kinds of Handler) to process it, and the adapter invokes the method of the Handler that actually handles the request.
4. After the Handler finishes executing, it returns a ModelAndView object to the DispatcherServlet.
5. Based on the returned ModelAndView, the DispatcherServlet asks the ViewResolver (view resolver) to resolve the logical view into a real view and return it to the front controller.
6. The view is rendered and the response is returned to the client.
What's the difference between a URL and a URI?
A URI (Uniform Resource Identifier) is the umbrella term: it includes both URNs and URLs. A URN is like a person's ID number: it uniquely identifies the person but says nothing about where to find them. A URL is like a mailing address: it both identifies the recipient and tells the mailman how to deliver the package.
Component annotations:
@Component: added before a class definition; the Spring container recognizes it and turns the class into a bean.
@Repository: annotates DAO implementation classes (a specialized @Component).
@Service: annotates the business logic (service) layer (a specialized @Component).
@Controller: annotates the control layer (a specialized @Component).
Request and parameter annotations:
@RequestMapping: handles request address mapping; applicable to classes and methods.
@RequestParam: binds the value of a passed request parameter.
@PathVariable: binds the value of a path variable.
@ResponseBody: used on a method to return the whole result in some format, such as JSON or XML.
@CookieValue: retrieves a cookie value from the request.
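A minimal controller sketch wiring these annotations together (assuming Spring MVC is on the classpath; the /users paths and class names are invented for illustration, and @RestController is simply @Controller plus @ResponseBody on every handler):

```java
import org.springframework.web.bind.annotation.*;

@RestController                      // @Controller + @ResponseBody on every handler
@RequestMapping("/users")            // class-level request mapping
public class UserController {

    // GET /users/42 -> the {id} path segment is bound via @PathVariable
    @RequestMapping(value = "/{id}", method = RequestMethod.GET)
    public String findById(@PathVariable long id) {
        return "user-" + id;         // serialized into the response body
    }

    // GET /users?page=2 -> query parameter bound via @RequestParam
    @RequestMapping(method = RequestMethod.GET)
    public String list(@RequestParam(defaultValue = "1") int page,
                       @CookieValue(value = "token", required = false) String token) {
        return "page " + page + ", token=" + token;
    }
}
```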
Talk about Spring’s IOC (DI) and AOP dynamic proxies
Traditional programming (no IoC): take a simple example: how do we find a girlfriend? The usual way is to look around for pretty, nice girls, then ask about their interests, QQ number, phone number, and so on, find a way to get to know them, give them what they want, and so on. The process is complex and laborious, and we have to design and carry out every step ourselves.
IOC:
How does IoC do it instead? It is a bit like finding a girlfriend through a matchmaking agency, which introduces a third party between me and my future girlfriend. The agency manages the profiles of many men and women; I submit a request describing the girlfriend I want to find: a face like Li Jiaxin, a figure like Lin Xilei, singing like Jay Chou, technique like Zinedine Zidane, speed like Carlos. The agency then, according to my requirements, introduces a suitable girl, and all we have to do is fall in love and get married.
Summary of inversion of control: all classes register themselves in the Spring container, telling Spring what they are and what they need; at runtime Spring proactively hands each object what it needs, and also hands the object itself over to whatever needs it. The creation and destruction of all classes is controlled by Spring: it is no longer the referencing object that controls another object's life cycle, but Spring. A given object used to control its collaborators itself; now all objects are controlled by Spring, which is why this is called inversion of control.
The key to understanding DI is the question "who depends on whom, why, who injects whom, and what is injected":
● Who depends on whom: the application depends on the IoC container, of course;
● Why the dependency: the application needs the IoC container to provide the external resources its objects need;
● Who injects whom: obviously, the IoC container injects into the application's objects the objects they depend on;
● What is injected: the external resources an object needs (objects, resources, constant data).
What is the relationship between IoC and DI?
DI (dependency injection) is essentially another name for IoC: they are two different ways of describing the same concept, one from the container's point of view (control is inverted) and one from the object's point of view (dependencies are injected).
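To make the idea concrete, a tiny Spring sketch reusing the matchmaking analogy (class names are invented; assumes component scanning is enabled):

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import org.springframework.stereotype.Service;

@Component
class MatchmakingAgency {                 // the "third party" holding all the profiles
    String introduce() { return "a girlfriend matching your list of requirements"; }
}

@Service
class Client {
    private final MatchmakingAgency agency;

    @Autowired                            // the container injects the dependency;
    Client(MatchmakingAgency agency) {    // Client never calls `new MatchmakingAgency()`
        this.agency = agency;
    }

    void findGirlfriend() {
        System.out.println(agency.introduce());
    }
}
```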
Various implementations of AOP
AOP is aspect-oriented programming, and it can be implemented at several levels:
- Modify the source code at compile time
- Modify the bytecode at load time, before it is loaded at runtime
- Dynamically create the bytecode of a proxy class at runtime, after the target class is loaded
Comparison of various AOP implementation mechanisms
Here is a comparison of the various implementation mechanisms:
| Category | Mechanism | Principle | Advantages | Disadvantages |
|---|---|---|---|---|
| Static AOP | Static weaving | At compile time, aspects are compiled directly into the target bytecode file | No impact on system performance | Not flexible enough |
| Dynamic AOP | Dynamic proxy | At runtime, after the target class is loaded, a proxy class is dynamically generated for its interfaces and the aspects are woven in | More flexible than static AOP | The classes to be advised must implement an interface; slight performance impact on the system |
| Dynamic bytecode generation | CGLIB | At runtime, after the target class is loaded, bytecode is dynamically constructed for a subclass of the target class and the aspect logic is added to that subclass | Can weave without interfaces | Instance methods declared final cannot be woven |
| Custom class loader | Custom class loader | At runtime, before the target class is loaded, aspect logic is added to the target bytecode | Can weave into most classes | Classes loaded by other class loaders used in the code are not woven |
| Bytecode transformation | Load-time bytecode interception | At runtime, bytecode is intercepted before any class loader loads it | All classes can be woven in | |
Core concepts (first-class citizens) in AOP
- Joinpoint: a point that can be intercepted, such as a business method
- Pointcut: an expression over Joinpoints stating which methods to intercept; one Pointcut matches multiple Joinpoints
- Advice: the logic to weave in
- Before Advice: runs before the method
- After Advice: runs after the method completes
- After Returning Advice: runs after the method returns normally; skipped if it throws
- After Throwing Advice: runs when the method throws an exception
- Around Advice: runs before and after method execution, and can interrupt or skip the original invocation
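To make the advice types concrete, a minimal sketch of a Spring AOP aspect (assumes spring-aop and the AspectJ annotations are on the classpath; the com.example.service package is invented for illustration):

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.*;

@Aspect
public class LoggingAspect {

    // Pointcut: one expression matching many join points (all service methods here)
    @Pointcut("execution(* com.example.service..*(..))")
    public void serviceMethods() {}

    @Before("serviceMethods()")             // runs before the method
    public void before() { System.out.println("before"); }

    @AfterReturning("serviceMethods()")     // runs only on normal return
    public void afterReturning() { System.out.println("returned"); }

    @AfterThrowing("serviceMethods()")      // runs only when an exception is thrown
    public void afterThrowing() { System.out.println("threw"); }

    @Around("serviceMethods()")             // can skip or alter the original call
    public Object around(ProceedingJoinPoint pjp) throws Throwable {
        long start = System.nanoTime();
        try {
            return pjp.proceed();           // invoke the target method
        } finally {
            System.out.println("took " + (System.nanoTime() - start) + " ns");
        }
    }
}
```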
The dynamic proxy used by Spring AOP means that the framework does not modify the bytecode of the target class. Instead, it temporarily generates an AOP proxy object in memory that contains all the methods of the target object, enhances them at the configured pointcuts, and calls back into the original object's methods.
There are two main kinds of dynamic proxy in Spring AOP: JDK dynamic proxies and CGLIB dynamic proxies. A JDK dynamic proxy creates the proxy through reflection and requires the proxied class to implement an interface; its core pieces are the InvocationHandler interface and the Proxy class. If the target class does not implement an interface, Spring AOP uses CGLIB to proxy the target class instead. CGLIB (Code Generation Library) is a code generation library that dynamically generates a subclass of a class at runtime. Note that because CGLIB proxies by inheritance, a class marked final cannot be proxied by CGLIB.
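A self-contained sketch of the JDK mechanism just described (the UserService interface and names are invented for illustration):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

interface UserService {
    void save(String name);
}

class UserServiceImpl implements UserService {
    public void save(String name) { System.out.println("saving " + name); }
}

public class JdkProxyDemo {
    public static void main(String[] args) {
        UserService target = new UserServiceImpl();

        // InvocationHandler receives every call on the proxy and can add aspect logic
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            System.out.println("before " + method.getName());
            Object result = method.invoke(target, methodArgs);  // call back the real object
            System.out.println("after " + method.getName());
            return result;
        };

        // Proxy generates, at runtime, a class implementing UserService
        UserService proxy = (UserService) Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                target.getClass().getInterfaces(),
                handler);

        proxy.save("alice");
    }
}
```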
For transaction management, Spring uses AOP to provide declarative transactions in both annotation and XML form. Most of the time you configure a transaction manager in the Spring configuration file and enable transaction control annotations, then add @Transactional to a business class or business method to get transaction control.
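A hedged sketch of the annotation style (assumes a transaction manager is configured and annotation-driven transactions are enabled; TransferService and its DAO calls are hypothetical):

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TransferService {

    // All statements in this method commit together;
    // an unchecked exception rolls everything back
    @Transactional
    public void transfer(long fromId, long toId, long amount) {
        // debit(fromId, amount);   // hypothetical DAO calls
        // credit(toId, amount);
    }
}
```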
Talk about the MyBatis framework
(1) MyBatis is a Java-based persistence-layer framework that wraps JDBC internally: you no longer spend effort loading drivers, creating connections, and so on, which eliminates a great deal of redundant JDBC code.
(2) MyBatis configures the statements to execute through XML or annotations, generates the final SQL by mapping Java objects to the dynamic parameters of the SQL in the statement, then executes the SQL and maps the results back to Java objects before returning them.
(3) MyBatis supports custom SQL, stored procedures, and advanced mappings. It avoids almost all JDBC code and manual setting of parameters and extraction of result sets; simple XML or annotations configure the mappings between interfaces, Java POJOs, and database records.
(4) Many third-party plugins are available (pagination plugin, reverse engineering).
(5) Good integration with Spring.
(6) MyBatis is quite flexible: SQL is written in XML, completely separated from the program code, removing the coupling between SQL and code, which makes unified management easy and supports writing dynamic SQL statements.
(7) Mapping tags support ORM field mapping between objects and database columns.
(8) Downside: the SQL statements depend on the database, so database portability is poor and the database cannot be replaced at will.
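A small sketch of the annotation-mapped style (XML mappers work the same way; the User POJO and the user table are hypothetical):

```java
import org.apache.ibatis.annotations.Insert;
import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;

// POJO mapped to the hypothetical `user` table; getters/setters omitted for brevity
class User {
    public Long id;
    public String name;
}

// MyBatis generates the implementation of this interface at runtime;
// #{...} placeholders are bound safely instead of concatenating SQL
public interface UserMapper {

    @Select("SELECT id, name FROM user WHERE id = #{id}")
    User findById(@Param("id") long id);

    @Insert("INSERT INTO user (name) VALUES (#{name})")
    int insert(User user);
}
```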
Talk about the characteristics of SpringBoot
Spring Boot exists to simplify the initial setup and development of Spring applications. With convention-based configuration (properties or YML files), you can create a stand-alone Spring application launched from a main method, run it on the Tomcat that Spring Boot embeds (no WAR file to deploy), and simplify the Maven configuration.
Talk about thread creation and the differences between different ways to implement threads
1: inherit the Thread class. 2: implement the Runnable interface. 3: implement the Callable interface.
Inherit the Thread class and override the run method
class A extends Thread {
    public void run() {
        for (int i = 1; i <= 100; i++) {
            System.out.println("-----------------" + i);
        }
    }
}

A a = new A();
a.start();
Implement the Runnable interface and implement the run method inside
class B implements Runnable {
    public void run() {
        for (int i = 1; i <= 100; i++) {
            System.out.println("-----------------" + i);
        }
    }
}

B b = new B();
Thread t = new Thread(b);
t.start();
Implement Callable
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

class A implements Callable<String> {
    public String call() throws Exception {
        // ... do the work here; the result can be any value of the generic type
        return "done";
    }
}

FutureTask<String> ft = new FutureTask<>(new A());
new Thread(ft).start();
String result = ft.get(); // blocks until call() has finished
The thread pool
ExecutorService es = Executors.newFixedThreadPool(10);
es.submit(new Runnable() { public void run() { /* task */ } });
es.submit(new Runnable() { public void run() { /* task */ } });
// ...
es.shutdown();
What’s the difference between implementing Runnable and implementing Callable?
With Callable, the task can return a value; Runnable cannot. Callable can be declared with a generic type parameter; Runnable cannot. Callable's call method can declare checked exceptions; Runnable's run method cannot.
What is the difference between Runnable and Thread?
Implementing the Runnable interface is better suited to sharing one task (and its resources) among multiple threads, and it avoids the limitation of single inheritance.
Java custom class loaders and the parent delegation model
The loader hierarchy: Bootstrap class loader (implemented in C++) → Extension class loader (Java) → Application class loader (Java, AppClassLoader).
How the parent delegation model works: when a class loader receives a class-loading request, it does not first try to load the class itself; instead, it delegates the request to its parent class loader. This is true at every level, so only when a parent reports that it cannot find the class in its search scope (a ClassNotFoundException) does the child loader attempt to load the class itself.
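A minimal custom loader sketch: by overriding only findClass and leaving loadClass alone, the inherited loadClass still performs the parent delegation described above (the disk-directory lookup is illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Loads classes from a custom directory. findClass is only reached after the
// parent loaders have failed, which is exactly the parent-delegation contract.
public class DiskClassLoader extends ClassLoader {
    private final Path root;

    public DiskClassLoader(Path root, ClassLoader parent) {
        super(parent);
        this.root = root;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path classFile = root.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] bytes = Files.readAllBytes(classFile);
            return defineClass(name, bytes, 0, bytes.length); // turn bytes into a Class
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```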
Talk about JVM composition and tuning, memory model, GC, Tomcat tuning
Tomcat tuning:
Increase the JVM heap size; fix JRE memory leaks; tune the thread pool settings; enable compression; tune database performance; use Tomcat native libraries.
JVM tuning:
-Xmx specifies the maximum heap size, which defaults to 1/4 of physical memory. -XX:+PrintGCDetails prints detailed GC logs. These configuration changes take effect after restarting your Tomcat server.
Explain how to implement high availability data and services, load balancing strategies and differences, distribution (and transactions), clustering, high concurrency and problems encountered and solutions
Distributed:
Distributed architecture: the system is divided by module into multiple subsystems, which run on different networked computers and cooperate to complete the business flow; communication between the systems is required. Advantages: 1. Modules are split and communicate through interfaces, reducing coupling between modules. 2. The project is divided into sub-projects, with different teams working on different sub-projects. 3. To add functionality, you only need to add a sub-project that calls the interfaces of the other systems. 4. Flexible distributed deployment. Disadvantages: 1. Interaction between systems requires remote communication, and interface development adds work. 2. Common business logic cannot be shared across modules.
SOA-based architecture
SOA: service-oriented architecture. The project is divided into two kinds of projects: a service layer and a presentation layer. The service layer contains the business logic and only needs to expose services externally. The presentation layer only handles interaction with pages; the business logic is implemented by invoking the services of the service layer.
What are the differences between distributed and SOA architectures?
SOA divides the project into service layer and presentation layer, mainly from the point of view of services. Distributed architecture categorizes applications mainly from the perspective of deployment, according to access pressure; the main goal is to make full use of server resources and avoid uneven resource allocation.
The cluster:
A cluster is a group of loosely coupled servers that together form a virtual server and provide a unified service to client users. Typically the client accesses the cluster without being aware of which server actually serves it. The purpose of clustering is load balancing, fault tolerance, and disaster recovery, to meet the system's availability and scalability requirements. A cluster should offer high availability, scalability, load balancing, failure recovery, and maintainability. Generally, the same project is deployed on multiple servers; common examples are Tomcat clusters, Redis clusters, Zookeeper clusters, and database clusters.
The difference between distributed and cluster:
Distributed means different services are distributed in different places; clustering means several servers are brought together to provide the same service. In a word: distributed systems work in parallel; clusters work in series.
Each node in a distributed system can itself be a cluster, but a cluster is not necessarily distributed. For example, since many people visit Sina.com, it can set up a cluster: a dispatching server in front and several servers behind it that all perform the same business. When requests arrive, the dispatcher checks which server is lightly loaded and assigns the work to it. A distributed system, in a narrow sense, is organized more loosely than a cluster: in a cluster, if one server goes down, other servers can take over; in a distributed system, each node performs a different business, so if a node crashes, that business becomes inaccessible.
Distribution improves efficiency by shortening the execution time of a single task, while clustering improves efficiency by increasing the number of tasks executed per unit time. For example, if a task consists of 10 subtasks and each subtask takes one hour, executing the task on one server takes 10 hours. A distributed scheme provides 10 servers, each responsible for one subtask (ignoring dependencies between subtasks), and the task completes in one hour. (A typical representative of this working mode is Hadoop's Map/Reduce distributed computing model.) A cluster scheme also provides 10 servers, each of which can handle the whole task independently: if 10 tasks arrive at the same time, the 10 servers work simultaneously and an hour later all 10 tasks are done; but from the point of view of a single task, it still took one hour.
High concurrency:
What are some common ways to handle high concurrency? 1) Data layer
Database clustering and sharding (hash-based table splitting, splitting by table and by database); indexes; cache tables; table design optimization; SQL statement optimization; cache servers (improve query efficiency, reduce database pressure); search servers (improve query efficiency, reduce database pressure); a separate image server.
2) Project layer
A service-oriented distributed architecture (sharing the server pressure and improving concurrency); static pages for heavily accessed parts of the system (static HTML, e.g. generated with FreeMarker); page caching; ActiveMQ to further decouple the business and improve processing capacity; a distributed file system to store massive numbers of files.
3) Application layer
An Nginx server for load balancing; LVS for lower-level (transport-layer) load balancing; mirroring.
High availability:
Objective: to ensure that when server hardware fails, the service is still available and data is still saved and accessible. Highly available services: (1) Hierarchical management: core applications and services get higher priority; for example, charging users on time matters more than whether goods can be reviewed. (2) Timeout settings: set a timeout on service calls; once it expires, the communication framework throws an exception and the application, according to its service scheduling policy, either retries or routes the request to another server. (3) Asynchronous invocation: use asynchronous message queues so that one failing service does not fail the whole application request. Not all services can be invoked asynchronously: for calls such as retrieving user information, going async costs more than it saves, and asynchronous invocation also does not suit flows that must confirm the call succeeded before proceeding. (4) Service degradation: to keep core applications running normally during peak site traffic, degrade service in one of two ways: refuse service (reject calls from lower-priority applications to reduce concurrent service calls and protect core applications), or close functions (shut down some unimportant services, or unimportant functions within a service, to save system overhead and free resources for core application services). (5) Idempotent design: guarantee that repeated invocations of a service produce the same result as a single invocation.
High availability of data. There are two main means of ensuring it: data backup and failover. (1) Data backup: divided into cold backup and hot backup. Cold backup replicates periodically and cannot guarantee data availability. Hot backup is divided into asynchronous hot backup (data copies are written asynchronously) and synchronous hot backup (data copies are written simultaneously). (2) Failover: if any server in the data server cluster goes down, all read and write operations on it are rerouted to other servers so that data access never fails.
(1) Collecting monitoring data. (a) User behavior logs: collect logs from the server and the client; many websites are gradually building log statistics and analysis tools on the real-time computing framework Storm. (b) Server performance monitoring: collect server performance indicators such as system load, memory occupancy, and disk I/O, and judge in time to nip problems in the bud. (c) Runtime data reports: collect, report, and aggregate for unified display; the application needs to include the collection logic in its code. (2) Monitoring management. (a) System alarms: configure alarm thresholds and the contact information of on-duty personnel, so that even an engineer a thousand miles away can be notified in time. (b) Failover: when the monitoring system finds a fault, it proactively notifies the application to fail over. (c) Automatic graceful degradation: to cope with peak site traffic, proactively shut down some functions and release system resources to keep core application services running: the ideal state of an elastic website architecture.
Load balancing:
What is load balancing? When the performance of a single server reaches its limit, we can use a server cluster to improve the overall performance of the site. One server in the cluster then acts as the scheduler: all user requests are received by it first, and it assigns each request to some back-end server for processing according to each server's load. (1) HTTP redirection load balancing. Principle: when a user sends a request to the server, the request is first intercepted by the cluster scheduler. The scheduler selects a server according to some allocation policy, encapsulates the IP address of the selected server in the Location field of the HTTP response header, sets the response status code to 302, and returns the response to the browser. When the browser receives the response, it parses the Location field and sends a request to that URL; the chosen server processes the user's request and returns the result to the user.
Advantages: relatively simple. Disadvantages: the scheduling server only takes effect the first time the client requests the site. Once the scheduler returns the response, all subsequent client operations use the new URL (that is, the back-end server) and the browser no longer talks to the scheduling server. The browser also needs two requests per visit, resulting in poor performance. In addition, when scheduling, the scheduler does not know how much load the current user will put on the server; it just spreads the number of requests evenly across the servers, and the browser then interacts directly with the back-end server.
(2) DNS domain name resolution load balancing. Principle: for users' convenience, we access websites by domain name, which must first be resolved into an IP address; this is done by a DNS server. The request we submit is not sent directly to the site we want to visit, but first to the domain name server, which resolves the domain into an IP address and returns it to us; only then do we send a request to that IP. One domain name can point to multiple IP addresses, and during resolution the DNS returns one of them to each user, implementing load balancing across the server cluster.
Scheduling policies: DNS providers generally offer several policies for us to choose from, such as random allocation, round-robin polling, and assigning the server geographically nearest to the requester. Random allocation policy: when a request arrives, the scheduler randomly decides which back-end server it goes to. Polling policy (RR): the scheduling server maintains a record of the last assigned back-end server, so when a new request comes in, the scheduler allocates it to the next server in order.
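A toy Java sketch of the polling (RR) policy described above (the server addresses are invented; real schedulers also track health and weights):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin: remember the last choice and hand each new
// request to the next server in order.
public class RoundRobin {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobin(List<String> servers) { this.servers = servers; }

    public String pick() {
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobin rr = new RoundRobin(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        for (int i = 0; i < 5; i++) System.out.println(rr.pick());
    }
}
```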
Advantages: simple configuration; load balancing work is handed to DNS, which spares us the trouble of managing it. Disadvantages: cluster scheduling is left to the DNS server, so we cannot control the scheduler as we wish, cannot customize the scheduling policy, and DNS cannot see the load on each server; it just distributes all requests evenly to the back-end servers. When a back-end server fails, even if we immediately remove it from the domain resolution, the DNS caches keep the IP around for a while, so some users cannot reach the site properly. However, dynamic DNS lets us programmatically change the resolution records in the DNS server, so when our monitor detects that a server is down, it can immediately notify DNS to remove it.
(3) Reverse proxy load balancing. Principle: the reverse proxy server sits in front of the actual servers; all requests to our website first go through it, and it either returns a result to the user directly or hands the request to a back-end server for processing and then returns the result to the user. The reverse proxy acts as the server cluster's scheduler, forwarding each request to an appropriate server based on the current back-end load and returning the processing result to the user.
Advantages: 1. Simple deployment. 2. Hides the back-end servers: compared with HTTP redirection, a reverse proxy conceals the back-end, and browsers never interact with back-end servers directly. 3. Failover: a reverse proxy can remove failed nodes faster than DNS load balancing; when the monitoring program discovers a failed back-end server, it can notify the reverse proxy in time and have it deleted immediately. 4. Sensible task allocation: HTTP redirection and DNS load balancing do not balance load in the true sense, since the scheduler cannot allocate by the actual load of back-end servers; a reverse proxy server supports manually setting a weight on each back-end server, and different weights give different probabilities of being selected, which we can set according to each server's configuration. Disadvantages: 1. Pressure on the scheduler: all requests are processed by the reverse proxy first, and when the number of requests exceeds the scheduler's maximum load, its throughput drops, directly reducing the cluster's overall performance. 2. Limited scaling: when the back-end servers cannot handle the throughput, you can add more of them, but not indefinitely, because the scheduler's maximum throughput is the ceiling. 3. Sticky sessions: if a back-end server stores a user's session or cache, there is no guarantee the user's next request lands on the same server, and if another server handles it, the previous session or cache is lost. Solution 1: change the reverse proxy's allocation policy to key on the user's IP address, so the same IP is always handled by the same back-end server, avoiding the sticky-session problem. Solution 2: record the handling server's ID in a cookie; when the request is submitted again, the scheduler routes it to the server marked in the cookie.
(4) IP load balancing. 1. Load balancing through NAT: response packets are usually large, and if NAT has to be performed on every packet, the scheduler becomes the bottleneck under heavy traffic. 2. Load balancing through direct routing. 3. VS/TUN virtual server. Advantage: IP load balancing distributes data inside the kernel, so its processing performance is better than reverse-proxy balancing. Weakness: the scheduler's network bandwidth becomes the system bottleneck. Scenario: an application that serves around 500 Mbit/s off-peak can generally exceed 1 Gbit/s at the evening peak; mainstream server NICs are gigabit, so traffic above 1 Gbit/s causes obvious packet loss, and the business cannot be stopped to swap the NIC.
(5) Data link layer load balancing. On Linux, the data-link-layer solution is to bond multiple NICs into one logical NIC so they provide service jointly. This prevents the load balancer's NIC bandwidth from becoming the bottleneck and is currently the most widely used load-balancing method for large websites. Linux bonding has seven modes, mode=0 to 6: balanced round-robin, active-backup, balance (XOR), broadcast, dynamic link aggregation, adaptive transmit load balancing, and adaptive load balancing.
Describe how you optimized the database (SQL, table design) and the limitations of the use of indexes (index failure)
A. Choose the most applicable fields: when creating a table, set column widths as small as possible for better performance; another way to improve efficiency is to declare fields NOT NULL whenever possible.
B. Use joins instead of sub-queries.
C. Use unions instead of manually created temporary tables.
D. Transactions: a) either every statement in the block succeeds or all statements fail, which maintains the consistency and integrity of the data in the database; a transaction starts with the BEGIN keyword and ends with COMMIT, and if any SQL operation fails, the ROLLBACK command restores the database to the state it was in before BEGIN; b) when multiple users use the same data source at the same time, locking the database gives users a safe access mode, ensuring one user's operations are not interfered with by others.
E. Reduce table joins; add redundant fields.
F. Use foreign keys: locking tables can maintain data integrity, but it cannot guarantee the relationships between data; that is when foreign keys help.
G. Use indexes. H. Optimize query statements. I. Clustering. J. Read-write splitting. K. Master-slave replication. L. Table splitting. M. Database splitting. N. Stored procedures where appropriate.
Limitations (when indexes fail): prefer full-value index matches and obey the leftmost-prefix rule: a query must start from the leftmost column of the index and must not skip columns; any computation on an indexed column invalidates the index; everything to the right of a range condition loses the index; watch the impact of !=, NULL comparisons, and OR on index use; LIKE patterns that start with the wildcard % invalidate the index; unquoted strings (implicit type conversion) invalidate the index.
Talk about the Redis cache: its data types, how it differs from other caches, persistence, cache penetration and avalanche, and their solutions
Redis is an in-memory data-structure store, a key-value non-relational database that can also persist to disk. Compared with relational databases (whose data lives mainly on the hard disk) it has high performance, so we generally use Redis as a cache. Redis can also serve as a registry, database, cache, and message middleware, because it supports a variety of data types that make many problems easier to solve. A Redis value supports five data types: string, hash, list, set, and zset (sorted set).
String: one key corresponds to one value. Hash: the key is a string and the value is a set of key-value pairs (a map), suitable for storing objects. List: a list of strings in insertion order (a doubly linked list); the main commands are LPUSH and RPUSH, and it supports lookup and traversal from both ends. Set: an unordered sequence of strings backed by a hash table; members are unique with no duplicate data (the underlying implementation is essentially a HashMap whose values are all null). Zset: basically the same as a set, except each element is associated with a double score so that members can be sorted, and inserts are ordered.
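A short sketch of the five value types using the Jedis client (assumes Jedis is on the classpath and Redis runs on localhost:6379; the key names are invented):

```java
import redis.clients.jedis.Jedis;

public class RedisTypesDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set("user:1:name", "alice");           // string: one key, one value
            jedis.hset("user:1", "age", "30");           // hash: good for objects
            jedis.lpush("queue", "job1", "job2");        // list: insertion-ordered
            jedis.sadd("tags", "java", "redis", "java"); // set: duplicates ignored
            jedis.zadd("rank", 99.5, "alice");           // zset: sorted by score
            System.out.println(jedis.zrangeWithScores("rank", 0, -1));
        }
    }
}
```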
Differences between Memcache and Redis:
Data types: Redis supports not only simple K/V data but also the storage of list, set, zset, hash, and other data structures; Memcache supports only simple K/V data, where both keys and values are strings. Reliability: Memcache does not support data persistence; Redis supports persistence and data recovery, allowing it to survive a single point of failure, at some cost in performance. Performance: for storing large data, Memcache's performance is higher than Redis's.
Application Scenarios:
Memcache: suits read-heavy, write-light workloads with large data volumes (official-site articles and the like). Redis: suits systems requiring high read-write efficiency, complex data processing, or higher security. Case: in a distributed system there is the problem of sharing sessions, so for single sign-on we used Redis to simulate session sharing, storing user information so that different systems share one session.
Redis persists in two ways:
RDB (semi-persistent mode): according to the configuration, the data in memory is persisted as a snapshot to a dump.rdb file (a binary file) on disk; this is Redis's default persistence mode, set in the configuration file (redis.conf). Advantages: the whole data set lives in a single file, and transferring a single file to other storage media is practical for backup and disaster recovery. Disadvantages: if the system goes down before a snapshot completes, the data not yet persisted is lost.
AOF (full persistence): every write is appended via write() to an appendonly.aof file. Redis does not enable it by default; you need to change appendonly no to appendonly yes in the configuration file (redis.conf).
Advantages: high data security; the log file is written in append mode, so even if the machine goes down mid-write, the content already in the log file is not damaged. Disadvantages: for the same data set, AOF files are usually larger than RDB files, so RDB can recover a large data set faster than AOF.
appendfsync always   # fsync to the AOF file on every data change; very slow, but safe
appendfsync everysec # fsync once per second; fast, but up to one second of data may be lost
appendfsync no       # never fsync, let the OS flush; fastest, least safe
The differences between the two persistence modes: AOF runs more slowly than RDB; the per-second sync policy is a good compromise, and with sync disabled AOF is about as efficient as RDB. If cached data needs to be safer, persist it with AOF (such as the shopping cart in our project); if you want efficiency on large data sets, use the default RDB. The two persistence methods can also be used together.
Redis-cluster adopts a center-less structure: each node stores data and the state of the whole cluster, and each node is connected to all other nodes. We use redis-cluster; the cluster was built by the company's operations team, so I don't know much about how to build it.
In our project, the Redis cluster has 6 nodes: 3 master servers (to guarantee Redis's voting mechanism) and 3 slave servers (for high availability); each master has one slave as a backup machine. All nodes are connected to each other through the PING-PONG mechanism. A client connects to the Redis cluster by connecting to any one node in it. Redis-cluster has 16384 built-in hash slots and maps all physical nodes to slots 0-16383 for maintenance.
Redis has transactions. A transaction in Redis is a set of commands that either all execute or none execute, ensuring the commands in a transaction run sequentially without other commands being interleaved. Redis transactions do not support rollback. They are implemented with the MULTI (transaction begin) and EXEC (transaction end) commands.
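A hedged sketch of MULTI/EXEC through Jedis (the key names are invented):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class RedisTxDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Transaction tx = jedis.multi(); // MULTI: start queuing commands
            tx.decr("stock:available");     // queued, not yet executed
            tx.incr("stock:reserved");      // queued, not yet executed
            tx.exec();                      // EXEC: run the batch with no interleaving
        }
    }
}
```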
Cache penetration: a cache query usually uses a key to look up a value; if the value is missing, the database must be searched. If the value for this key does not exist in the database either, and there are many concurrent requests for this key, they all fall through to the database and put it under heavy pressure. This is called cache penetration.
Solution: 1. Store all possible parameters in hashed form and verify them at the control layer; discard anything that does not match. 2. Hash all possible data into a sufficiently large bitmap (a Bloom filter); nonexistent data is intercepted by the bitmap, avoiding query pressure on the underlying storage. 3. If a query returns null (whether the data does not exist or the system failed), still cache the null result, but with a short expiration time, no more than five minutes.
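A sketch of option 3, caching empty results with a short TTL (the in-memory Map stands in for the real database; names and TTLs are invented):

```java
import java.util.Map;
import redis.clients.jedis.Jedis;

public class NullCaching {
    private final Jedis cache = new Jedis("localhost", 6379);
    private final Map<String, String> db = Map.of("k1", "v1"); // stand-in database

    public String query(String key) {
        String cached = cache.get(key);
        if (cached != null) return cached.isEmpty() ? null : cached;

        String value = db.get(key);
        if (value == null) {
            cache.setex(key, 300, "");   // cache the miss for at most 5 minutes
            return null;
        }
        cache.setex(key, 3600, value);   // real values get a longer TTL
        return value;
    }
}
```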
Cache avalanche: when the cache server is restarted, or a large number of cached entries expire within one period, a large number of cache misses occur at once; at that moment the pressure of accessing the database is enormous, all queries fall on the database, and a cache avalanche results. There is no perfect solution, but you can analyze user behavior and try to spread expiration points evenly. Most system designers consider locking or queuing to ensure single-threaded (single-process) cache writes, keeping large numbers of concurrent requests from falling onto the underlying storage system.
Solution: 1. After a cache miss, control the number of threads that read the database and write the cache by locking or queuing: for example, allow only one thread per key to query the data and write the cache while other threads wait. 2. Use a reload mechanism to update the cache in advance, manually triggering cache loading before a large wave of concurrent access. 3. Set different expiration times for different keys so that cache expirations are spread as evenly as possible. 4. Use a second-level (double) cache: A1 is the original cache and A2 the copy; if A1 misses, access A2; give A1 a short expiration time and A2 a long one.
Redis security mechanism (how do you think about redis security?)
Vulnerability introduction: by default, Redis binds to 0.0.0.0:6379, which exposes the Redis service to the public network. If authentication is not enabled, any user who can reach the target server can access Redis and read its data without authorization. Using Redis's own commands under such unauthorized access, an attacker can write their SSH public key onto the Redis server and then log in to the target host directly with the corresponding private key.
Solution: 1. Disable high-risk commands: modify the redis.conf file to forbid remote modification of the DB file location. 2. Run the Redis service with low privileges: create a dedicated user and home directory for it and disable its login. 3. Add password authentication: edit redis.conf and add requirepass mypassword. 4. Forbid Internet access: edit redis.conf and add or change bind to 127.0.0.1 so that Redis is reachable only from the current host. 5. Monitor logs for security events to detect attacks in time.
Redis sentinel mechanism (since Redis 2.6). Monitoring: checks whether the primary and secondary databases are running properly. Alerting: when a monitored Redis instance has a problem, the sentinel can notify administrators or other applications via an API. Automatic failover: when the primary database fails, a secondary database is automatically promoted to primary. If the master server has a password, remember to configure the access password in the sentinel configuration file (sentinel.conf).
You can use the EXPIRE command in Redis to set a time-to-live on a key; when it expires, the key is deleted automatically. Application scenarios: time-limited promotional information; data that must refresh periodically, such as leaderboards; the lifetime of mobile verification codes; limiting how frequently visitors can access the site.
Talk about the differences between ActiveMQ and other messaging middleware
How ActiveMQ works
How it works: producers produce messages and send them to ActiveMQ. ActiveMQ receives a message, checks how many consumers there are, and forwards the message to the consumers without the producer's involvement. Once the consumer receives the message, it no longer has anything to do with the producer.
Comparison with RabbitMQ
RabbitMQ uses AMQP and ActiveMQ uses JMS. As the name implies, JMS is a transport protocol for the Java world, with a JVM required at both ends of the queue; so if the development environment is Java, ActiveMQ is recommended, and some Java objects can be transmitted directly, such as Map, Blob (binary big data), and Stream. AMQP is more general-purpose and is often used in non-Java environments, where the transferred content is a standard string. RabbitMQ can also be tricky to install, while ActiveMQ can be used straight after unpacking, without any installation.
Comparison with Kafka
Kafka outperforms ActiveMQ and other traditional MQ tools and has good cluster scalability. The disadvantages are: (1) message duplication may occur during transmission; (2) the order of sending is not guaranteed; (3) some of the traditional MQ functions are missing, such as the transaction function of messages. So Kafka is often used to handle big data logs.
Comparison with Redis
In fact, Redis itself can implement message-queue functionality using a List, but the features are few and performance drops sharply as the queue grows. It can be used in scenarios where the data volume is small and the business is simple.
How do we solve the problem of message duplication? Message duplication means a consumer receives the same message more than once. Generally speaking, we handle this problem by keeping hold of the following points.
In general, we can add a table on the business side that records whether a message has been processed successfully. After each business transaction commits, we tell the broker the message has been handled; then even if the same message is sent again, it will not be processed a second time.
The general process is as follows: the business-side table records the IDs of messages that have already been processed. Each time a message comes in, we check whether it has already been executed; if it has, it is discarded.
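A minimal sketch of that dedup check (an in-memory set stands in for the database table of processed message IDs; in production the ID insert would share the business transaction):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentConsumer {
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    public void onMessage(String messageId, String payload) {
        if (!processed.add(messageId)) { // add() is false if the ID was seen before
            return;                      // duplicate delivery: discard it
        }
        handle(payload);
    }

    private void handle(String payload) {
        System.out.println("processing " + payload);
    }
}
```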
Talk about the asynchronous communication problem solution for distributed transactions
Problem description: after a message is sent, the sender does not wait for the receiver, whatever the result; the sender only learns the outcome when the receiver pushes back a receipt message. But there is also the possibility that the message is lost at sea after being sent, so a mechanism is needed to cover this uncertainty.
Solution: you write to a lot of pen pals, but sometimes you don't get a reply. There are two strategies for this occasional situation. One is to set an alarm when you send a letter, checking one day later whether the recipient has received it. The other is to set a time each night to check all the letters that were sent but have had no reply for a day, and then call each recipient and ask.
The first strategy is implemented as deferred queuing, and the second strategy is scheduled polling scanning.
The difference between the two is that a delay queue is more precise, but if the period is long, the task sits in the delay queue for a very long time and bloats the queue: think of reminders scheduled days ahead, or birthday reminders. For such long-period events that do not need minute-or-second precision, periodic scanning is the better fit, and the heavy large-scale scans can be scheduled to run at night.
How to solve single sign-on access and distributed-session cross-domain problems
Single sign-on (SSO) works between systems that trust each other: after one module is logged in to, the other modules are authenticated without a repeated login. We adopted the CAS single sign-on framework. CAS has two parts, client and server; the server is a web project deployed in Tomcat that performs user authentication. Each time you access a system module, you need to obtain a ticket from CAS, and once verification passes, access continues. From the CAS server's perspective, the application module we access is a CAS client.
What is cross-domain?
When an asynchronous request is made, if the protocol, IP address, or port number of the requested address differs from that of the current site, cross-domain access is involved. When do cross-domain issues arise? Only with front-end asynchronous requests. Solutions: 1. the JSONP implementation provided by jQuery; 2. the W3C standard CORS (cross-origin resource sharing).
With CAS, a filter is configured in web.xml that intercepts login requests and forwards them to CAS. The working principle: after logging in at CAS, the browser is given a ticket; when you then access other projects, the browser's ticket is forwarded along to CAS, and CAS determines from the ticket whether you are logged in.
Linux commands awk, cat, sort, cut, grep, uniq, wc, top, find, sed, etc
awk: unlike sed, which usually operates on a whole line, awk tends to split a line into several "fields" to work on, so awk is well suited to small-scale data processing. cat: view file contents, create files, merge files, append file contents, and so on. cut: extract columns of text from a text file or stream. grep: a powerful text search tool that searches text with regular expressions and prints the matching lines. sort: sorts lines of text. uniq: filters out adjacent duplicate lines. wc: counts lines, words, and bytes. top: monitors Linux system health, such as CPU and memory usage. find: searches the file directory hierarchy. sed: a stream editor that processes content one line at a time.
What is a deadlock and how to solve it; table-level vs row-level locks; pessimistic vs optimistic locks; thread synchronization locks
Deadlock: for example, you go to an interview and the interviewer says, "Tell me what a deadlock is and I'll let you into the company." You reply, "Let me into the company and I'll tell you what a deadlock is."
Mutual exclusion: a resource cannot be shared and can only be used by one process at a time. Hold and wait: a process has acquired some resources, but holds on to them while blocking on requests for other resources. No preemption: some system resources cannot be preempted; once a process has obtained them, the system cannot reclaim them and can only wait for the process to release them when finished. Circular wait: several processes form a circular chain, each holding a resource that the next one in the chain is requesting.
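A classic Java demonstration: two threads take the same two locks in opposite order, which satisfies all four conditions at once:

```java
public class DeadlockDemo {
    private static final Object A = new Object();
    private static final Object B = new Object();

    public static void main(String[] args) {
        new Thread(() -> {
            synchronized (A) {
                pause();                      // give the other thread time to grab B
                synchronized (B) { System.out.println("t1 got both"); }
            }
        }).start();
        new Thread(() -> {
            synchronized (B) {
                pause();
                synchronized (A) { System.out.println("t2 got both"); }
            }
        }).start();
        // Each thread now holds one lock and waits forever for the other.
    }

    private static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException ignored) {}
    }
}
```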
(1) Deadlock prevention: breaking any one of the necessary conditions prevents deadlock. For example, requiring users to apply for all resources at once breaks the hold-and-wait condition; layering resources, so that upper-layer resources are obtained before the next layer can be requested, breaks the circular-wait condition. Prevention often reduces system efficiency. (2) Deadlock avoidance: the process determines, each time it requests a resource, whether the operation is safe, for example using the banker's algorithm. Running the avoidance algorithm adds overhead to the system. (3) Deadlock detection: prevention and avoidance are measures taken in advance; detection judges whether the system is currently in a deadlocked state and, if so, triggers the release strategy. (4) Deadlock release: used in combination with detection, typically by preemption: forcibly reclaiming resources held by one process and allocating them to others.
Table-level lock: low overhead, fast to lock; no deadlocks (because MyISAM acquires all the locks a SQL statement needs at once); large locking granularity, so the probability of lock conflict is highest and concurrency lowest. Row-level lock: higher overhead, slower to lock; deadlocks can occur; the smallest locking granularity, the lowest probability of lock conflict, and the highest concurrency.
Pessimistic locking: always assume the worst; every time you fetch the data you expect someone else to change it, so you lock it on every fetch, and anyone else trying to take the data blocks until they get the lock. Traditional relational databases use many such locking mechanisms: row locks, table locks, read locks, write locks, all taken before the operation. Java's synchronized keyword is another example of the same idea. In SQL this is done with SELECT ... FOR UPDATE.
Optimistic locking: as the name implies, be optimistic; every time you fetch the data, assume others will not modify it, so do not lock. Only when updating do you check whether anyone else updated the data in the meantime, using mechanisms such as a version number. Optimistic locks suit read-heavy applications and improve throughput. Some databases provide optimistic-lock-like facilities such as write_condition. In Java, the atomic variable classes in the java.util.concurrent.atomic package are implemented with CAS, one realization of optimistic locking. In a database, this is typically done with a version field.
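A small sketch of the CAS retry loop that the atomic classes use internally, made explicit with AtomicInteger:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticCounter {
    private final AtomicInteger value = new AtomicInteger();

    // compareAndSet retries instead of blocking: read the current value,
    // attempt the update, and loop if another thread got there first.
    public int increment() {
        int current;
        do {
            current = value.get();                            // read the "version"
        } while (!value.compareAndSet(current, current + 1)); // retry on conflict
        return current + 1;
    }
}
```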
Synchronization lock. Scenario: during development, time-consuming operations need to run in child threads to prevent stuttering. If two threads execute two tasks at the same time: parsing files simultaneously and then, once the data is obtained, inserting into the database simultaneously across several tables, it is easy to hit interleaving bugs on insert.
synchronized: declares the method as synchronized. If the method is executing and another thread calls it, the caller is put into the wait state; when the method completes, the lock is released. wait() releases the held object lock, and the thread enters the wait pool.
Differences: synchronized is implemented at the JVM level, so the runtime can monitor whether the lock is released; ReentrantLock is implemented in code, the system cannot release the lock automatically, and you must release it explicitly in a finally clause.
synchronized is a good choice when concurrency is low; when concurrency is high and synchronized's performance degrades significantly, ReentrantLock is the better choice.
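A sketch of the explicit lock-and-release pattern the comparison above refers to:

```java
import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count;

    public void increment() {
        lock.lock();       // unlike synchronized, acquisition is explicit...
        try {
            count++;
        } finally {
            lock.unlock(); // ...and release MUST happen in finally
        }
    }
}
```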
Tell me about how to speed up access and how to tune program performance
Speeding up access: on the hardware side, increase network bandwidth and server memory; on the code side: static pages, caching, SQL optimization, index creation, and similar schemes.
System performance is two things: Throughput and Latency. Throughput is the number of requests or tasks that can be processed per second. Latency is the system's delay in processing one request or task. The better (lower) the Latency, the higher the Throughput that can be supported, because shorter latency means faster processing and therefore more requests handled.
Improving throughput: distributed clustering, decoupling modules, design patterns. Improving system latency: asynchronous communication.
Talk about cache design and optimization, cache and database consistency synchronization solutions
1. Reduce back-end load: for expensive SQL (joined result sets, grouped statistics), cache the results. 2. Speed up request response. 3. Merge massive writes into batch writes: for example, counters accumulate in Redis first and are then written to the DB in batches. 4. Expiration (expire). 5. Proactive updates: the application controls the lifecycle (eventual consistency, short intervals). 6. Cache empty objects. 7. Bloom-filter interception. 8. Efficiency of the commands themselves: e.g. SQL optimization, command optimization. 9. Network round trips: reduce the number of communications. 10. Reduce access cost: long connections/connection pools, NIO, and so on. 11. Merge I/O accesses. Goal: reduce the number of cache rebuilds, keep data as consistent as possible, and reduce potential risks. Strategy 1, mutex with set(nx, ex): if set(nx and ex) returns true, no other thread is rebuilding the cache and the current thread executes the rebuild logic; if it returns false, another thread is already rebuilding, so the current thread sleeps for a specified time (for example 50 milliseconds, depending on the rebuild speed) and then re-executes the function until it gets the data.
2. Never expire (logically): hot keys see exceptionally high concurrency and take a long time to rebuild. If you set a hard expiration time, then at the moment it expires, the flood of traffic crushes the database. So add a logical expiration time field to the hot key's value. On each concurrent access, compare the logical expiration field with the current time: if it has passed, the cache needs updating, but since the key has no physical TTL, all threads are still allowed to read the old cached value, while a single separate thread rebuilds the cache. Once the rebuild succeeds, that is, once the Redis SET completes, all threads see the new content in the rebuilt cache.
From the cache's point of view there is no expiration time, so there is no hot-key-expiry problem: "physically" it never expires. At the functional level, each value carries a logical expiration time, and when it is exceeded, a separate thread is used to rebuild the cache.
Cache/database consistency: when updating data, delete the cache first; if the cache deletion fails, do not update the database. For stricter consistency, serialize operations on the same key through a queue: when a read misses the cache, enqueue a cache-rebuild operation, and do not enqueue a duplicate if a request for the same key is already in the queue; the reading thread then polls the cache in a while(true) loop for about 200 ms and, if still empty, re-sends to the queue, synchronously waiting for the cache update to complete.
Talk about message queues, how messages get consumed repeatedly, and what to do if the consumer can't receive the message
What is a message queue? A container that holds messages during transmission.
What problems have message queues solved? Asynchronous, parallel, decoupled, queuing
Message patterns? Publish-subscribe and point-to-point.
1. Repeat consumption: a Queue can have multiple consumers, but each message is consumed by only one of them.
2. Message loss: (a) use persistent messages; (b) process non-persistent messages promptly and do not let them accumulate; (c) once a transaction is started, commit() takes care of waiting for the server's acknowledgement, so the connection is not closed while messages could still be lost.
3. Message redelivery (the message is delivered to the client again): (a) a transacted session is used and rollback() is called; (b) a transacted session is closed before commit() is called; (c) the session uses CLIENT_ACKNOWLEDGE mode and session.recover() is called; (d) the client connection times out (perhaps the code being executed takes longer than the configured timeout).
4. Not consumed: look in ActiveMQ.DLQ. What is ActiveMQ.DLQ? (a) Once a message exceeds its maximum redelivery count, a poison ack is sent back to the broker to let it know the message is considered a poison pill; the broker then takes the message and sends it to a dead-letter queue for later analysis. (b) In ActiveMQ the dead-letter queue is called ActiveMQ.DLQ; all undeliverable messages are sent to this queue, which is difficult to manage. (c) Therefore, you can set an individual dead-letter strategy in the destination policy map of the activemq.xml configuration file, which lets you specify a specific dead-letter queue prefix for queues or topics.
An MQ consumer can fail to accept a message in two situations. 1. Processing failure means a RuntimeException is thrown in a MessageListener's onMessage method. 2. The message header has two related fields: redelivered (default false) and redeliveryCounter (default 0). 3. The message is sent from the broker to the consumer, and the consumer calls the listener; on failure, the broker-side redeliveryCounter is incremented and delivery is retried after a small delay, 1s by default. After six failures, the consumer gives the broker a specific reply and the broker sends the message straight to the DLQ. 4. If it fails twice and the consumer restarts, the broker pushes the message again with redeliveryCounter=2, so only four local retries remain before it enters the DLQ. 5. That specific reply makes the broker set the message's redelivered to true and increment redeliveryCounter in memory, but neither field is persisted, that is, the message record in the store is not modified; so both fields reset to their defaults when the broker restarts.
Talk about the difference between SOA and distributed architecture, and what to do if the ZooKeeper or ActiveMQ service goes down
What is the difference between SOA and distributed architecture? SOA divides the project into a service layer and a presentation layer: the service layer contains the business logic and only needs to provide services externally, while the presentation layer only handles interaction with pages, implementing business logic by invoking the service layer's services. Distributed architecture categorizes applications mainly from the perspective of deployment, according to access pressure; the main goal is to make full use of server resources and avoid uneven resource allocation.
What if ActiveMQ's service goes down? 1. In general, non-persistent messages are stored in memory and persistent messages in files, with their maximum limits configured in the corresponding node of the configuration file (activemq.xml). When non-persistent messages accumulate to a certain extent and memory runs short, ActiveMQ writes the in-memory non-persistent messages to temporary files to free memory. Although they are saved to a file, the difference from persistent messages is that persistent messages are recovered from their files after a restart, whereas the non-persistent temporary files are simply deleted. 2. For high availability, build an ActiveMQ cluster.
What if the ZooKeeper service goes down? The registry is a peer cluster: if any instance crashes, clients automatically switch to another. Even if the entire registry crashes, service providers and consumers can still communicate through their local caches. A stateless service provider going down does not affect use; if all providers go down, service consumers cannot use the service and will reconnect indefinitely, waiting for a provider to recover.
Talk about JUC’s helper classes
ReentrantReadWriteLock (a read-write lock), CountDownLatch (lets threads wait until a count reaches zero), CyclicBarrier (a reusable barrier where threads wait for each other), Semaphore (permits that limit concurrent access).
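For example, a minimal CountDownLatch sketch (the worker count is invented):

```java
import java.util.concurrent.CountDownLatch;

public class LatchDemo {
    public static void main(String[] args) throws InterruptedException {
        int workers = 3;
        CountDownLatch latch = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            final int id = i;
            new Thread(() -> {
                System.out.println("worker " + id + " done");
                latch.countDown();      // decrement the latch
            }).start();
        }
        latch.await();                  // blocks until the count reaches zero
        System.out.println("all workers finished");
    }
}
```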