Reason: without building your own Java knowledge architecture, you end up memorizing interview answers by rote instead of committing real understanding to memory. For each topic, ask why.
This article draws on Yang Xiaofeng's "Java Core Technology: 36 Lectures". It will be refined over time into my own understanding.
01. Why use Java?
1. Two key features: cross-platform execution and the garbage collector
-
Why cross-platform?
Because of the JVM. The Java virtual machine is, in effect, a translator: a JVM implementation exists for every supported operating system, and it exposes a unified interface, the Java API, upward. Programmers therefore program against the JVM, telling it what they want done, and the JVM tells the underlying operating system what to do. As long as you program for the JVM, the same program can run on all platforms. The Java language itself is platform-agnostic, which is why Java can be cross-platform.
JDK: the development kit for programs written in the Java language. The JDK includes the JRE, as well as the Java source compiler javac, the monitoring tool JConsole, the profiling tool JVisualVM, and so on.
JRE: the runtime environment for Java programs, including the Java virtual machine (JVM) and the basic Java class libraries.
-
What is the garbage collector?
It frees the memory allocated to objects that the program no longer uses, hence the name "garbage". The GC manages the heap automatically, so programmers do not have to worry about memory management.
02. Why compare Exception and Error?
1. Both Exception and Error derive from the Throwable class. In Java, only instances of Throwable can be thrown or caught.
2. Exception generally represents an error the program can anticipate and handle; Error refers to serious problems in the JVM or the system itself.
Checked exceptions are the type that must be explicitly caught or declared in source code as part of compile-time checking. Unchecked exceptions are called runtime exceptions; examples are NullPointerException and ArrayIndexOutOfBoundsException, which can usually be avoided by fixing coding logic errors. Whether to catch them is decided case by case, and the compiler does not enforce it.
Development note:
1. Do not use exceptions for control flow.
2. Replace e.printStackTrace() with logging, e.g. logger.error(e.toString() + "_" + e.getMessage(), e)
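A minimal sketch of the checked/unchecked distinction and the logging advice above (the class, method names, and file name are made up for illustration):

```java
import java.io.FileNotFoundException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ExceptionDemo {
    private static final Logger logger = Logger.getLogger(ExceptionDemo.class.getName());

    // Checked exception: the compiler forces callers to catch or declare it.
    static String readConfig(boolean missing) {
        try {
            if (missing) {
                throw new FileNotFoundException("config.properties");
            }
            return "loaded";
        } catch (FileNotFoundException e) {
            // Log with the full stack trace instead of calling e.printStackTrace()
            logger.log(Level.SEVERE, e.getMessage(), e);
            return "default";
        }
    }

    // Unchecked (runtime) exception: no compile-time enforcement.
    static int length(String s) {
        if (s == null) {
            // Avoidable by a simple null check in the caller's logic
            throw new NullPointerException("s must not be null");
        }
        return s.length();
    }
}
```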
03. Why compare final, finally, and finalize?
1. final can modify classes, methods, and variables, with different meanings in each case: a class marked final cannot be inherited or extended, a final variable cannot be modified, and a final method cannot be overridden.
2. finally is Java's mechanism for ensuring that important code is executed. We can use try-finally or try-catch-finally to do things like closing a JDBC connection or releasing a lock.
3. finalize is a method of the base class java.lang.Object, designed to let an object complete the release of specific resources before being garbage collected. The finalize mechanism is unreliable and has been marked deprecated since JDK 9.
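The final and finally behaviors above can be sketched as follows (the class and the "trace" string format are made up for illustration; note the finally block runs on both paths, though it cannot alter a String already returned):

```java
public class FinalDemo {
    static final int MAX = 10;            // final variable: cannot be reassigned

    static final class Point {            // final class: cannot be extended
        final int x, y;                   // final fields: assigned once, in the constructor
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // finally always runs, whether the try block returns normally or throws.
    static String withFinally(boolean fail) {
        StringBuilder trace = new StringBuilder();
        try {
            trace.append("try;");
            if (fail) throw new IllegalStateException("boom");
            return trace.append("ok").toString();
        } catch (IllegalStateException e) {
            trace.append("catch;");
            return trace.toString();
        } finally {
            // Runs on both paths; does not change the String already returned.
            trace.append("finally");
        }
    }
}
```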
04. What are strong references, soft references, weak references and phantom references?
In the Java language, everything except the basic data types is an object reference of some kind. Java divides references into four classes based on the object's lifetime.
1 Strong references
Features:
Object obj = new Object(): the reference associated with an object created by the keyword new is a strong reference. When the JVM runs out of memory, it would rather throw an OutOfMemoryError (OOM) and abort the program than randomly reclaim "live" strongly referenced objects to resolve the memory shortage. An ordinary object with no other reference relationships becomes eligible for garbage collection once it goes out of the reference's scope or the corresponding (strong) reference is explicitly set to null, depending on the garbage collection policy.
2 Soft references
Features:
Soft references are implemented with the SoftReference class. They have a shorter lifetime than strong references: only when the JVM considers memory insufficient will it attempt to reclaim objects pointed to by soft references; that is, the JVM ensures such objects are cleaned up before an OutOfMemoryError is thrown. A soft reference can be used together with a ReferenceQueue: if the object referenced by the soft reference is collected by the garbage collector, the JVM adds the soft reference to its associated ReferenceQueue. Later, we can call the queue's poll() method to check whether any of the objects we care about has been reclaimed; poll() returns null if the queue is empty, otherwise it returns the next Reference object in the queue.
Application scenario: soft references are used to implement memory-sensitive caches. If free memory is available, the cache can be kept for a while and cleaned up when memory runs short, which ensures that using the cache does not exhaust memory.
3 Weak references
Weak references are implemented with the WeakReference class and have a shorter lifetime than soft references. When the garbage collector thread scans the memory area under its control and finds an object that is only weakly referenced, it reclaims that object's memory regardless of whether the current memory space is sufficient. Because the garbage collector runs on a low-priority thread, weakly referenced objects are not necessarily reclaimed quickly. Weak references can be used together with a ReferenceQueue: if the object referenced by a weak reference is garbage collected, the JVM adds the weak reference to its associated ReferenceQueue.
Application scenario: weak references can also be used for memory-sensitive caches.
4 Phantom references
Features: phantom references, also called virtual references, are implemented with the PhantomReference class. You cannot access any property or method of an object through a phantom reference; it simply provides a mechanism to ensure that something can be done after the object is finalized. If an object holds only phantom references, it can be collected by the garbage collector at any time, just as if it had no references at all. A phantom reference must be used together with a ReferenceQueue: ReferenceQueue<Object> queue = new ReferenceQueue<>(); PhantomReference<Object> pr = new PhantomReference<>(object, queue);. When the garbage collector prepares to reclaim an object and finds it has a phantom reference, it adds the phantom reference to the associated reference queue before reclaiming the object's memory. A program can therefore tell whether a referenced object is about to be garbage collected by checking whether its phantom reference has been enqueued, and take action before the object's memory is reclaimed.
Application scenario: tracking the process of objects being collected by the garbage collector; a notification is received (via the reference queue) when an object associated with a phantom reference is collected.
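The basic resolution behavior of the three reference classes can be sketched as below (helper method names are made up for illustration; GC-dependent behavior is deliberately avoided, since clearing of soft/weak references cannot be forced portably):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceDemo {
    // While a strong reference to the object still exists,
    // soft and weak references resolve to it via get().
    static boolean softResolves(Object o) {
        return new SoftReference<>(o).get() == o;
    }

    static boolean weakResolves(Object o) {
        return new WeakReference<>(o).get() == o;
    }

    // A phantom reference can never be dereferenced:
    // PhantomReference.get() is specified to always return null.
    static boolean phantomIsOpaque(Object o) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> pr = new PhantomReference<>(o, queue);
        return pr.get() == null;
    }
}
```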
05. What is the difference between String, StringBuffer, and StringBuilder?
- String
Because String is used so heavily in the Java world, Java introduced the String constant pool to avoid producing large numbers of identical String objects. When a string literal is created, the pool is checked for a string with the same value: if one exists, no new object is created and a reference to the pooled object is returned; if not, a new String object is created, placed in the pool, and its reference is returned. A String created with new, however, does not consult the pool: it creates a new object directly on the heap and does not put the object into the pool. The pooling behavior above applies only when a String reference is assigned from a literal.
For example: String str1 = "123"; String str2 = new String("123"); // assigned via new, not taken from the string constant pool
Note: String provides the intern() method. When called, if the pool already contains a string equal to this String object (as determined by equals), the pooled string is returned; otherwise this String object is added to the pool and a reference to the pooled object is returned.
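The pool and intern() behavior above can be verified directly (variable names are illustrative):

```java
public class StringPoolDemo {
    public static void main(String[] args) {
        String a = "123";                 // literal: taken from the constant pool
        String b = "123";                 // same pool entry, so same reference
        String c = new String("123");     // new heap object, bypasses the pool

        System.out.println(a == b);          // true: both point into the pool
        System.out.println(a == c);          // false: c is a distinct object
        System.out.println(a.equals(c));     // true: same character content
        System.out.println(a == c.intern()); // true: intern() returns the pooled copy
    }
}
```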
- Characteristics of String
[A] Immutable. Once a String is created, it cannot be changed. The main benefit of immutability is that when an object is shared and frequently accessed by multiple threads, synchronization and lock waiting can be skipped, greatly improving performance. The immutable pattern is a design pattern that improves the performance and reduces the complexity of multithreaded programs.
[B] Optimization for constant pools. When two strings have the same value, they only refer to the same copy in the constant pool. This technique can save a lot of memory when the same string occurs over and over again.
- StringBuffer/StringBuilder
Both StringBuffer and StringBuilder extend the abstract class AbstractStringBuilder and have almost identical APIs. Internally they store data the same way as String, as an ordered character sequence (a char array); the difference is that the value of a StringBuffer/StringBuilder can be modified in place, and the object reference does not change when the value changes. During construction, both allocate a char array of a default size; as data is continually appended and the capacity is exceeded, a larger array is created, the original contents are copied over, and the old array is discarded. Expanding a large object therefore involves a lot of memory copying, so if the final size can be estimated in advance, pre-sizing improves performance.
The only caveat:
StringBuffer is thread-safe, while StringBuilder is not. Looking at the Java standard library source, StringBuffer's method definitions carry the synchronized keyword, which is why StringBuffer performs considerably worse than StringBuilder.
3 Application Scenarios
[A] Preferentially use the String class in business scenarios where String content does not change often. Examples: constant declarations, small string concatenation operations, etc. If you have a lot of String concatenation, avoid “+” operations between strings, because they create a lot of useless intermediate objects, consume space, and are inefficient (creating new objects and recycling objects takes a lot of time).
[B] StringBuffer is recommended for frequent string operations (such as concatenation, replacement, and deletion) in a multithreaded environment, for example XML parsing or HTTP parameter parsing and assembly.
[C] StringBuilder is recommended for frequent string operations (such as concatenation, replacement, and deletion) in a single-threaded environment, for example SQL statement assembly or JSON encapsulation.
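A small sketch of the pre-sizing advice from above (the join helper is made up for illustration): computing the final length first lets the StringBuilder allocate once, avoiding intermediate array copies during growth.

```java
public class BuilderDemo {
    // Concatenating in a loop with "+" creates a new String per iteration;
    // a StringBuilder mutates one internal char array instead.
    static String join(String[] parts, char sep) {
        int total = Math.max(parts.length - 1, 0); // room for separators
        for (String p : parts) total += p.length();
        StringBuilder sb = new StringBuilder(total); // pre-sized: no resize needed
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) sb.append(sep);
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```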
06. What principle is dynamic proxy based on?
- Reflection mechanism
Using Java reflection, we can load a class whose name is only known at runtime, learn about its constructors, instantiate it, set its fields, and invoke its methods: that is, get the properties and methods a class declares, call methods, or construct objects.
Application scenario: Reflection technology is commonly used in various general framework development. Reflection is used when different objects or classes need to be loaded and different methods called based on the configuration file in order to keep the framework generic — the runtime dynamically loads the objects that need to be loaded.
Features: Since reflection consumes additional system resources, it is unnecessary to use reflection if you do not need to create an object dynamically. In addition, reflection can ignore permission checks when calling methods, which can break encapsulation and cause security problems.
- Dynamic proxies
A proxy controls access to another object on its behalf. In some cases one object is unsuitable for, or incapable of, referencing another object directly, and the proxy object can act as an intermediary between the two (similar to a housing agent: the landlord entrusts the agent to show the house, sign the contract, and so on). With a dynamic proxy, the implementation phase does not care who is being proxied; the proxied object is specified at run time (non-deterministic). Writing the proxy class yourself in advance is a static proxy (deterministic).
Implementation method:
There are many ways to implement dynamic proxies, such as the dynamic proxy provided by the JDK itself, which mainly uses the reflection mechanism. Other implementations rely on bytecode manipulation, such as ASM, CGLIB (based on ASM), and Javassist.
For example, a dynamic proxy class can often be implemented with the JDK's InvocationHandler interface. Its invoke method, which implementations must define, completes the call to the real method. Through the InvocationHandler interface, all method calls are handled by this handler: every proxied method is taken over by the InvocationHandler for the actual processing. In addition, we can add custom logic to the invoke implementation without intruding on the business logic of the proxied class.
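A minimal sketch of the JDK dynamic proxy mechanism just described (the Greeter interface, RealGreeter class, and log format are made up for illustration):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter {
        String greet(String name);
    }

    static class RealGreeter implements Greeter {
        public String greet(String name) { return "Hello, " + name; }
    }

    // The InvocationHandler takes over every call on the proxy and can wrap
    // cross-cutting logic (logging, auth, ...) around the real invocation.
    static Greeter loggedGreeter(Greeter target, StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append("before:").append(method.getName()).append(';');
            Object result = method.invoke(target, args); // call the real method
            log.append("after;");
            return result;
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[]{Greeter.class},
                handler);
    }
}
```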
- Usage scenarios
Logging, user authentication, global exception handling, performance monitoring, and even transaction processing: this is the foundation of AOP.
07. What is the difference between int and Integer?
int is a plain integer, one of Java's eight primitive types (boolean, byte, short, char, int, float, double, long). The Java language claims that everything is an object, but primitive data types are the exception.
Integer is the wrapper class for int. It has a field of type int to store data and provides basic operations, such as math operations and conversion between ints and strings.
In Java 5, the introduction of autoboxing/unboxing greatly simplified programming by letting Java convert between the two automatically based on context. The value cache of Integer involves another Java 5 improvement. The traditional way to build an Integer object is to call the constructor directly and simply new an object. In practice, however, most data operations concentrate on a limited, small range of values, so Java 5 added the static factory method valueOf, which leverages a caching mechanism when called, yielding significant performance improvements.
According to the Javadoc, this cached range is -128 to 127 by default.
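The caching behavior is easy to observe (note the upper bound of the cache can be raised with a JVM flag; the values below assume the default range):

```java
public class BoxingDemo {
    public static void main(String[] args) {
        Integer a = Integer.valueOf(127), b = Integer.valueOf(127);
        Integer c = Integer.valueOf(128), d = Integer.valueOf(128);

        System.out.println(a == b);      // true: values in [-128, 127] come from the cache
        System.out.println(c == d);      // false: outside the cache, two distinct objects
        System.out.println(c.equals(d)); // true: always compare boxed values with equals()
    }
}
```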
08. What is the difference between Vector, ArrayList and LinkedList?
These three are all lists in the framework of collections, namely the so-called ordered collections, so their specific functions are relatively similar. For example, they all provide operations of locating, adding or deleting according to location, and they all provide iterators to traverse their contents. However, due to specific design differences, behavior, performance, thread safety and other aspects, the performance is very different.
Vector is a thread-safe dynamic array provided in Java's early days. It is not recommended when thread safety is not required, since synchronization has overhead. A Vector uses an Object array internally to hold its data and grows its capacity automatically as needed: when the array becomes full, a new array is created and the original data is copied over.
ArrayList, the more widely used dynamic-array implementation, is not thread-safe by itself, so it performs much better. Like Vector, ArrayList adjusts its capacity as needed, but the logic differs: Vector doubles its capacity, whereas ArrayList grows by 50%. LinkedList is, as the name suggests, a doubly linked list provided by Java, so it needs no resizing like the two above, and it is not thread-safe either.
- List, the most ordered collection, provides easy access, insertion, deletion, and more.
- Set: a Set does not allow duplicate elements, which is the most obvious difference from List; no two elements exist for which equals returns true. Daily development frequently needs this guarantee of element uniqueness.
- TreeSet supports natural sequential access, but add, delete, and include operations are relatively inefficient (log(n) time).
- HashSet, on the other hand, uses a hash algorithm. Ideally, if the hash function works well, it provides constant-time add, remove, and contains operations, but it does not guarantee any order.
- LinkedHashSet internally maintains a doubly linked list recording insertion order, so it supports traversal in insertion order. It still guarantees constant-time add, remove, and contains, though these are slightly slower than HashSet because of the overhead of maintaining the linked list.
Vector, ArrayList and LinkedList are all linear data structures, but there are differences in implementation methods and application scenarios.
1 Underlying implementation mode
ArrayList is implemented internally with an array; LinkedList internally uses a doubly linked list; Vector is implemented internally with an array.
2 Read/Write Mechanism
When the number of elements inserted into an ArrayList exceeds the current array's capacity, the array must be expanded; during expansion, the underlying System.arraycopy() method performs a large amount of array copying. Deleting elements does not shrink the array (call trimToSize() if you need to reclaim space). Lookup iterates over the array, using equals for non-null elements.
When a LinkedList inserts an element, it creates a new node object and updates the references of the neighboring elements. Lookup requires traversing the list; deletion traverses the list to find the element, then unlinks it.
Vector and ArrayList differ only in their growth mechanism when inserting elements. A Vector by default creates an Object array of size 10 and sets capacityIncrement to 0. When the array is too small: if capacityIncrement > 0, the Object array grows to the existing size + capacityIncrement; if capacityIncrement <= 0, the Object array doubles its existing size.
3 Read/write Efficiency
Adding and deleting elements in an ArrayList causes the array's contents to shift (and possibly reallocate), so insertion and deletion are slow, while retrieval is fast. LinkedList adds and deletes elements quickly but retrieves slowly, because it stores data in a linked list.
4 Thread Safety
ArrayList and LinkedList are not thread-safe; Vector is a thread-safe counterpart of ArrayList based on synchronized. Note that single-threaded code should always prefer ArrayList over Vector, because synchronization has a performance penalty; even in a multithreaded environment, we can use the Collections.synchronizedList(list) method to obtain a thread-safe synchronized list wrapper.
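The synchronizedList wrapper mentioned above can be sketched as follows (the helper method and thread counts are made up for illustration; with a bare ArrayList, the final count could be corrupted by lost updates):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SyncListDemo {
    // Wraps a plain ArrayList so every method call synchronizes on the wrapper.
    static int concurrentFill(int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) list.add(i);
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return list.size(); // no lost updates: two threads' adds are all present
    }
}
```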
09. What is the difference between Hashtable, HashMap, and TreeMap?
Hashtable, HashMap, and TreeMap are some of the most common Map implementations and are container types that store and manipulate data as key-value pairs. Hashtable is a Hashtable implementation provided by the early Java class libraries. It is itself synchronous, does not support null keys and values, and is rarely recommended because of the performance overhead associated with synchronization.
HashMap is the more widely used hash-table implementation. It behaves roughly like Hashtable, except that HashMap is not synchronized and supports null keys and values. In general, HashMap's put and get operations achieve constant-time performance, so it is the first choice for most key-value access scenarios, for example a runtime storage structure mapping user IDs to user information.
TreeMap is a Map based on a red-black tree that provides ordered access. Unlike HashMap, its get, put, and remove operations have O(log(n)) time complexity. The specific order is determined by a supplied Comparator or by the natural order of the keys.
Both LinkedHashMap and TreeMap guarantee some sort of order, but they are very different.
LinkedHashMap typically provides traversal in insertion order, implemented by maintaining a doubly linked list of entries (key-value pairs).
For TreeMap, its overall order is determined by the order relationship of the keys, via a Comparator or Comparable (natural order)
HashMap source code analysis
Implement the basic point analysis inside the HashMap.
HashMap is a composite structure of an array (Node[] table) and linked lists. The array is divided into buckets; the bucket address of a key-value pair is determined by the key's hash value, and pairs with the same bucket address are stored as a linked list. Note that if a list's size exceeds a threshold (TREEIFY_THRESHOLD, 8), the list is transformed into a tree structure.
Capacity and load factor
Why do we care about capacity and load factors?
This is because capacity and load factor determine the number of available buckets: too many empty buckets wastes space, while too few buckets seriously degrades performance. In the extreme case of a single bucket, the map degenerates into a linked list and cannot provide constant-time access at all. Since capacity and load factor are so important, how should we choose them in practice? If you know how many key-value pairs a HashMap will hold, consider pre-setting an appropriate capacity. We can estimate the specific value from the resize condition; from the earlier code analysis, it needs to satisfy:
Condition: load factor * capacity > number of elements
So the preset capacity needs to be greater than (estimated number of elements / load factor), and it is rounded to a power of two, so the conclusion is fairly clear. As for the load factor, my advice: do not change it without a specific requirement, because the JDK's default is very general-purpose. If you do need to adjust it (the default capacity of HashMap is 16), it is best not to set a value above 0.75, as that would significantly increase collisions and reduce HashMap's performance. If the load factor is too small, adjust the preset capacity according to the formula above; otherwise resizing may happen more frequently, causing unnecessary overhead and hurting access performance.
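The pre-sizing formula above can be sketched as a small helper (the method name is made up for illustration; HashMap itself rounds the capacity hint up to the next power of two):

```java
import java.util.HashMap;
import java.util.Map;

public class CapacityDemo {
    // Resize triggers when size > capacity * loadFactor (0.75 by default),
    // so to hold expectedEntries without resizing, start from expectedEntries / 0.75.
    static <K, V> Map<K, V> presized(int expectedEntries) {
        int initialCapacity = (int) (expectedEntries / 0.75f) + 1;
        return new HashMap<>(initialCapacity);
    }
}
```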
When objects' hashes conflict, they are all placed in the same bucket and form a linked list.
Common methods to resolve hash conflicts are:
Open addressing
The basic idea: when the hash address p = H(key) of a key conflicts, generate another hash address p1 based on p; if p1 still conflicts, generate p2 based on p, and so on, until a non-conflicting hash address pi is found, and store the element there. A related approach, rehashing, constructs several different hash functions Hi = RHi(key), i = 1, 2, ..., k: if H1(key) conflicts, compute H2(key), and so on, until no conflict remains. This method does not produce clustering easily, but it increases computation time.
Chain address method
The basic idea of this method: all elements whose hash address is i form a singly linked list, called the synonym chain, and the head pointer of that list is stored in the i-th slot of the hash table; search, insertion, and deletion therefore operate mainly on the synonym chain. The chaining method is suitable for frequent inserts and deletes.
Establish public overflow areas
The basic idea of this method is to divide the hash table into two parts: basic table and overflow table. All elements that conflict with the basic table will be filled into the overflow table.
Treeification
In JDK1.8, some changes have been made to HashMap:
In JDK 1.7, a key-value pair was added to the head of the list on a hash collision; in JDK 1.8 it is added to the tail. In JDK 1.8, if the length of a list exceeds 8, the list is converted to a red-black tree. Capacity initialization: a JDK 1.7 HashMap initializes its capacity in the constructor, whereas a JDK 1.8 HashMap initializes capacity on the first put. In other words, JDK 1.8's HashMap uses a lazy mode, avoiding wasted resources when the map is never used after construction. So why treeify at all?
Mainly to avoid hash collision denial-of-service attacks.
From a performance perspective: with linked-list conflict resolution, insert and delete are very efficient at O(1) time complexity, but a query requires O(n) time. A red-black tree keeps the worst-case time complexity of insert, delete, and query at O(log n). Malicious code can exploit this difference by sending the server large amounts of crafted data. For example, String's hashCode function is weak, and one can easily construct huge numbers of strings with identical hash codes. If tens of thousands of such colliding strings are submitted to a server at once, queries take so long that the server's CPU is monopolized, and further requests are refused. Using a red-black tree reduces the query time by orders of magnitude and effectively prevents this hash-collision denial-of-service attack.
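How easy colliding strings are to construct can be seen with the classic "Aa"/"BB" pair, whose polynomial hashes coincide; colliding pairs compose, so length-2n strings give 2^n collisions:

```java
public class CollisionDemo {
    public static void main(String[] args) {
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 = 66*31 + 66 = 'B'*31 + 'B'
        System.out.println("Aa".hashCode() == "BB".hashCode());     // true
        // Colliding 2-char blocks can be substituted freely:
        System.out.println("AaAa".hashCode() == "BBBB".hashCode()); // true
        System.out.println("AaBB".hashCode() == "BBAa".hashCode()); // true
    }
}
```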
10. How can collections be made thread-safe? How does ConcurrentHashMap achieve efficient thread safety?
Use the thread-safe container classes provided by the java.util.concurrent package: various concurrent containers such as ConcurrentHashMap and CopyOnWriteArrayList, and various thread-safe queues (Queue/Deque) such as ArrayBlockingQueue and SynchronousQueue.
How can collections be guaranteed to be thread-safe
Methods for ensuring thread safety range from simple synchronized wrappers to more sophisticated concurrent implementations such as ConcurrentHashMap, which is based on segment locking in JDK 1.7 and on CAS + synchronized in JDK 1.8.
Why ConcurrentHashMap?
Hashtable itself is inefficient because its implementation basically adds synchronized to put, get, size, and other methods. In simple terms, all concurrent operations compete for the same lock: while one thread performs an operation, the other threads must wait, greatly reducing the efficiency of concurrent operations. HashMap, as mentioned earlier, is not thread-safe, and using it concurrently can cause problems such as 100% CPU usage.
1.7 ConcurrentHashMap, whose implementation is based on: segment locking
Internally it divides the map into Segments, each holding a HashEntry array. As in a HashMap, entries with the same hash are stored in a linked list. Inside HashEntry, a volatile value field ensures visibility. Immutable objects are also used, and performance is improved by directly using the low-level capabilities provided by Unsafe, such as volatile access; after all, many Unsafe operations are optimized as JVM intrinsics. The number of segments is determined by concurrencyLevel, which defaults to 16.
A get operation only needs to guarantee visibility. For put, a second hash is applied to avoid hash collisions, the Segment is retrieved directly via an Unsafe call, and a thread-safe put is then performed within that Segment.
Therefore, a concurrent write proceeds as follows:
Put locking
Hash the key to locate the Segment to which the element will be added, and acquire that Segment's lock. Then traverse the linked list in the bucket and replace the existing node or add a new one. For size(), the per-Segment counts are summed twice without locking; if the two results agree, the total is returned (otherwise the Segments are locked and recounted).
1.8 ConcurrentHashMap
put: CAS + synchronized
The granularity of locking matches the number of buckets. First, check whether the table is empty and initialize it if so, using the volatile sizeCtl field for mutual exclusion: if another thread is found to be initializing, pause until it finishes; otherwise claim the exclusive flag with CAS (U.compareAndSwapInt(this, SIZECTL, sc, -1)). Next, check whether the target bucket is empty: if it is, install the new node with a CAS, no lock needed; otherwise, synchronize on the bucket's head node, replace or append the node inside the bucket, and finally decide whether the bucket should be converted to a red-black tree.
11. What IO modes does Java provide? How does NIO achieve multiplexing?
IO
There are many Java I/O modes, which can be easily distinguished based on different I/O abstract models and interaction modes.
First, the traditional java.io package, implemented on a stream model, provides the most familiar IO features, such as the File abstraction and input/output streams. Its interaction mode is synchronous and blocking: while reading an input stream or writing an output stream, the thread blocks until the read or write completes, and the calls between them happen in a reliable linear order.
The advantage of the java.io package is simple, intuitive code; its disadvantage is IO efficiency and scalability, which easily become an application's performance bottleneck.
Some of the network APIs under java.net, such as Socket, ServerSocket, and HttpURLConnection, are also classified as synchronous blocking IO, because network communication is likewise IO behavior.
NIO
The NIO framework (the java.nio package) was introduced in Java 1.4, providing new abstractions such as Channel, Selector, and Buffer for building multiplexed, synchronous, non-blocking IO programs, while offering high-performance data operations closer to the underlying operating system.
NIO2
In Java 7, NIO took a further step, known as NIO 2, and introduced asynchronous, non-blocking IO, also called AIO (Asynchronous IO). Asynchronous IO operations are based on events and callback mechanisms: put simply, the application's call returns immediately without blocking, and when the background processing completes, the operating system notifies the corresponding thread to do further work.
Basic concepts
1. Distinguish synchronous and asynchronous. Simply put, synchronization is a reliable and orderly mechanism. When we perform a synchronization operation, the next task is to wait for the current call to return before proceeding to the next step. Asynchronous tasks, on the other hand, do not need to wait for the current call to return, and usually rely on events, callbacks and other mechanisms to achieve the order relationship between tasks.
2. Distinguish between blocking and non-blocking. In a blocking operation, the current thread is in a blocked state, unable to do other work, and continues only when the condition is ready, for example when a ServerSocket establishes a new connection or a read or write completes. A non-blocking call, on the other hand, returns immediately regardless of whether the IO has finished, and the corresponding operation continues in the background. Synchronous or blocking should not automatically be considered inefficient; it depends on the characteristics of the application and the system. We are all familiar with java.io, so I will only summarize it in general; consult tutorials if you need to learn more. In general, I think you should at least understand:
IO is not only for file operations. Network programming, such as Socket communication, is a typical IO operation target. InputStream/OutputStream is used to read or write bytes, for example to manipulate image files. On the other hand, Reader/Writer is used to manipulate characters, adding functions such as character codec and decoding. It is suitable for reading or writing text information from files. Essentially, a computer operates on bytes. Whether it is network communication or file reading, Reader/Writer builds a bridge between application logic and raw data.
4. Buffer implementations such as BufferedOutputStream can avoid frequent disk reads and writes and improve IO efficiency. This design uses a buffer to batch the data and write it in bulk, but do not forget to call flush.
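As a minimal sketch of the buffered-write pattern above (the class name and the in-memory target are my own illustration, not from the original): bytes written through a BufferedOutputStream stay in the buffer until flush or close pushes them to the underlying stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class BufferedWriteDemo {
    // Writes bytes through an 8 KB buffer; nothing reaches the target
    // stream until flush() (or the implicit close of try-with-resources).
    public static byte[] writeBuffered(byte[] data) {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(target, 8192)) {
            out.write(data);
            out.flush(); // without flush (or close), buffered bytes may never be written
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.toByteArray();
    }
}
```

In real code the target would be a FileOutputStream; a ByteArrayOutputStream is used here only so the sketch is self-contained.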
Java NIO overview
1.Buffer: an efficient data container. Except for the boolean type, every primitive data type has a corresponding Buffer implementation.
2.Channel: similar to the file descriptors seen on operating systems such as Linux, a Channel is the abstraction NIO uses to support bulk IO operations. A Channel can be obtained from a Socket or a file (for example SocketChannel or FileChannel).
3.Selector: the basis of NIO multiplexing. It provides an efficient mechanism to detect which of the channels registered on a Selector are in a ready state, enabling a single thread to manage multiple channels efficiently.
4.Charset: provides Unicode string definitions, and NIO also provides the corresponding encoders and decoders. For example, a String can be converted to a ByteBuffer with Charset.defaultCharset().encode("Hello world!").
Since NIO is in fact synchronous non-blocking IO, one thread handles events synchronously: when a group of channels has been processed, it checks again for channels that are ready. This is synchronous + non-blocking.
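The Charset point above can be sketched directly (the class name is my own; the round trip uses only standard NIO APIs):

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Encodes a String into a ByteBuffer and decodes it back,
    // demonstrating the Charset codec side of NIO.
    public static String roundTrip(String text) {
        Charset utf8 = StandardCharsets.UTF_8;
        ByteBuffer encoded = utf8.encode(text);   // String -> ByteBuffer
        return utf8.decode(encoded).toString();   // ByteBuffer -> String
    }
}
```

Charset.defaultCharset() works the same way; UTF-8 is pinned here so the sketch behaves identically on every platform.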
12 How many file copy modes are available in Java? Which is the most efficient?
Using the java.io class library: build a FileInputStream for the source file to read it, then build a FileOutputStream for the target file to complete the write.
Using the java.nio class library: call the transferTo or transferFrom method provided by FileChannel, which can take advantage of operating-system-level optimizations such as zero-copy and is usually the most efficient.
User space and kernel space are basic operating-system-level concepts: the operating system kernel, hardware drivers, and so on run in kernel space with higher privileges, while user space is for ordinary applications and services.
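A minimal sketch of the NIO copy mode described above (the class name and the demo helper are my own illustration; the copy loop itself uses only the standard FileChannel API):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NioCopy {
    // Copies source to target via FileChannel.transferTo, which lets the
    // OS move data without bouncing it through user-space buffers.
    public static void copy(Path source, Path target) {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size(), pos = 0;
            while (pos < size) {            // transferTo may copy fewer bytes than requested
                pos += in.transferTo(pos, size - pos, out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical demo: copies a small temp file and returns the copied content.
    public static String demo() {
        try {
            Path src = Files.createTempFile("nio-copy", ".src");
            Path dst = Files.createTempFile("nio-copy", ".dst");
            Files.write(src, "copy me".getBytes());
            copy(src, dst);
            return new String(Files.readAllBytes(dst));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```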
What is the difference between an interface and an abstract class?
Interfaces and abstract classes are the two basic mechanisms of Java object-oriented design.
An interface is an abstraction of behavior: it is a collection of abstract methods, and using interfaces achieves the goal of separating an API's definition from its implementation. An interface cannot be instantiated and cannot contain instance fields — any field is implicitly public static final. Traditionally it contains no method implementations either (before Java 8 added default methods, a method was either abstract or static). The Java standard library defines many interfaces, such as java.util.List.
Abstract classes are classes that cannot be instantiated. The purpose of using the abstract keyword to modify class is primarily code reuse. The form is not much different from normal Java classes, except that they cannot be instantiated. They can have one or more abstract methods or none at all. Abstract classes are mostly used to extract common method implementations or common member variables of related Java classes, and then achieve code reuse through inheritance. Many common parts of Java standard libraries, such as the Collection framework, are abstracted into abstract classes, such as java.util.AbstractList.
A Java class implements an interface with the implements keyword and extends an abstract class with the extends keyword; in the Java standard library, for example, ArrayList extends AbstractList and implements List.
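The distinction can be sketched in a few lines (Shape, Named, and Square are hypothetical names of my own, not from the original):

```java
// Interface: abstraction of behavior (what a shape can do).
interface Shape {
    double area();       // implicitly public abstract
    double SCALE = 1.0;  // implicitly public static final
}

// Abstract class: code reuse — shared state plus a common method implementation.
abstract class Named implements Shape {
    private final String name;
    protected Named(String name) { this.name = name; }
    public String describe() { return name + ": " + area(); }
}

// Concrete class: extends the abstract class, fulfilling the interface.
class Square extends Named {
    private final double side;
    Square(double side) { super("square"); this.side = side; }
    @Override public double area() { return side * side; }
}
```

Shape defines the API; Named contributes the reusable describe() implementation; Square supplies the one missing piece.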
To do object-oriented programming well, it is necessary to master basic design principles; today I introduce the most general set, the so-called S.O.L.I.D principles. Single Responsibility: ideally, a class or object should have a single responsibility. If you find that a class has multiple responsibilities, consider splitting it up.
The Open-Close principle (open for extension, closed for modification) says a design should be open for extension and closed for modification. In other words, the design should allow smooth extensibility while avoiding changes to existing implementations when adding functionality of the same kind; this produces fewer regression problems.
Liskov Substitution, one of the basic elements of object-oriented abstraction: in an inheritance hierarchy, wherever a parent (base) class can be used, a subclass must be able to substitute for it.
Interface Segregation: during class and interface design, if too many methods are defined in one interface, its implementors may face a dilemma — only some of the methods are meaningful to them, which breaks the program's cohesion.
In this case, you can decouple behavior by splitting the interface into several single-purpose interfaces. In future maintenance, if one interface's design changes, it will not affect classes that use the other interfaces. Dependency Inversion: entities should depend on abstractions rather than implementations. That is, high-level modules should not depend on low-level modules; both should depend on abstractions. Practicing this principle is key to keeping coupling appropriate in production code.
What design patterns do you know? Implement the singleton pattern manually. What patterns are used in frameworks like Spring?
Creation pattern
A summary of the various problems and solutions in the object-creation process, including the Factory, Abstract Factory, Singleton, Builder, and Prototype patterns.
Structural mode
It is a summary of software design structure, focusing on class, object inheritance, combination of practical experience. Common structural patterns include Bridge, Adapter, Decorator, Proxy, Composite, Facade, Flyweight, and so on.
Behavioral pattern
Patterns summarized from the perspective of interaction between classes or objects and the division of responsibilities. Common behavioral patterns are Strategy, Interpreter, Command, Observer, Iterator, Template Method, and Visitor.
Singleton 1: eager initialization
public class Singleton {
    private static Singleton instance = new Singleton();
    private Singleton() { }
    public static Singleton getInstance() {
        return instance;
    }
}
Singleton 2: lazy loading
public class Singleton {
    private static Singleton instance;
    private Singleton() { }
    public static Singleton getInstance() {
        if (instance == null) {       // not thread-safe; see the locked variant below
            instance = new Singleton();
        }
        return instance;
    }
}
Singleton 3: double-checked locking
public class Singleton {
    private static volatile Singleton singleton = null;
    private Singleton() { }
    public static Singleton getSingleton() {
        if (singleton == null) {      // first check: avoid entering the synchronized block repeatedly
            synchronized (Singleton.class) {
                if (singleton == null) {
                    singleton = new Singleton();
                }
            }
        }
        return singleton;
    }
}
What is the difference between synchronized and ReentrantLock? Is it true that synchronized is the slowest?
Synchronized is a Java built-in synchronization mechanism that provides mutually exclusive semantics and visibility. When one thread has acquired the current lock, other threads attempting to acquire it can only wait or block.
ReentrantLock, usually translated as re-entrant lock, has essentially the same semantics as synchronized. A re-entrant lock is acquired by calling the lock() method directly in code, which makes the coding more flexible. At the same time, ReentrantLock offers a number of useful methods for fine-grained control that synchronized cannot provide, such as controlling fairness or defining conditions. However, note that the code must explicitly call unlock() to release the lock, otherwise the lock is held forever.
ReentrantLock is an implementation class of Lock, which is a mutually exclusive synchronizer.
ReentrantLock was once clearly faster than synchronized, but in modern JVMs the performance gap has largely disappeared.
A Lock is flexible to use, but the release of the lock must be explicit.
A Lock must be acquired and released manually, whereas synchronized does not require this.
A Lock applies only to code blocks, while synchronized can modify methods, code blocks, and so on.
Lock is implemented on top of AQS; AQS and Condition each maintain their own queues.
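The extra control ReentrantLock offers can be sketched with a fair lock and a timed tryLock (the class name and the withLock helper are my own illustration):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    private final ReentrantLock lock = new ReentrantLock(true); // fair lock: FIFO hand-off

    // Runs the task only if the lock can be acquired within the timeout;
    // returns whether the task ran. Neither feature exists for synchronized.
    public boolean withLock(Runnable task, long timeoutMillis) {
        boolean acquired = false;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) task.run();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            if (acquired) lock.unlock(); // always release in finally
        }
    }
}
```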
Understand what thread safety is.
If state is not shared, or is not modifiable, there is no thread-safety issue.
Encapsulation: encapsulation lets you hide and protect an object's internal state.
Immutability: an object whose state cannot change after construction (for example, one with only final fields) is inherently thread-safe.
Thread safety requires several basic features:
Atomicity: related operations will not be interrupted mid-way by other threads; this is usually achieved through synchronization mechanisms.
Visibility: when a thread modifies a shared variable, its new state is immediately visible to other threads; this is often explained as flushing thread-local state to main memory. volatile is responsible for guaranteeing visibility.
Orderliness: guarantees serial semantics within a thread and avoids instruction reordering.
Basic use of synchronized, ReentrantLock and other mechanisms and cases.
synchronized (lockObject) {
// update object state
}
lock.lock();
try {
// update object state
}
finally {
lock.unlock();
}
How synchronized works under the hood
The underlying principle of synchronized involves JVM instructions and the monitor: a synchronized block is translated, in the compiled class file, into a pair of monitorenter and monitorexit bytecode instructions.
Execution of the monitorenter instruction:
Every object has an associated monitor — each object instance has one, and each class's Class object has one. To lock an object, a thread must acquire the monitor associated with that object.
Principle:
The monitor has a counter, which defaults to 0. When a thread wants to acquire the monitor lock, it checks whether the counter is 0; if so, it acquires the lock and increments the counter to 1. Re-entry by the same thread increments the counter again, and each release decrements it until it returns to 0.
Underlying implementation of ReentrantLock
The principle of AQS. AQS: AbstractQueuedSynchronizer, the abstract queued synchronizer. It is a locking mechanism besides Java's native synchronized keyword.
The core idea of AQS: if the requested shared resource is free, the requesting thread is set as the current effective worker thread and the shared resource is marked as locked. If the requested shared resource is busy, a mechanism is needed for blocking waiting threads and waking them up when the lock can be allocated. AQS implements this mechanism with a CLH-style queue lock, enqueuing threads that temporarily cannot acquire the lock.
The CLH (Craig, Landin, and Hagersten) queue is a virtual doubly linked queue — virtual in that there is no explicit queue instance, only the links between nodes. AQS wraps each thread requesting the shared resource in a node of the CLH queue to implement lock allocation.
AQS is based on the CLH queue and uses volatile to modify the shared state variable. A thread changes state via CAS: if it succeeds, it acquires the lock; if it fails, it enters the wait queue and waits to be woken up.
Note on spinning in AQS: while waiting to be woken up, a thread often spins (while (!cas())), repeatedly attempting to acquire the lock until it succeeds.
Locks built on AQS: spin locks, mutexes, read-write locks, condition variables, semaphores, and barriers are all derivatives of AQS.
AQS underpins much of JUC: its locks are implemented on top of AQS. AQS and Condition each maintain their own queue, and when Lock and Condition are used together, nodes move between the two queues. If we want to build a custom synchronizer, we can extend AQS: it provides template methods for acquiring shared and exclusive locks, both based on operations on the state field. ReentrantLock is re-entrant; it is worth understanding why and how this works. Internally it is defined by the synchronizer Sync, which extends AQS (AQS itself extends AbstractOwnableSynchronizer, which records the thread holding exclusive access). Each time the lock is acquired, it checks whether the requesting thread is the same as the thread currently holding it.
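Extending AQS can be sketched with a minimal non-reentrant mutex (SimpleMutex is a hypothetical name; the pattern — a private Sync subclass of AQS exposing lock/unlock — is the same one ReentrantLock uses internally):

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

public class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int unused) {
            // CAS the state from 0 (free) to 1 (held); record the owner on success.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int unused) {
            setExclusiveOwnerThread(null);
            setState(0); // volatile write: releases the lock
            return true;
        }

        boolean isHeld() { return getState() == 1; }
    }

    private final Sync sync = new Sync();

    public void lock()       { sync.acquire(1); }   // AQS queues the thread on failure
    public void unlock()     { sync.release(1); }   // AQS wakes a queued successor
    public boolean isLocked() { return sync.isHeld(); }
}
```

This sketch is deliberately non-reentrant: unlike ReentrantLock's Sync, tryAcquire does not increment the state when the owner re-enters.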
Understand lock inflation and degradation; understand the concepts of biased locks, spin locks, lightweight locks, heavyweight locks, and so on.
TODO: to be completed later.
16 How to implement synchronized low-level? What are the upgrades and downgrades of locks?
Synchronized blocks are implemented by a pair of monitorenter/monitorexit instructions, and the Monitor object is the basic unit of synchronization. Prior to Java 6, Monitor relied entirely on the operating system's internal mutex; because this requires switching from user mode to kernel mode, synchronization was an indiscriminately heavyweight operation. In the modern (Oracle) JDK, the JVM provides three different Monitor implementations — biased locking, lightweight locking, and heavyweight locking — which greatly improve performance.
The so-called upgrading and downgrading of locks is the mechanism by which the JVM optimizes synchronized operations: when the JVM detects different contention conditions, it automatically switches to an appropriate lock implementation. When there is no contention, biased locking is used by default: the JVM uses a CAS operation (compare-and-swap) to set the thread ID in the object header's Mark Word, indicating that the object is biased toward the current thread, so no true mutual exclusion is involved. This is based on the observation that in many scenarios most objects are locked by at most one thread during their lifetime, and biased locking reduces the uncontended overhead.
If another thread tries to lock an object that is already biased, the JVM must revoke the biased lock and switch to a lightweight lock implementation. A lightweight lock relies on a CAS operation on the Mark Word to attempt to acquire the lock; if it succeeds, the ordinary lightweight lock is used, otherwise the lock is inflated to a heavyweight lock. I have noticed that some people think Java never degrades locks. In fact, as far as I know, lock degradation does occur: when the JVM enters a safepoint, it checks for idle monitors and tries to downgrade them.
A spin lock is implemented by letting the current thread execute in a loop; only when the loop condition is changed by another thread can it enter the critical region. Because a spin lock simply keeps the thread running in a loop without changing its state, the response is fast; but as the number of threads grows, performance degrades significantly.
Spin locks are suitable when contention is light and locks are held only briefly. They were proposed because putting a thread to sleep and waking it up for a mutex are complex, expensive operations that require many CPU instructions. If the mutex is held for only a short time, the time spent sleeping and waking a thread can exceed the hold time — possibly longer than simply polling on a spin lock.
Of course, if a spin lock is held for a long time, other threads trying to acquire it will keep polling its status, which wastes a great deal of CPU. On a single-core CPU a spin lock is useless: if acquisition fails, the thread keeps trying, occupying the CPU so that no other thread can run — and the lock can never be released because its holder cannot run.
A hybrid mutex, on a multi-core system, initially behaves like a spin lock: if a thread cannot acquire the mutex, it does not go to sleep immediately, only after failing to acquire it for some time. A hybrid spin lock initially behaves like a normal spin lock, but if the lock cannot be acquired it may yield the thread and let another thread execute.
Remember: spin locks are only useful on multi-core CPUs; on a single-core CPU they have no benefit and simply waste time.
17 What happens when a thread calls start() twice?
Java threads are not allowed to be started twice: the second call necessarily throws IllegalThreadStateException, a runtime exception. Calling start multiple times is a programming error.
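This is easy to verify (the class name is my own illustration):

```java
public class StartTwice {
    public static boolean secondStartFails() {
        Thread t = new Thread(() -> { });
        t.start();                 // first start: legal, NEW -> RUNNABLE
        try {
            t.start();             // second start: always illegal, even after the thread dies
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;           // the documented runtime exception
        }
    }
}
```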
The different states in a thread's lifecycle are explicitly defined in the public inner enum java.lang.Thread.State. They are:
NEW, which represents the state in which a thread is created but not actually started, can be thought of as an internal Java state.
RUNNABLE: the thread has been started and is running in the JVM, but since execution requires compute resources, it may be actually running or sitting in the ready queue waiting for the scheduler to allocate it a CPU time slice.
Some other analyses distinguish an additional RUNNING state, but it is not represented in the Java API.
BLOCKED, a state very relevant to synchronization described in the previous two lectures, indicates that a thread is waiting for a Monitor Lock. For example, if a thread attempts to acquire a lock through synchronized, but another thread has monopolized it, the current thread is blocked.
WAITING: the thread is waiting for another thread to take some action. A common scenario is the producer-consumer pattern: when the task condition is not yet satisfied, the consumer thread waits while a producer thread prepares the task data, then notifies the consumer via notify and similar actions that it can continue. Thread.join() also puts a thread into the waiting state.
TIMED_WAITING: entered under conditions similar to the waiting state, but by calling methods with a timeout, such as the timed versions of wait or join, for example: public final native void wait(long timeout) throws InterruptedException;
TERMINATED: whether it exited accidentally or finished normally, the thread has completed execution. This state is sometimes informally called "dead".
18 When can a Java program cause a deadlock? How to locate and repair?
A deadlock is a specific program state in which entities are kept waiting for each other due to cyclic dependencies, and no one can move forward. Deadlocks occur not only between threads, but also between processes that have monopolized resources. In general, we focus on deadlocks in multithreaded scenarios, where two or more threads permanently block each other because they hold the locks needed by each other.
The most common way to locate deadlocks is to use tools such as JStack to retrieve thread stacks and then locate their dependencies to find deadlocks. If a deadlock is obvious, jStack and others can locate it directly, and JConsole can even perform limited deadlock detection on a graphical interface.
If a deadlock occurs while the program is running, it cannot be resolved online in most cases. You have to restart the program and fix the problem. Therefore, it is often important to review each other during code development, or to use tools for preventive checks
How can you prevent deadlocks in your programming?
Basically deadlocks occur because:
Mutual exclusion: like the Monitor in Java, a lock is exclusive — it is held by me or by you, never both.
Hold-and-wait without preemption: the mutually exclusive resource is held for a long time and cannot be released before use ends or be preempted by other threads. Circular dependency: two or more individuals form a chain or cycle, each waiting for a lock the other holds.
The solution
The first way
If possible, avoid using multiple locks, and only acquire nested synchronized blocks or locks when you truly need them — nesting is where problems arise.
The second way: if you must use multiple locks, carefully design the lock acquisition order so that all threads acquire them in the same sequence.
The third way: use methods with timeouts to give the program more controllability, such as Object.wait(...) or CountDownLatch.await(...). Both support so-called timed waiting: we can specify a timeout, assume after it expires that the lock will not be acquired, and prepare exit logic for that case.
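The second way — a fixed lock order — can be sketched as follows (the class name and demo are my own; the point is that every code path takes lockA before lockB, so no cycle can form):

```java
public class OrderedLocks {
    private final Object lockA = new Object();
    private final Object lockB = new Object();
    private int counter = 0;

    // Every path acquires lockA first, then lockB: with a global lock
    // order, the circular-wait condition for deadlock can never hold.
    public void increment() {
        synchronized (lockA) {
            synchronized (lockB) {
                counter++;
            }
        }
    }

    public int get() {
        synchronized (lockA) {
            synchronized (lockB) {
                return counter;
            }
        }
    }

    // Two threads hammer the same two locks; because the order is fixed,
    // both always finish.
    public static int demo() {
        OrderedLocks o = new OrderedLocks();
        Runnable r = () -> { for (int i = 0; i < 1000; i++) o.increment(); };
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return o.get();
    }
}
```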
How do you diagnose the problem that sometimes it’s not a deadlock caused by blocking, but one thread has entered an infinite loop, causing other threads to wait?
On Linux you can run the top command to find Java processes with high CPU usage, then run top -Hp <pid> to find the hot threads inside that Java process. Then use the jstack command to inspect those threads' call stacks and troubleshoot the problem.
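The same deadlock information that jstack and JConsole report is available programmatically via ThreadMXBean. As a sketch (the class name is my own), the code below deliberately constructs a two-thread monitor deadlock — a latch guarantees both threads hold their first lock before requesting the second — and then asks the JVM to find it:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockProbe {
    // Builds a guaranteed deadlock and returns the IDs of the deadlocked
    // threads, or null if none were detected within the polling window.
    public static long[] findDeadlock() {
        final Object a = new Object(), b = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(a, b, bothHoldFirstLock));
        Thread t2 = new Thread(() -> lockInOrder(b, a, bothHoldFirstLock));
        t1.setDaemon(true);        // daemons: the JVM can still exit afterwards
        t2.setDaemon(true);
        t1.start(); t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        for (int i = 0; i < 200 && ids == null; i++) {   // poll until detected
            try { Thread.sleep(25); } catch (InterruptedException e) { break; }
            ids = mx.findDeadlockedThreads();
        }
        return ids;
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try { latch.await(); } catch (InterruptedException e) { return; }
            synchronized (second) { }   // never reached: the other thread holds it
        }
    }
}
```

A monitoring thread running findDeadlockedThreads() periodically is a practical in-process complement to jstack.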
What concurrency utility classes does Java package concurrency provide?
Concurrent packaging, also known as java.util.concurrent and its subpackages, is a collection of basic Java concurrency utility classes, including several aspects:
1. Provides various synchronization structures more advanced than synchronized, including CountDownLatch, CyclicBarrier, Semaphore, and others, enabling richer multithreaded operations — for example, using a Semaphore as a resource controller to limit the number of threads working simultaneously.
2. Various thread-safe containers, such as the most common ConcurrentHashMap, the ordered ConcurrentSkipListMap, or CopyOnWriteArrayList, a thread-safe dynamic array implemented with a snapshot-like copy-on-write mechanism.
3. Various concurrent queue implementations, such as the various BlockingQueue implementations — typically ArrayBlockingQueue, SynchronousQueue, or, for specific scenarios, PriorityBlockingQueue.
4. A powerful Executor framework that allows you to create different types of thread pools, schedule tasks, and more, eliminating the need to implement thread pools and task schedulers from scratch in most cases.
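The Semaphore-as-resource-controller idea in point 1 can be sketched like this (the class name and the concurrency-measuring helper are my own illustration):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreDemo {
    // Starts `tasks` threads but lets at most `permits` of them into the
    // guarded section at once; returns the highest concurrency observed.
    public static int maxObservedConcurrency(int tasks, int permits) {
        Semaphore gate = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            new Thread(() -> {
                try {
                    gate.acquire();                          // blocks when no permit is free
                    int now = inside.incrementAndGet();
                    maxInside.accumulateAndGet(now, Math::max);
                    Thread.sleep(10);                        // simulate work
                    inside.decrementAndGet();
                    gate.release();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        try { done.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return maxInside.get();
    }
}
```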
What is the difference between ConcurrentLinkedQueue and LinkedBlockingQueue?
On the difference between them in the question:
ConcurrentLinkedQueue is lock-free (based on CAS) and generally provides high throughput in common multithreaded access scenarios. LinkedBlockingQueue is internally lock-based and provides the blocking wait methods of the BlockingQueue interface.
The java.util.concurrent package provides containers (Queue, List, Set, Map) in three main flavors: Concurrent*, CopyOnWrite*, and Blocking*.
The Concurrent type does not carry the relatively heavy modification overhead of CopyOnWrite containers. But everything has a cost: Concurrent containers usually offer only weaker traversal consistency. Weak consistency means, for example, that when traversing with an iterator, the iterator can continue even if the container is modified during the traversal.
The opposite of weak consistency is the "fail-fast" behavior of the common synchronized containers introduced earlier: if a modification is detected during traversal, a ConcurrentModificationException is thrown and the traversal stops. Another manifestation of weak consistency is that operations such as size() are not guaranteed to be 100% accurate, and read performance has some uncertainty.
What kinds of thread pools are provided by the Java Concurrency library? What are their characteristics?
Executors currently offers 5 different thread pool creation configurations:
newCachedThreadPool(): a thread pool for processing large numbers of short-lived tasks. It has several distinctive features: it tries to cache threads and reuse them, creating a new worker thread when no cached thread is available; threads idle for more than 60 seconds are terminated and removed from the cache, so after long idle periods the pool consumes almost no resources. Internally it uses SynchronousQueue as the work queue.
newFixedThreadPool(int nThreads): reuses a fixed number (nThreads) of threads backed by an unbounded work queue; at most nThreads threads are active at any time. If the number of tasks exceeds the number of active threads, tasks wait in the work queue for an idle thread. If a worker thread exits, a new one is created to keep the count at nThreads.
newSingleThreadExecutor(): a single worker thread operating on an unbounded work queue. It guarantees that all tasks execute sequentially, with at most one task active, and it does not allow users to alter the thread pool instance, so the thread count cannot be changed.
newSingleThreadScheduledExecutor() and newScheduledThreadPool(int corePoolSize): create a ScheduledExecutorService that can perform scheduled or periodic work, with either a single worker thread or multiple workers.
newWorkStealingPool(int parallelism): an often-overlooked factory added in Java 8. It builds a ForkJoinPool internally and uses a work-stealing algorithm to process tasks in parallel; processing order is not guaranteed.
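A minimal sketch of the fixed pool in action (the class name and the sum-of-squares task are my own illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    // Submits n Callable tasks to a 4-thread fixed pool and sums the results.
    public static int sumSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // at most 4 workers
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                futures.add(pool.submit(() -> x * x));  // Callable<Integer> task
            }
            int sum = 0;
            for (Future<Integer> f : futures) sum += f.get(); // blocks per result
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();  // always release the pool's threads
        }
    }
}
```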
What is the underlying implementation principle of AtomicInteger? How do I apply CAS operations in my own production code?
AtomicInteger encapsulates an int and provides atomic access and update operations based on CAS (compare-and-swap).
CAS represents a set of operations that take the current value, perform some operations, and attempt to update it using CAS instructions. If the current value does not change, it indicates that no other threads are concurrently modifying it, and the update succeeds. Otherwise, a different choice can arise, either to retry or to return a success or failure result
Calling Unsafe directly is not the best choice in most scenarios. What is recommended instead? After all, we master a technology not to look cool or merely to pass interviews — we hope it has value in real products.
Java currently offers public APIs that expose CAS operations — for example java.util.concurrent.atomic.AtomicLongFieldUpdater, and, since Java 9, Variable Handles.
The Atomic package provides the most common Atomic data types, even related Atomic types such as references, arrays, and update manipulation tools, and is the preferred option for many thread-safe programs.
In Java 9 and later, the Variable Handle API is available and recommended: it provides fine-grained public low-level APIs.
The famous ABA problem usually surfaces only in lock-free algorithms. As mentioned before, CAS compares against the previous value when updating. If a value merely happens to be the same — for example after an A -> B -> A sequence of updates — judging only that the value is A may allow an unreasonable modification to succeed. For such cases, Java provides the AtomicStampedReference utility class, which attaches a stamp (version number) to the reference to keep CAS correct.
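The standard CAS retry loop can be sketched with AtomicInteger (the class name and the clamping operation are my own illustration of the "get current value, compute, compareAndSet, retry" pattern):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    // Atomically increments the counter, but never past `max`.
    // This is the classic CAS loop: read, compute, attempt, retry on contention.
    public int incrementAndClampTo(int max) {
        while (true) {
            int cur = value.get();
            if (cur >= max) {
                return cur;                        // already at the cap: no update
            }
            if (value.compareAndSet(cur, cur + 1)) {
                return cur + 1;                    // CAS succeeded: no other thread interfered
            }
            // CAS failed: another thread changed the value concurrently; loop and retry
        }
    }

    public int get() { return value.get(); }
}
```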
Describe the class loading process. What is the parent delegate model?
The Java class loading process is divided into three main steps: loading, linking, and initialization.
Loading
Loading is the phase in which Java reads bytecode data from various data sources into the JVM and maps it to the JVM's internal data structure (a Class object). The data source can take many forms: JAR files, class files, even network streams. If the input data is not a well-formed ClassFile, a ClassFormatError is raised.
The loading phase is where users can participate: we can write a custom class loader to implement our own class-loading behavior.
Linking
This is the core step: it smoothly translates the raw class definition information into a form the JVM can run. It can be further broken down into three steps:
Verification: a guarantee of virtual machine security. The JVM verifies that the byte information conforms to the Java Virtual Machine Specification; otherwise a VerifyError is raised, preventing malicious or non-conforming bytecode from endangering the JVM. The verification phase may trigger the loading of more classes.
Preparation: memory for the static variables of the class or interface is created and set to default initial values. This differs from the explicit initialization phase below: the emphasis here is on allocating the required memory, not executing further JVM instructions.
Resolution: symbolic references in the constant pool are replaced with direct references. The Java Virtual Machine Specification describes the resolution of classes, interfaces, methods, and fields in detail.
Initialization
This step actually executes the class-initialization code logic, including static field assignments and the logic inside static initializer blocks in the class definition, which the compiler arranges at compile time. The initialization logic of the parent type runs before that of the current type.
Parent delegation model
When a class loader tries to load a type, it first delegates the task to its parent loader, and only attempts to load the type itself when the parent loader cannot find it. The purpose of the delegation model is to avoid loading the same Java type more than once.
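Parent delegation can be observed with a small custom loader (CountingClassLoader is a hypothetical name of my own): the inherited ClassLoader.loadClass already implements the delegation, so core classes resolve through the parent chain and come back as the very same Class object.

```java
public class CountingClassLoader extends ClassLoader {
    private int delegations = 0;

    public CountingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        delegations++;
        // super.loadClass implements parent delegation: check the cache,
        // ask the parent loader first, and only call findClass() if the parent fails.
        return super.loadClass(name, resolve);
    }

    public int delegationCount() {
        return delegations;
    }

    // Demo: java.lang.String is resolved by the bootstrap loader through
    // delegation, so we get back the identical Class object.
    public static boolean demo() {
        try {
            CountingClassLoader cl =
                    new CountingClassLoader(ClassLoader.getSystemClassLoader());
            return cl.loadClass("java.lang.String", false) == String.class
                    && cl.delegationCount() >= 1;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```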
24 What are some ways to dynamically generate a Java class at run time?
From Java source to class: the usual development flow is that developers write Java code and invoke javac to compile it into class files, which are then loaded into the JVM through the class-loading mechanism and become Java classes usable at runtime.
Generate the bytecode directly and hand it to the classloader to load
ASM, Javassist, and Cglib generate bytecode files
Talk about the partitioning of JVM memory regions. Which regions can throw OutOfMemoryError?
Program Counter Register (PC). In the JVM specification, each thread has its own program counter, and at any moment each thread is executing only one method — the current method. The program counter stores the address of the JVM instruction of the Java method the thread is currently executing; if a native method is executing, its value is undefined.
Java Virtual Machine Stack (Java Virtual Machine Stack), also known as the Java Stack. Each thread creates a virtual Stack with Stack frames that correspond to each Java method call.
The Heap: the core area of Java memory management, used to hold Java object instances; almost all object instances are allocated directly on the heap. The heap is shared by all threads, and parameters such as -Xmx, specified at VM startup, set limits like the maximum heap size.
As a matter of course, the heap is also the area of the garbage collector’s attention, so the heap space is further subdivided by different garbage collectors, most famously by new generation and old generation.
Method Area. This is another memory area shared by all threads, used to store so-called meta data such as class structure information, along with the corresponding runtime constant pool, fields, method code, and so on. Because of early Hotspot JVM implementations, many people used to call the method area the Permanent Generation. Oracle JDK 8 removed the permanent generation and introduced Metaspace.
The run-time Constant Pool, which is part of the method area. If you look closely at the decompiled class file structure, you can see all sorts of information about version numbers, fields, methods, superclasses, interfaces, and the constant pool. Java’s constant pool can store a wide range of constant information, from compile-time literals to symbolic references that need to be determined at run time, so it stores a wider range of information than a symbol table in a typical language.
Native Method Stack. It is very similar to the Java virtual machine stack in that it supports calls to local methods, and each thread creates one. In the Oracle Hotspot JVM, the native method stack and the Java virtual machine stack are in the same area, which is entirely up to the implementation of the technology and not mandated by the specification.
Memory consists of four main blocks: heap memory, stack memory, data segment, and code segment.
Heap memory holds objects created with new; such objects contain only their member variables.
Stack memory stores local variables. For primitive data types, the variable's value is stored directly; for object variables, the address of the object in heap memory is stored.
Static and constant areas: store static variables (class variables) and constants.
Method area: holds the methods of a class. So even if you create multiple objects, there is only one copy of each method.
How do I monitor and diagnose JVM heap and off-heap memory usage?
JConsole can connect to a Java process and track memory usage in a graphical interface.
Command-line tools for runtime queries, such as jstat and jmap, provide options for viewing heap, method area, and other usage data.
Java EE servers such as Tomcat and WebLogic also provide memory-management-related functions.
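Besides these external tools, the JDK exposes the same heap and non-heap figures programmatically through JMX; a minimal sketch using the standard MemoryMXBean (the class name MemoryProbe is invented for illustration):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryProbe {
    // Returns current heap usage in bytes, the same figure JConsole/jstat report.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();       // in-heap
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage(); // metaspace, code cache, ...
        System.out.println("heap used:     " + heap.getUsed() / 1024 + " KB");
        System.out.println("heap max:      " + heap.getMax() / 1024 + " KB");
        System.out.println("non-heap used: " + nonHeap.getUsed() / 1024 + " KB");
    }
}
```

The same bean can be polled periodically from inside the application to log memory trends without attaching an external tool.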
What is the structure inside the heap?
1. The new generation
The new generation is where most objects are created and destroyed; in typical Java applications, most objects are short-lived. It is divided into an Eden region, where objects are initially allocated, and two Survivor regions, sometimes called "from" and "to", which hold objects that survive a Minor GC.
The JVM picks one Survivor region as the "to" region and performs inter-region copying during GC: the surviving objects in Eden and the objects in the "from" region are copied into the "to" region. The main purpose of this design is to prevent memory fragmentation and to further clean out useless objects.
Viewed from the memory model rather than the garbage collection perspective, the Eden region is partitioned further still: the Hotspot JVM has a concept called the Thread Local Allocation Buffer (TLAB), which, as far as I know, all OpenJDK-derived JVMs provide.
2. The old generation
Objects with long life cycles are usually promoted (copied) from the Survivor regions. Of course, there are special cases: we know ordinary objects are normally allocated in a TLAB; if an object is large, the JVM tries to allocate it directly elsewhere in Eden; and if the object is so large that enough contiguous free space cannot be found in the new generation, the JVM allocates it directly in the old generation.
3. The permanent generation
This is how the method area was implemented in early Hotspot JVMs, storing Java class metadata, the constant pool, and the intern string cache; the permanent generation has been removed since JDK 8.
1. Most object creation takes place in Eden, except for a few large objects.
2. Before a Minor GC starts, to-survivor is empty and from-survivor holds objects.
3. After the Minor GC, all surviving objects in Eden are copied to to-survivor, and the surviving objects in from-survivor are also copied to to-survivor; the age of every surviving object is incremented by 1.
4. from-survivor is cleared and becomes the new to-survivor; the to-survivor now holding objects becomes the new from-survivor. Repeat from step 2.
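The four steps above can be sketched as a toy simulation. This models only the bookkeeping of the copying scheme, not a real collector; the Obj class and the region lists are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class SurvivorCopySim {
    static class Obj { int age; boolean live; Obj(boolean live) { this.live = live; } }

    static List<Obj> eden = new ArrayList<>();
    static List<Obj> from = new ArrayList<>();
    static List<Obj> to   = new ArrayList<>();

    // One Minor GC: copy survivors of Eden and from-survivor into to-survivor,
    // increment their age, then swap the roles of the two survivor regions.
    static void minorGc() {
        for (Obj o : eden) if (o.live) { o.age++; to.add(o); }
        for (Obj o : from) if (o.live) { o.age++; to.add(o); }
        eden.clear();          // Eden is emptied (dead objects simply dropped)
        from.clear();          // old from-survivor is emptied
        List<Obj> tmp = from;  // swap: to becomes the new from, and vice versa
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        eden.add(new Obj(true));   // will survive
        eden.add(new Obj(false));  // garbage, dropped during the copy
        from.add(new Obj(true));   // survived a previous GC
        minorGc();
        System.out.println("from size = " + from.size()); // 2 survivors
        System.out.println("to   size = " + to.size());   // 0, empty again
    }
}
```

Placing survivors sequentially in the "to" region is what keeps the new generation free of fragmentation.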
27 What are the common Java garbage collectors?
Serial GC, the oldest garbage collector. It is "serial" in that its collection is single-threaded and the JVM enters the notorious stop-the-world state during garbage collection. Of course, its single-threaded design also means a streamlined GC implementation, with no complex data structures to maintain and easy initialization, so it has long been the default for JVMs in Client mode. Its old-generation counterpart is usually referred to separately as Serial Old; it uses a mark-compact algorithm, which distinguishes it from the copying algorithm used in the new generation.
The corresponding JVM parameter for Serial GC is -XX:+UseSerialGC. ParNew GC is a multithreaded version of Serial GC for the new generation; it is most commonly used together with the old-generation CMS GC: -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
The CMS (Concurrent Mark Sweep) GC is based on the mark-sweep algorithm and is designed to minimize pause times, which matters for response-time-sensitive applications such as web services, and it is still used by many systems today. However, the mark-sweep algorithm used by CMS has memory fragmentation issues, so after long runs it is hard to avoid a Full GC, resulting in bad pauses. In addition, since it emphasizes concurrency, CMS consumes more CPU resources and competes with user threads.
Parallel GC, also known as throughput-first GC, was the default GC for server-mode JVMs in earlier JDK releases such as JDK 8. Its algorithm is similar to Serial GC, although the implementation is far more complex; its feature is that both new-generation and old-generation GC run in parallel, making it more efficient in a typical server environment. It is enabled with -XX:+UseParallelGC. In addition, Parallel GC introduced developer-friendly configuration items: we can directly set pause-time or throughput goals, which the JVM will adapt to automatically, for example: -XX:MaxGCPauseMillis=value -XX:GCTimeRatio=N // ratio of GC time to user time = 1 / (N+1)
G1 GC: a GC implementation that balances throughput and pause time, and the default GC option in Oracle JDK 9 and later. G1 lets you set pause-time goals intuitively. Compared with CMS GC, G1 may not beat CMS's best-case latency, but its worst case is much better. The G1 GC still has the concept of generations, but its memory layout is not simple contiguous strips; rather it is a chessboard-like collection of regions. Region-to-region collection uses a copying algorithm, but overall it is effectively a mark-compact algorithm, which avoids memory fragmentation, especially when the Java heap is very large. G1's throughput and pauses are very good and still improving, and CMS was marked deprecated in JDK 9, so the G1 GC is worth mastering.
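To see which collectors the running JVM actually selected under flags such as -XX:+UseG1GC, the standard GarbageCollectorMXBean API can be queried; a quick sketch (the names printed depend on the JVM and the chosen collector, e.g. "G1 Young Generation" under G1):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcInfo {
    public static void main(String[] args) {
        // One bean per active collector, typically one for the new generation
        // and one for the old generation.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " collections=" + gc.getCollectionCount());
        }
    }
}
```

Running the same program with -XX:+UseSerialGC, -XX:+UseParallelGC, or -XX:+UseG1GC shows different bean names, confirming which collector the flag selected.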
Principles and basic concepts of garbage collection
First, the premise of automatic garbage collection is knowing which memory can be freed. Consider this in light of my earlier analysis of Java class loading and memory structure. There are two main targets: object instances, which are stored on the heap; and metadata and other information in the method area, such as types no longer in use, for which unloading the Java class seems reasonable. For reclaiming object instances, there are two basic algorithms: reference counting and reachability analysis.
The reference counting algorithm, as its name implies, attaches a counter to each object that records how many references point to it; if the count is 0, the object is reclaimable. This is the recycling mechanism of choice in many languages: for example Python, which is gaining popularity thanks to artificial intelligence, supports both reference counting and garbage collection. Which is best depends on the scenario, and in practice there have been large-scale attempts to keep only the reference counting mechanism to improve throughput.
Java did not choose reference counting, largely because of the difficulty of handling circular references. What Java chose is reachability analysis. Java's various reference relationships further complicate reachability, as discussed in lecture 4. This type of garbage collection is often referred to as Tracing Garbage Collection. Simply put, objects and their references are viewed as a graph; live objects are chosen as GC Roots, and reference chains are traced from them. If an object is unreachable from GC Roots, that is, no reference chain exists, it can be considered reclaimable. The JVM takes as GC Roots the objects referenced from the virtual machine stack and the native method stack, objects referenced by static properties, and constants.
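Reachability can be observed from Java code with a WeakReference: once the only strong reference is dropped, the object is no longer reachable from GC Roots, and a collection may then clear the weak reference. Whether a particular System.gc() call actually collects is JVM-dependent, so treat this as a demonstration, not a guarantee:

```java
import java.lang.ref.WeakReference;

public class ReachabilityDemo {
    public static void main(String[] args) {
        Object strong = new Object();                 // reachable via a GC Root (local variable)
        WeakReference<Object> ref = new WeakReference<>(strong);
        System.out.println(ref.get() != null);        // true: still strongly reachable

        strong = null;                                // drop the only strong reference
        System.gc();                                  // request a collection (not guaranteed)
        System.out.println(ref.get());                // typically null on HotSpot after a GC
    }
}
```

The weak reference itself does not keep the object alive, which is exactly the distinction between "some pointer exists" and "reachable from GC Roots".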
The reclamation of unused metadata in the method area is more complicated, so let me briefly comb through it. Recall my classification of class loaders: types loaded by the bootstrap class loader are generally never unloaded; unloading ordinary types usually requires the corresponding custom class loader itself to be collected, so where dynamic types are used heavily, you need to prevent the metadata area (or, earlier, the permanent generation) from running into OOM. Since JDK 8u40, the following parameter is on by default: -XX:+ClassUnloadingWithConcurrentMark
Second, I think a general understanding of the common garbage collection algorithms, with their principles and trade-offs, is enough. They fall into three categories:
Copying algorithms. The new-generation GCs I discussed earlier are basically based on copying: live objects are copied into the "to" region as described in the last lecture, and objects are placed sequentially during copying, which avoids memory fragmentation. The cost is that, because you copy, you must reserve memory space in advance, which wastes some; in addition, for a GC like G1 that splits the heap into a large number of regions, copying rather than moving means the GC must maintain object reference relationships between regions, which is costly in both memory footprint and time.
The mark-sweep algorithm first marks, identifying all objects to be reclaimed, and then sweeps. Besides the limited efficiency of the marking and sweeping process, fragmentation is inevitable, which makes it unsuitable for very large heaps; otherwise, once a Full GC occurs, the pause may be unacceptably long.
Mark-compact is similar to mark-sweep, but to avoid memory fragmentation, live objects are moved during cleanup so that they end up occupying contiguous memory. Note that these are just the basic algorithm ideas; real GC implementations are much more complex, and the frontier GCs currently under development are composite algorithms that are both parallel and concurrent.
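The mark phase of a tracing collector is essentially graph reachability from the roots. A toy mark-sweep over an invented object graph (integer ids standing in for objects) also shows why cycles, which defeat reference counting, are no problem for tracing:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MarkSweepSim {
    // Invented "heap": object id -> ids of the objects it references.
    static Map<Integer, List<Integer>> heap = new HashMap<>();

    // Mark: trace reference chains from the GC Roots.
    static Set<Integer> mark(List<Integer> roots) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            int id = work.pop();
            if (marked.add(id)) work.addAll(heap.getOrDefault(id, List.of()));
        }
        return marked;
    }

    // Sweep: reclaim everything that was not marked.
    static void sweep(Set<Integer> marked) {
        heap.keySet().retainAll(marked);
    }

    public static void main(String[] args) {
        heap.put(1, List.of(2));   // 1 is a root and references 2
        heap.put(2, List.of());
        heap.put(3, List.of(4));   // 3 <-> 4 form a cycle with no path from any root:
        heap.put(4, List.of(3));   // reference counts never drop to 0, tracing still reclaims them
        sweep(mark(List.of(1)));
        System.out.println(new TreeSet<>(heap.keySet())); // [1, 2]
    }
}
```

Mark-compact differs only in the cleanup step: instead of reclaiming in place, survivors would be slid together into contiguous memory.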
Talk about your GC tuning ideas
When it comes to tuning, it has to be scenario-specific and purpose-driven; for GC tuning, the first thing you need to know is what you are tuning for. From a performance perspective, we usually focus on three aspects: footprint, latency, and throughput. In most cases, optimization focuses on one or two of these; few cases can balance all three. Of course, beyond these, other GC-related scenarios may need consideration: for example, OOM may be linked to unreasonable GC parameters, or requirements on application startup speed may make GC a consideration.
The basic tuning idea can be summarized as follows:
Understand the application's requirements and problems, and determine tuning goals. Suppose we develop an application service but find that performance occasionally jitters, with long service pauses. Evaluate the response times and traffic acceptable to users, and simplify the goal to, say, keeping GC pauses under 200 ms while maintaining a certain throughput standard.
Understand the state of the JVM and GC to locate specific problems and confirm that GC tuning is really necessary. There are many ways to do this, such as viewing GC status through tools like jstat, enabling GC logging, or using diagnostic tools provided by the operating system. For example, by examining the GC logs, you can find out whether GC paused for a long time at a particular moment, causing the application to become unresponsive.
Here, we need to consider whether the chosen GC type suits our application's characteristics. If it does, what is the specific problem: is a Minor GC too long, or does a Mixed GC pause abnormally? If not, consider which collector to switch to, such as CMS or G1, which focus on low latency.
Determine the parameters or software and hardware configurations to be adjusted.
Verify that the tuning objectives are met, and if so, consider ending the tuning; Otherwise, repeat the analysis, adjustment, and validation process.
28 What is happens-before in the Java memory model
The happens-before relationship is a mechanism in the Java memory model that guarantees visibility of multithreaded operations; it is also a precise definition of the vague notion of visibility in earlier language specifications.
Its concrete manifestations include, but go far beyond, the intuitively obvious ordering of synchronized, volatile, and lock operations, for example:
Every operation in a thread is guaranteed to happen-before every subsequent operation in that same thread, which guarantees the basic program-order rule developers rely on when writing programs.
For a volatile variable, a write is guaranteed to happen-before subsequent reads of that variable. For a lock, the unlock operation is guaranteed to happen-before a subsequent lock operation.
The completion of an object's construction is guaranteed to happen-before the start of its finalizer.
Even the completion of all operations within a thread is guaranteed to happen-before another thread's Thread.join() on it returning, and so on.
These happens-before relationships are transitive: if A happens-before B and B happens-before C, then A happens-before C also holds.
I have consistently written happens-before, rather than simply "before or after", because it guarantees not only the order of execution in time but also the ordering of memory read and write operations; mere clock order does not guarantee that interacting threads see each other's results.
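A minimal sketch of the volatile rule combined with program order and transitivity: the plain write to data is published by the volatile write to ready, so the reader is guaranteed to observe 42 (the class and field names are invented for illustration):

```java
public class HappensBeforeDemo {
    static int data = 0;                  // plain, non-volatile field
    static volatile boolean ready = false;

    static int readerResult() {
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) { }            // spin until the volatile write becomes visible
            seen[0] = data;               // happens-before chain guarantees 42 here, never 0
        });
        reader.start();
        data = 42;                        // 1. plain write (program order: before step 2)
        ready = true;                     // 2. volatile write publishes the plain write
        try { reader.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readerResult()); // 42
    }
}
```

Remove volatile from ready and the guarantee disappears: the reader might spin forever or read a stale 0, even though data was written "earlier" by the clock.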
Why is the JMM needed, and what problem does it solve? What is a memory model?
To ensure that concurrent programs satisfy atomicity, visibility, and ordering, there is one important concept: the memory model.
To guarantee the correctness (visibility, ordering, atomicity) of shared memory, the memory model defines a specification for the read and write operations of multithreaded programs in a shared-memory system. These rules regulate memory reads and writes so as to ensure correct instruction execution. It involves the processor, caches, concurrency, and the compiler. It solves memory-access problems caused by multi-level CPU caches, processor optimization, and instruction reordering, and guarantees consistency, atomicity, and ordering in concurrent scenarios.
The memory model solves concurrency problems in two main ways: limiting processor optimizations and using memory barriers. This article will not dig into the underlying principles; interested readers can study them on their own.
What is the Java memory model?
Java programs run on the Java virtual machine. The Java Memory Model (JMM) is a memory-model specification that shields the access differences of various hardware and operating systems: a mechanism and specification that ensures Java programs access memory consistently on all platforms.
The Java memory model stipulates that all variables are stored in the main memory, and each thread has its own working memory. The working memory of the thread stores a copy of the main memory of the variables used in the thread. All operations on variables must be carried out in the working memory of the thread, instead of reading and writing the main memory directly. Different threads cannot directly access variables in each other’s working memory, and the transfer of variables between threads requires data synchronization between their own working memory and main memory.
The JMM is used to synchronize data between working memory and main memory. It specifies how and when to synchronize data.
Java's main memory and working memory can be loosely compared to main memory and cache in the computer memory model. Note in particular that main memory and working memory do not map directly onto the Java heap, stack, method area, and so on in the JVM's memory structure. According to Understanding the Java Virtual Machine, if the definitions of variables, main memory, and working memory must be mapped at all, main memory mostly corresponds to the object instance data in the Java heap, and working memory to parts of the virtual machine stack.
So, to summarize, the JMM is a specification that addresses the problems of local memory inconsistencies, compiler reordering of code instructions, and out-of-order code execution by processors when multiple threads communicate through shared memory.
Implementation of the Java memory model
Those of you familiar with Java multithreading know that Java provides a series of keywords and libraries for concurrent processing, such as volatile, synchronized, final, and the java.util.concurrent package. These are the primitives the Java memory model provides to programmers by encapsulating the underlying implementation.
When developing multithreaded code, we can directly use keywords like synchronized to control concurrency and never need to worry about underlying compiler optimizations, cache consistency, and so on. Therefore, the Java memory model, in addition to defining a set of specifications, provides a set of primitives that encapsulate the underlying implementation for developers to use directly.
This article does not attempt to cover the use of every keyword individually, as plenty of material about each of them is available on the web for readers to study on their own. The other key point, as mentioned earlier, is that concurrent programming must solve the problems of atomicity, visibility, and ordering. Let's look at the methods Java uses to guarantee each.
Atomicity
In Java, two bytecode instructions, monitorenter and monitorexit, are provided to ensure atomicity. The Java keyword corresponding to these two bytecodes is synchronized.
Therefore, synchronized can be used in Java to ensure that operations within methods and code blocks are atomic.
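A small sketch: two threads hammer a shared counter, and synchronized (i.e. monitorenter/monitorexit around the method body) makes each read-modify-write atomic, so no increment is lost (the class name is invented for illustration):

```java
public class AtomicCounter {
    private int count = 0;  // shared state

    // synchronized compiles to monitorenter/monitorexit around the body,
    // making the non-atomic count++ (read, add, write) one indivisible unit.
    public synchronized void increment() { count++; }
    public synchronized int get() { return count; }

    // Runs two threads, each incrementing perThread times; returns the final count.
    static int raceTwice(int perThread) {
        AtomicCounter c = new AtomicCounter();
        Runnable task = () -> { for (int i = 0; i < perThread; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        try { t1.join(); t2.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(raceTwice(100_000)); // always 200000 with synchronized
    }
}
```

Delete the synchronized keyword and the result is usually less than 200000, because interleaved read-modify-write cycles overwrite each other's updates.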
Visibility
The Java memory model relies on main memory as the transfer medium: after a variable is modified, the new value is synchronized back to main memory, and before a variable is read, its value is refreshed from main memory.
The volatile keyword in Java provides the ability to synchronize modified variables to main memory immediately after they are modified, and to flush variables from main memory each time they are used. Therefore, volatile can be used to ensure visibility of variables in multithreaded operations.
In addition to volatile, the Java keywords synchronized and final also provide visibility; they are simply implemented differently, which is not expanded on here.
Ordering
In Java, synchronized and volatile can be used to ensure order between multiple threads. Implementation methods are different:
The volatile keyword forbids instruction reordering; the synchronized keyword guarantees that only one thread at a time can execute the protected code.
Ok, so this is a brief introduction to the keywords that can be used to solve atomicity, visibility, and order in Java concurrent programming. As readers may have noticed, the synchronized keyword seems to be all-purpose, satisfying all three of these attributes at once, which is why so many people abuse synchronized.
Synchronized, however, can hurt performance; although the compiler provides many lock-optimization techniques, overuse is still not recommended.
How does the JMM address issues such as visibility?
Implementations inside the JMM typically rely on so-called memory barriers, which provide memory-visibility guarantees by disallowing certain reorderings; this is how the various happens-before rules are implemented.
JMM can be understood from four dimensions:
1. From the JVM runtime perspective, JVM memory can be divided into the JVM stack, the native method stack, the PC counter, the method area, and the heap; the first three areas are thread-private, while the last two are shared by all threads.
2. From the perspective of memory function, JVM memory can be divided into heap memory, non-heap memory, and other. Heap memory corresponds to the heap area above; non-heap memory corresponds to the JVM stack, native method stack, PC counter, and method area above; "other" corresponds to direct memory.
3. From the thread-execution perspective, JVM memory can be divided into main memory and per-thread working memory. The Java memory model specifies that all variables are stored in main memory; each thread's working memory holds copies of the variables that thread uses, and all operations on variables (reads, assignments, etc.) must be performed in working memory rather than directly on main memory.
4. From the garbage collection perspective, the heap in the JVM = new generation + old generation. The new generation mainly stores newly created and short-lived objects; new generation = Eden + S1 + S2. The old generation stores long-lived objects.
29 Injection attacks in Java application development
1. SQL injection
Select * from use_info where username = "input_usr_name" and password = "" or ""=""
2. Operating system command injection
3. XML injection attacks
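The effect of the SQL injection example above can be reproduced with plain string concatenation; the table and column names follow the example, the username "alice" is invented, and the PreparedStatement alternative appears only as comments since it needs a live JDBC connection:

```java
public class InjectionDemo {
    // Naive: splices untrusted input directly into the SQL text.
    static String buildNaiveQuery(String password) {
        return "select * from use_info where username = \"alice\""
             + " and password = \"" + password + "\"";
    }

    public static void main(String[] args) {
        String attackerPassword = "\" or \"\"=\"";  // attacker-supplied "password"
        String naive = buildNaiveQuery(attackerPassword);
        // The WHERE clause has become: password = "" or ""="" -- always true,
        // so the check is bypassed without knowing any real password.
        System.out.println(naive);

        // The safe form binds input as data, never as SQL text:
        // PreparedStatement ps = conn.prepareStatement(
        //     "select * from use_info where username = ? and password = ?");
        // ps.setString(1, "alice");
        // ps.setString(2, attackerPassword);  // treated as a plain literal string
    }
}
```

With placeholders, the driver sends the SQL structure and the parameter values separately, so the attacker's quotes can never change the statement's shape.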
Java APIs and tools form the foundation of Java security
First, the runtime security mechanism.
Second, Java provides a security framework API, which is the foundation for building applications such as secure communication. For example: encryption, decryption API.
Authorization and authentication API.
Secure-communication-related libraries, such as standard implementations of the basic HTTPS communication protocols, e.g. TLS 1.3, or implementations of auxiliary protocols such as certificate revocation status checking (OCSP). Note that the internal implementation of this part of the API is vendor-specific, and different JDK vendors often ship their own customized cryptographic algorithm implementations.
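The cryptographic part of the security framework API is easy to try; a small sketch using JCA's MessageDigest (SHA-256, which every conforming JRE must provide; which provider actually backs it is vendor-specific, as noted above):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestDemo {
    // Hashes the input with SHA-256 and returns the lowercase hex digest.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256"); // JCA provider lookup
            byte[] hash = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : hash) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory in every JRE
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
    }
}
```

The same getInstance pattern applies across the framework (Cipher, Signature, KeyFactory, ...), with the provider architecture deciding which implementation serves the request.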
Third, there are various security tools integrated with the JDK, such as:
Keytool, a powerful tool for managing keys, certificates, etc. that are essential for security scenarios, and for managing keystore files used by Java programs.
Jarsigner, used to sign or validate JAR files.
30 Talk about transaction isolation levels supported by MySQL and the principles and application scenarios of pessimistic and optimistic locking
The so-called Isolation Level is defined to guarantee the correctness of concurrent reads and writes in database transactions. It is not a concept unique to MySQL; it derives from the SQL-92 standard developed by ANSI/ISO. Each relational database provides its own isolation-level implementation, and although locks are commonly the unit of implementation, actual implementations vary widely. Take the most common MySQL InnoDB engine as an example: it is a composite implementation based on MVCC (Multi-Version Concurrency Control) and locks. From low to high, MySQL supports four transaction isolation levels:
Read Uncommitted is when a transaction can see changes that have not been committed by another transaction, which is the lowest level of isolation and allows dirty reads to occur.
Read Committed: a transaction can only see data already committed by other transactions, ensuring that intermediate states are not visible, so dirty reads do not occur. Read committed is still a relatively low isolation level; it does not guarantee that reading the same data again returns the same result. That is, other transactions may modify the data concurrently, so non-repeatable reads and phantom reads can occur.
Repeatable Read is the default isolation level of the MySQL InnoDB engine. Unlike some other database implementations, at the repeatable-read level you can roughly consider that MySQL does not exhibit phantom reads.
Serializable means concurrent transactions are serialized with respect to each other: usually reads must acquire shared read locks and updates must acquire exclusive write locks, and if the SQL uses a WHERE clause, range locks are acquired as well
(MySQL implements gap locks, which are also used by default at the repeatable-read level). This is the highest isolation level. Pessimistic locking and optimistic locking are not concepts unique to MySQL or databases; they are basic concepts in concurrent programming. The main difference is that, when handling shared data, a "pessimistic lock" assumes data conflicts are likely, while an "optimistic lock" assumes most operations will not conflict, and each decides accordingly whether to take exclusive measures.
In MySQL application development, pessimistic locking generally uses statements like SELECT ... FOR UPDATE, which lock the data to prevent other transactions from accidentally modifying it. Optimistic locking is similar to AtomicFieldUpdater in Java: it also relies on a CAS mechanism and does not lock the data; instead, it compares a timestamp or version number to determine whether the data is still the version the optimistic operation expects.
In my opinion, the MVCC mentioned earlier is essentially an optimistic locking mechanism, while exclusive read-write locks, two-phase locking, and so on are pessimistic implementations. As for application scenarios, imagine a simplified system for querying and purchasing train tickets: many people may be searching at the same time, and although a specific seat can be sold to only one person, there may be many remaining tickets and no way to predict who will buy, so optimistic locking suits this situation better.
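The optimistic idea, read the current state, attempt the update only if nothing changed, retry on conflict, is exactly what Java's atomic classes do with compareAndSet; a ticket-counter sketch of the train-ticket scenario (all class and method names are invented for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticTicket {
    private final AtomicInteger remaining;

    OptimisticTicket(int tickets) { remaining = new AtomicInteger(tickets); }

    // Optimistic: read, check, compareAndSet; retry if another buyer won the race.
    boolean buyOne() {
        while (true) {
            int current = remaining.get();      // "read the version"
            if (current == 0) return false;     // sold out
            if (remaining.compareAndSet(current, current - 1)) {
                return true;                    // CAS succeeded, no lock ever taken
            }
            // CAS failed: a concurrent buyer changed the count; loop and retry
        }
    }

    int remaining() { return remaining.get(); }

    // Two concurrent buyers each attempt attemptsEach purchases; returns leftovers.
    static int raceBuyers(int tickets, int attemptsEach) {
        OptimisticTicket t = new OptimisticTicket(tickets);
        Runnable buyer = () -> { for (int i = 0; i < attemptsEach; i++) t.buyOne(); };
        Thread a = new Thread(buyer), b = new Thread(buyer);
        a.start(); b.start();
        try { a.join(); b.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return t.remaining();
    }

    public static void main(String[] args) {
        System.out.println(raceBuyers(10, 7)); // 0: exactly 10 of the 14 attempts succeed
    }
}
```

A pessimistic version would lock around the whole read-check-decrement; under mostly-read traffic like ticket searches, the CAS approach avoids that serialization and only pays a retry on actual conflicts.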