preface

Whether you’re working on Javaee or Android development, the basics of the JDK are especially important. We use ArrayList, HashMap, etc., a lot in our code, but we rarely think about why we use it or what we need to pay attention to when we use it. You might even go to an interview and be embarrassed when someone asks you how HashMap works and only knows put and get.

So in order to develop higher quality programs, write better code, or need to take a good look at some key JDK source code. This article mainly on the JDK for some important knowledge of combing and sorting, easy to learn and review.

Basic knowledge of

Underlying data types

A variable is an application for memory to store values. That is, when creating variables, you need to allocate space in memory. The memory management system allocates storage space for variables based on their types. The allocated space can only be used to store data of this type

type position The default value
byte 8 (1 byte) 0
short 16 (2 bytes) 0
int 32 (4 bytes) 0
long 64 bytes (8) 0L
float 32 (4 bytes) 0.0 f
double 64 bytes (8) 0.0 d
boolean 1 false
char 16-bit Unicode characters “”

Equal HashCode ==

= = Memory address comparison
equal Object Default memory address comparison, usually need to be overwritten
hashcode Object is used to hash collections. Object is a memory address by default and is not set unless used for hash collections.

The general convention of the hashCode method, which states that equal objects must have equal hash codes. When the equals method is overridden, it is often necessary to override the hashCode method. But hashCode equals, not necessarily equals ()

The difference between String, StringBuffer and StringBuilder.

The Java platform provides two types of strings: String and StringBuffer/StringBuilder, which can store and manipulate strings. String is read-only, which means that the contents of the String referenced by String cannot be changed. The StringBuffer and StringBulder classes represent string objects that can be modified directly. Introduced in JDK1.5, StringBuilder is exactly the same method as StringBuffer, except that it is used in a single-threaded environment and is slightly more efficient than StringBuffer because all aspects of it are not modified by synchronized.

Java four kinds of references, weak and weak virtual, used in the scene.

There were only strong references before JDK1.2, the other references were introduced after.

Strong Reference: Object obj = new Object(); . As long as a strong reference exists, it must not be collected during GC.

Soft references are used to describe objects that are useful but not necessary. When the heap is about to get OOM (Out Of Memory), the heap will reclaim the Memory space pointed to by the Soft Reference. If the heap is still Out Of Memory, the heap will throw OOM. It is commonly used to implement memory sensitive caching. After the real object is called with the finalizable and Finalize () methods and memory has been cleaned up, the SoftReference Object is added to its ReferenceQueue if it still exists. The Soft Reference and Weak Reference get methods will return NULL only after the previous steps have been completed

Weak references The Weak Reference space must be reclaimed when GC occurs. The time when the soft reference is enqueued is the same

A Phantom Reference, also known as a Phantom Reference, does not affect the life cycle of an object and cannot be used to obtain an object instance. It is only used to receive a system notification when GC occurs. When the Finalize method of an object has been called, the ghost reference of the object will be queued. Check the contents of the queue to see if an object is ready to be reclaimed. Unlike soft and weak references, virtual references are queued when memory is not cleared. A virtual reference must be passed to the reference queue, but nothing else

Java Collections Framework





Java collection class structure diagram

Collection is a highly abstract interface for lists, sets and other sets. It contains the basic operations of these sets. It is mainly divided into two parts: List and Set.

The List interface typically represents a List (array, queue, LinkedList, stack, etc.) whose elements can be repeated. The common implementation classes are ArrayList and LinkedList, and the less common Vector. In addition, the LinkedList implements the Queue interface, so it can also be used as a Queue.

The Set interface typically represents a collection whose elements are not allowed to be duplicated (guaranteed by hashcode and equals). Common implementation classes include HashSet and TreeSet. HashSet is implemented through a HashMap in a Map. TreeSet is implemented through TreeMap in the Map. In addition, TreeSet implements the SortedSet interface, so it is an ordered collection (the elements in the collection implement the Comparable interface and override the Compartor function). Abstract classes AbstractCollection, AbstractList, and AbstractSet implement the Collection, List, and Set interfaces, respectively. These are many of the adapter design patterns used in the Java Collections framework. Implement some or all of the methods in the interface in an abstract class, so that some of the following classes can simply inherit the abstract class directly and implement their own methods instead of implementing all of the abstract methods in the interface.

Map is a mapping interface in which each element is a key-value pair. AbstractMap also implements most of the functions in Map interface through adapter pattern. TreeMap, HashMap, WeakHashMap and other implementation classes are implemented by inherits AbstractMap. In addition, HashTable, which is not commonly used, directly implements Map interface. It and Vector are set classes introduced in JDK1.0.

Iterator is an Iterator that iterates over a Collection. Each implementation of a Collection implements the Iterator () function, which returns an Iterator object that iterates over the Collection. ListIterator is dedicated to traversing lists. Enumeration, which was introduced in JDK1.0, does the same but less well than Iterator. It can only be used in hashtables, vectors, and stacks.

Arrays and Collections are utility classes for manipulating Arrays and Collections. For example, ArrayList and Vector use the array.copyof () method extensively. There are many static methods in Collections that can return synchronized, thread-safe, versions of collection classes. Of course, for thread-safe combination classes, Concurrent is preferred and the corresponding collection classes under the package are issued.

Collection List Set Map difference

interface Whether to order Allow elements to repeat
collection no is
List is is
AbstractSet no no
HashSet no no
TreeSet Yes (sort by binary tree) no
AbstractMap no Key-value is used to map and store data. The key must be unique and the value can be repeated
HashMap no Key-value is used to map and store data. The key must be unique and the value can be repeated
TreeMap Yes (sort by binary tree) To use key-value to map and store data, the key must be unique, value

Common set analysis

A collection of The main algorithm Source code analysis
ArrayList Array-based List that encapsulates dynamically growing Object[] arrays Huangjunbin.com/2016/11/14/…
Stack Is a subclass of Vector, stack structure (last in first out) Huangjunbin.com/2016/11/14/…
LinkedList Implement List, Deque; Implement List, can carry out queue operation, can access collection elements randomly through the index; Implement a Deque, which can also be used as a double-endian queue or as a stack Huangjunbin.com/2016/11/14/…
HashMap Hash – based Map interface implementation, the use of sequential storage and chain storage structure Huangjunbin.com/2016/11/24/…
LinkedHashMap LinkedHashMap is a subclass of HashMap. It has the same storage structure as HashMap, but it adds the head node of a bidirectional linked list, linking all nodes put to LinkedHashMap into a bidirectional circular list, so it retains the insertion order. The output order of nodes can be the same as the input order Github.com/francistao/…
TreeMap TreeMap is an implementation of the red-black tree algorithm and supports sorting Blog.csdn.net/chenssy/art…

concurrent

Lists

  • ArrayList – Based on generic arrays
  • LinkedList – Not recommended
  • Vector — Deprecated

CopyOnWriteArrayList – rarely updated, often used for traversal

Queues / deques

  • ArrayDeque – Based on generic arrays
  • Stack — deprecated
  • PriorityQueue – The contents of the read operation are sorted
  • ArrayBlockingQueue – A blocking queue with boundaries
  • ConcurrentLinkedDeque/ConcurrentLinkedQueue — Unbounded linked list queue (CAS)
  • DelayQueue – Queues with elements with delays
  • LinkedBlockingDeque/LinkedBlockingQueue – Linked list queue (with lock), can be set to have borders or not
  • LinkedTransferQueue – LinkedTransferQueuetransferW/O storage
  • PriorityBlockingQueue – concurrent PriorityQueue
  • SynchronousQueue – Uses the Queue interface for sanorecovery

Maps

  • A HashMap, general Map
  • EnumMap — The key uses enum
  • Hashtable — Deprecated
  • IdentityHashMap — the key uses == for comparison
  • LinkedHashMap — Maintains the insertion order
  • TreeMap — The keys are sorted
  • WeakHashMap — for cache (cache)
  • ConcurrentHashMap — Generic concurrent Map
  • ConcurrentSkipListMap – sorted concurrent Map

Sets

  • HashSet, general set
  • Enumsets – enum Set
  • BitSet — A Set of bits or dense integers
  • LinkedHashSet — Maintains insertion order
  • TreeSet, sorting the Set
  • ConcurrentSkipListSet — Sort concurrent sets
  • CopyOnWriteArraySet – updates almost nothing, usually just traversal

conclusion

The choice of the Set

  1. Hashsets always perform better than TreeSet (especially for the most common operations like adding and querying elements) because TreeSet requires additional red-black tree algorithms to maintain the order of collection elements. You should use TreeSet only when you need a Set that keeps sorting, otherwise you should use HashSet
  2. LinkedHashSet is slightly slower than HashSet for normal insert and delete operations due to the overhead of maintaining a linked list. However, traversing a LinkedHashSet is faster thanks to linked lists
  3. EnumSet is the best performing of all Set implementation classes, but it can only hold enumeration values of the same enumeration class as Set elements
  4. Hashsets, TreeSet, and Enumsets are “thread-unsafe” and can usually be “wrapped” by the synchronizedSortedSet method of the Collections tool class. SortedSet s = Collections.synchronizedSortedSet(new TreeSet(…) );

The List to select

  1. The List provided by Java is a “linear table interface”, and ArrayList and LinkedList are two typical implementations of linear tables
  2. Queue represents a Queue, and Deque represents a double-ended Queue (which can be used as either a Queue or a stack)
  3. Because arrays hold all their elements in a contiguous chunk of memory, they perform best when accessed randomly. All collections with arrays as the underlying implementation perform best when randomly accessed.
  4. Internal linked list as the underlying implementation of the collection of insert and delete operations have good performance
  5. When iterating, collections implemented with linked lists as the bottom layer perform better than collections implemented with arrays as the bottom layer
  6. Use LinkedList when you want a lot of inserts and deletions. ArrayList is used when fast random access is needed;

The choice of the Map

  1. HashMap and Hashtable are about equally efficient because their implementation mechanism is almost identical. But HashMap is generally a bit faster than Hashtable because Hashtable requires additional thread synchronization control
  2. TreeMap is usually slower than HashMap or Hashtable (especially when inserting or deleting key-value pairs) because TreeMap uses red-black trees to manage key-value pairs
  3. One advantage of using TreeMap is that the key-value pairs in TreeMap are always in an ordered state and do not require special sorting operations
  4. HahMap uses hashCode for lookup, while TreeMap holds some sort of ordered state
  5. Therefore, for insert, delete, and locate operations, HashMap is the best choice; If you want to sort by nature or custom, choose TreeMap

reference

Java Learning Collections

Overview of Java collections

JavaSE (Java based)