Let’s move on to article 16 of the JDK series

PriorityQueue.java

This is the third article in the Java Collections Framework. The first two parses ArrayList and LinkedList, which are implemented based on dynamic arrays and linked lists, respectively. PriorityQueue is a Java PriorityQueue, which is implemented based on the heap. The concept of heap will be introduced later.

An overview of the

PriorityQueue is an unbounded PriorityQueue based on a heap implementation. The order of the elements in the priority queue is based on their natural order or on the comparators provided in the constructor. Null elements are not allowed and non-comparable elements are not allowed to be inserted (Comparable is not implemented). It is not thread-safe, and the JDK also provides a thread-safe priority queue, PriorityBlockingQueue.

Let’s focus on the heap-based priority queue. So what is a queue first? What is a heap?

The queue

Queues are easy to understand. They are a special kind of linear list. For example, the canteen line is a queue, the first line of the first meal, then the students in the queue at the end of the line, hit the meal of the students from the head to leave, this is a typical first-in, first-out (FIFO) queue. Queues generally provide two basic operations: joining the queue at the end of the queue and leaving the queue at the head. The superclass interface of a Queue in Java is Queue. Let’s look at the UML diagram of Queue, which gives us the basic methods:

  • add(E) :Add elements at the end of the queue
  • offer(E) :Add elements at the end of the queue. Behaves the same as add() on a capacity constrained queue.
  • remove() :Delete and return the queue header, or throw an exception if the queue is empty
  • poll() :Delete and return the queue header, or null if the queue is empty
  • Element () :Return the queue header, but not delete it, and throw an exception if the queue is empty
  • peek() :Returns the queue header, but does not delete it, or null if the queue is empty

Basically, it’s a subdivision of the team exit and team entry operations. PriorityQueue is a PriorityQueue that sorts elements in natural order or by the provided Comparator. Heapsort is used here, so the PriorityQueue is implemented heap-based. If you understand the concept of a heap, you can skip the next section. If you don’t know what a heap is, read the next section carefully, otherwise you won’t be able to understand the source code for PriorityQueue.

The heap

Heap is actually a special binary tree, which has the following two characteristics:

  • The heap is a complete binary tree
  • Each node in the heap must have a value less than or equal to (or greater than or equal to) the value of its children

A binary tree of height K is a complete binary tree if all the levels from 0 to k-1 are full, and all the children of the last level are on the left. A complete binary tree with an array can easily obtain its two child nodes according to the subscript of the parent node. Here is a complete binary tree:

A heap is a complete binary tree. Those with the smallest element on top are called small top heaps, and those with the largest element on top are called large top heaps. PriorityQueue is the small top heap. In contrast to the heap structure above, for any parent node, take node 5 with subscript 4 as an example, and its two child nodes have subscripts of 2*4+1 and 2*4+2 respectively. For complete binary trees and heaps, keep in mind the following conclusions, which will be used later in the source code analysis:

  • Nodes that have no children are called leaf nodes
  • The subscript fornThe subscripts of the two left and right child nodes of the parent node of2*n+12*n+2

This is the advantage of using arrays to build the heap, which allows you to quickly build heap structures based on subscripts. So much for the heap, remember that the PriorityQueue is a queue based on the heap implementation, and the heap is a full binary tree. The following is an in-depth analysis of the heap operation according to the source code of PriorityQueue.

The source code parsing

Class declaration

public class PriorityQueue<E> extends AbstractQueue<E>
    implements java.io.Serializable {}Copy the code

Member variables

private static final long serialVersionUID = -7720805057305804111L;
private static final int DEFAULT_INITIAL_CAPACITY = 11; // Default initial capacity
transient Object[] queue; // An array of queue elements
private int size = 0; // Number of queue elements
private final Comparator<? super E> comparator; 
transient int modCount = 0; // fail-fast
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; // Maximum capacity
Copy the code
  • PriorityQueue uses arraysqueueThe default initial capacity is 11 and the maximum capacity isInteger.MAX_VALUE - 8.
  • comparatorIf null, the elements are sorted in their natural order.
  • modCountUsed to provide the fail-fast mechanism.

The constructor

PriorityQueue has seven constructors that can be divided into two classes, providing and not providing initial elements. Let’s start with a constructor that does not provide an initial element:

/* * Create a priority queue with an initial size of 11, with elements in the natural order */
public PriorityQueue(a) {
    this(DEFAULT_INITIAL_CAPACITY, null);
}

/* * Creates a priority queue with an initial capacity, with elements in natural order */
public PriorityQueue(int initialCapacity) {
    this(initialCapacity, null);
}

/* * Create a priority queue with an initial size of 11, and the elements are ordered by the given comparator */
public PriorityQueue(Comparator<? super E> comparator) {
    this(DEFAULT_INITIAL_CAPACITY, comparator);
}

/* * Creates a priority queue with an initial capacity, and elements are ordered by the given comparator */
public PriorityQueue(int initialCapacity,
                        Comparator<? super E> comparator) {
    if (initialCapacity < 1)
        throw new IllegalArgumentException();
    this.queue = new Object[initialCapacity];
    this.comparator = comparator;
}
Copy the code

This class of constructors is very simple. We simply assign values to queue and comparator. The constructor for a given initial element is not so simple, because a given initial set does not necessarily satisfy the structure of the heap, we need to construct a heap of it, a process called heapization.

PriorityQueue can be heaped directly from SortedSet and PriorityQueue, and since the initial collection is already ordered, no heapization is required. If the constructor argument is an arbitrary Collection, then heapization may be required.

public PriorityQueue(Collection<? extends E> c) {
    if (c instanceofSortedSet<? >) {// Use the SortedSet comparator directly
        SortedSet<? extends E> ss = (SortedSet<? extends E>) c;
        this.comparator = (Comparator<? super E>) ss.comparator();
        initElementsFromCollection(ss);
    }
    else if (c instanceofPriorityQueue<? >) {// Use PriorityQueue's comparator directly
        PriorityQueue<? extends E> pq = (PriorityQueue<? extends E>) c;
        this.comparator = (Comparator<? super E>) pq.comparator();
        initFromPriorityQueue(pq);
    }
    else {
        this.comparator = null;
        initFromCollection(c); // Need to heap}}Copy the code

Let’s look at the specific process of heapification:

private void initFromCollection(Collection<? extends E> c) {
    initElementsFromCollection(c); // Assign c copy directly to the queue
    heapify(); / / heap
}
Copy the code
private void heapify(a) {
    for (int i = (size >>> 1) - 1; i >= 0; i--)
        siftDown(i, (E) queue[i]); // Top down heap
}
Copy the code

The logic of heapification is short, but rich. The value of the heap can be divided into two categories: shiftUp() and top down. The value is used here as shiftDown. Can you see from the above code which node is heaped from? Instead of starting the heap from the last node, it starts from the last non-leaf node. Remember what a leaf node is? A node with no children is a leaf node. So heaping all the non-leaf nodes is enough to process all the nodes. What is the subscript of the last non-leaf node? If you can’t figure out the diagram of the heap above, the answer is size/ 2-1. The source code uses unsigned shift operation instead of division.

Let’s take the logic from shiftDown() :

/* * heap from top to bottom to ensure that x is less than or equal to a child node or that x is a leaf node */
private void siftDown(int k, E x) {
    if(comparator ! =null)
        siftDownUsingComparator(k, x);
    else
        siftDownComparable(k, x);
}
Copy the code

X is the element to insert and k is the position to fill. Use different methods depending on whether the comparator is for air conditioner. For example, if the comparator is not null:

private void siftDownComparable(int k, E x) {
    Comparable<? super E> key = (Comparable<? super E>)x;
    int half = size >>> 1;        // loop while a non-leaf
    while (k < half) { // The number of non-leaf nodes in the heap is always less than half. When k is a leaf, you just swap
        int child = (k << 1) + 1; // Left child node
        Object c = queue[child];
        int right = child + 1; // Right child node
        if (right < size &&
            ((Comparable<? super E>) c).compareTo((E) queue[right]) > 0)
            c = queue[child = right]; // Take the smaller value in the child node
        if (key.compareTo((E) c) <= 0) // Compare x and child nodes
            break; // x is larger than the child node
        queue[k] = c; // If x is smaller than the child, swap with the child
        k = child; // k is equal to child
    }
    queue[k] = key;
}
Copy the code

The logic is simple. PriorityQueue is a small top heap whose parent is always less than or equal to its children. For each leaf node, and it compares two left and right child nodes, and if the parent node is larger than two child nodes, will the parent node sinking, sinking compared before proceeding and child nodes, until the parent node is smaller than two child nodes, or the parent node is a leaf node, no children. So the cycle repeats, and the top-down heap is complete.

methods

Having looked at the constructors, let’s look at the methods provided by PriorityQueue. Since it is a queue, there must be queue in and queue out operations. Let’s start with add().

add(E e)

public boolean add(E e) {
    return offer(e);
}

public boolean offer(E e) {
    if (e == null)
        throw new NullPointerException();
    modCount++;
    int i = size;
    if (i >= queue.length)
        grow(i + 1); // Automatic capacity expansion
    size = i + 1;
    if (i == 0)
        queue[0] = e; // The first element can be assigned directly
    else
        siftUp(i, e); // Subsequent elements should be heaped to ensure heap characteristics
    return true;
}
Copy the code

The add() method calls the offer() method, which adds an element to the end of the queue. The offer() process can be divided into two steps: automatic expansion and heap.

Priority queues also support automatic expansion, but the expansion logic is different from that of ArrayList, which directly expands to 1.5 times its original size. PriorityQueue behaves differently depending on the current queue size.

private void grow(int minCapacity) {
    int oldCapacity = queue.length;
    // Double size if small; else grow by 50%
    int newCapacity = oldCapacity + ((oldCapacity < 64)? (oldCapacity +2) :
                                     (oldCapacity >> 1));
    // overflow-conscious code
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    queue = Arrays.copyOf(queue, newCapacity);
}
Copy the code
  • If the size of the original queue is smaller than 64, it is doubled and then +2
  • If the size of the original queue is greater than 64, the value is increased by 50%

The second step is heap. Why reheap elements at the end of the queue? Take a look at the graph below:

I have a heap on the left, and I’m going to add an element 4 to the end of the queue, if I add it directly to the end of the queue, is it still a heap? Obviously not, because 4 is smaller than 5, but it’s below 5. So that’s where heap comes in. Can we let it go from shiftDown to top heap? Obviously not, because adding a node to the end of the queue must be a leaf node, which is already at the bottom of the queue and has no children. And that’s another way of heaping, bottom-up heaping. If you compare 4 to its parent, 4 is less than 5, and you swap it with the parent, then 4 is at subscript 2. Comparing with the parent node, it is found that 4 is larger than 1. At this point, 4 finds its proper place in the heap. Corresponding to the shiftUp() method in the source code:

private void siftUp(int k, E x) {
    if(comparator ! =null)
        siftUpUsingComparator(k, x);
    else
        siftUpComparable(k, x);
}
    
private void siftUpUsingComparator(int k, E x) {
    while (k > 0) {
        int parent = (k - 1) > > >1; // Find the parent node of position k
        Object e = queue[parent];
        if (comparator.compare(x, (E) e) >= 0) // Compare x with the parent value
            break; // x is larger than the parent node
        queue[k] = e; // If x is smaller than the parent, swap elements
        k = parent; // k is equal to parent
    }
    queue[k] = x;
}
Copy the code

The parent node at k can be found by subscript k, which is also the result of the previous introduction to the heap. So what is the time complexity of the insert? It depends on the height of the heap, so the best time is order one, where you don’t swap elements, and the worst time is order log n, because the height of the heap is log n, and the worst case is all the way to the top of the heap. The average time is order log n.

After joining the team, let’s take a look at the exit.

poll()

Poll () is an out-queue operation, which removes the queue head element. If you think about it, a complete binary tree, you remove the top of the heap, it’s not a complete binary tree, it can’t be heaped. The source code is handled like this, after removing the team head element, temporarily move the team tail element to the team head, so that it is a complete binary tree, you can heap. The following diagram is easier to understand:

The value of the heap should obviously be shiftDown().

public E poll(a) { // Remove the queue header
    if (size == 0)
        return null;
    int s = --size;
    modCount++;
    E result = (E) queue[0];
    E x = (E) queue[s]; // Insert the tail element into the head of the queue and heap it from top to bottom
    queue[s] = null;
    if(s ! =0)
        siftDown(0, x);
    return result;
}
Copy the code

In addition to removing the queue head, PriorityQueue also supports removing nodes at any position, via the remove() method.

remove()

private E removeAt(int i) {
    // assert i >= 0 && i < size;
    modCount++;
    int s = --size;
    if (s == i) // removed last element
        queue[i] = null; // Delete the end of the queue
    else { // Delete other locations, which need to be heaped again to preserve heap characteristics
        E moved = (E) queue[s]; // Moved is the last element of the team
        queue[s] = null;
        siftDown(i, moved); // Insert the last element of the queue into position I and heap it from top to bottom
        if (queue[i] == moved) {
            siftUp(i, moved); // Moved not moved down, still at I, now need to heap bottom up again to ensure the heap is correct
            if(queue[i] ! = moved)returnmoved; }}return null;
}
Copy the code

If it is to delete the end of the team, directly delete all can be. But if you delete one of the nodes in the middle, you create a hole in the heap instead of a complete binary tree. In fact, the processing method is the same as poll. The last node of the team is temporarily filled to the deleted position to form a complete binary tree and then heap.

There are some inconsistencies in the heaping-poll process here. The first move is shiftDown(), which is heaped from top to bottom. The shiftDown() should be finished and queue[I] == moved. If not, node I has swapped down and it has found its position. But if they are equal, then node I is not swapped down, that is, the value of node I is smaller than any of its children. But that doesn’t mean it’s necessarily bigger than its parent. Therefore, this situation needs to be heaped from the bottom to ensure that it can fully comply with the characteristics of the heap.

conclusion

All this talk about PriorityQueue is really all about the heap. If you’re familiar with heaps, the source code for PriorityQueue is easy to understand. Of course, it doesn’t matter if you are not familiar with the source code just to learn the basic concept of the heap. Finally, a quick summary of priority queues:

  • PriorityQueue is heap-based. A heap is a special complete binary tree in which each node must have a value less than or equal to (or greater than or equal to) the value of each node in its subtree
  • PriorityQueue has both unqueued and enqueued operation timesO(log(n)), depends only on the height of the heap
  • PriorityQueue has an initial capacity of 11 and supports dynamic capacity expansion. If the capacity is less than 64, double the capacity. If the value is greater than 64, expand the capacity by 0.5 times
  • PriorityQueue does not allow null elements, does not allow non-comparable elements
  • PriorityQueue is not thread-safe, PriorityBlockingQueue is thread-safe

So much for PriorityQueue. The next article should be about Set.

Article first published wechat public account: Bingxin said, focus on Java, Android original knowledge sharing, LeetCode problem solving.

More JDK source code analysis, scan code to pay attention to me!