How can Java lists be iterated more efficiently?

Java has three ways of traversing:

  • Plain for loop traversal (for)

  • Enhanced for loop traversal (foreach)

  • Iterator loop (iterator)

These three traversal methods are different.

Here’s a program to compare how long different lists take to traverse in different ways:

import java.util.ArrayList; import java.util.Iterator; import java.util.LinkedList; import java.util.List; /** * @author Zhangjianhua * @date 2021-05-09 16:08 * @description list for loop */ @suppressWarnings ("ALL") public class Main  { public static void main(String[] args) { testForeachTransform(); long arrayListForTime = arrayListForTime(); System.out.println(" test ArrayList for traversal time: "+ arrayListForTime); long arrayListForeachTime = arrayListForeachTime(); System.out.println(" arrayListForeachTime + arrayListForeachTime "); long arrayListIteratorTime = arrayListIteratorTime(); System.out.println(" Test ArrayList iterator elapsed time: "+ arrayListIteratorTime); long linkedListForTime = linkedListForTime(); System.out.println(" Test LinkedList by traversing time: "+ linkedListForTime); long linkedListForeachTime = linkedListForeachTime(); System.out.println(" + linkedListForeachTime "); long linkedListIteratorTime = linkedListIteratorTime(); System.out.println(" Test LinkedList traversal time through iterator: "+ linkedListIteratorTime); // According to the test result: // LinkedList is much faster to iterate through iterator than to iterate through for // In our application, we need to consider which implementation class to use for the List interface, // List implements RandomAccess interface to distinguish which implementation class // List implements RandomAccess interface to traverse data, // A List that does not implement RandomAccess iterator iterates through the data, Private static void testForeachTransform() {private static void testForeachTransform() { System.out.println(" Test what foreach converts to "); // For primitive arrays, foreach is converted to for because iterator is not implemented. // For lists, foreach is converted to iterator. Int [] Arrays = {1, 2, 3, 4, 5}; for (int i = 0; i < arrays.length; ++i) { } for (int array : arrays) { } List<Integer> list = new ArrayList<>(); list.add(1); list.add(2); list.add(3); list.add(4); list.add(5); for (int i = 0; i < list.size(); ++i) { } for (int element : Private static Long arrayListForTime() {list <Integer> arrayList = new ArrayList<>(); for (int i = 0; i < 100000; i++) { arrayList.add(i); } long startTime = system.currentTimemillis (); For (int I = 0; i < arrayList.size(); i++) { arrayList.get(i); } long endTime = system.currentTimemillis (); // Return endtime-startTime; } /** * private static long arrayListForeachTime() {List<Integer> arrayList = new ArrayList<>(); for (int i = 0; i < 100000; i++) { arrayList.add(i); } long startTime = system.currentTimemillis (); For (int array: arrayList) {} long endTime = system.currentTimemillis (); // Return endtime-startTime; } /** * private static long arrayListIteratorTime() {List<Integer> arrayList = new ArrayList<>(); for (int i = 0; i < 100000; i++) { arrayList.add(i); } long startTime = system.currentTimemillis (); Iterator<Integer> iterator = arraylist.iterator (); while (iterator.hasNext()) { iterator.next(); } long endTime = system.currentTimemillis (); // Return endtime-startTime; Private static long linkedListForTime() {List<Integer> LinkedList = new LinkedList<>(); for (int i = 0; i < 100000; i++) { linkedList.add(i); } long startTime = system.currentTimemillis (); For (int I = 0; i < linkedList.size(); i++) { linkedList.get(i); } long endTime = system.currentTimemillis (); // Return endtime-startTime; } /** * private static long linkedListForeachTime() {List<Integer> linkedList = new LinkedList<>(); for (int i = 0; i < 100000; i++) { linkedList.add(i); } long startTime = system.currentTimemillis (); For (int array: linkedList) {} long endTime = system.currentTimemillis (); // Return endtime-startTime; } /** * private static long linkedListIteratorTime() {List<Integer> linkedList = new LinkedList<>(); for (int i = 0; i < 100000; i++) { linkedList.add(i); } long startTime = system.currentTimemillis (); Iterator<Integer> iterator = linkedlist.iterator (); while (iterator.hasNext()) { iterator.next(); } long endTime = system.currentTimemillis (); // Return endtime-startTime; }}Copy the code

To see the results:

We can see that for a sequential ArrayList, ordinary for, enhanced for, and iterator loops are not very different, each costing 2ms to traverse 100,000 times.

However, for LinkedList, the difference between the normal for loop and the enhanced for loop and iterator loop is very large. The normal for loop takes 4390ms to traverse 100,000 times, while the enhanced for loop and iterator loop only takes 3ms and 1ms to traverse 100,000 times.

Why is that?

Because the sequential list is traversed in any of three ways, the time is order n. A chained list traversed by an ordinary for loop is O(n^2), but a chained list traversed by an enhanced for loop and an iterator loop is O(n), an order of magnitude different, so a chained list traversed by an ordinary for loop is slow.

First, an enhanced for loop for a list is the equivalent of an iterator loop.

How do you see that?

If you look at the compiled class file, you can decompile to see the list’s enhanced for loop converted into an iterator loop. It is important to note, however, that an enhanced for loop for an array of primitive type is converted to a normal for loop, not to an iterator loop, because an array of primitive type does not define an iterator loop, so it is converted to a normal for loop.

The testForeachTransform method is used to convert the list enhanced for loop to the primitive array enhanced for loop. The testForeachTransform method is used to convert the list enhanced for loop to the primitive array enhanced for loop. The testForeachTransform method is used to convert the list enhanced for loop to the primitive array enhanced for loop.

Because the enhanced for loop converts to an iterator loop for lists, and lists are divided into sequential and chain types, we analyze them from four aspects:

  • Ordinary for loops for sequential lists
  • An enhanced for loop for sequential lists
  • Generic for loops for chained lists
  • An enhanced for loop for a chained list

Why does a chained list loop with an iterator take an exponential order of magnitude less time than a regular for loop? We can find the answer by looking at the source code.

Here’s how to look at the source code, how to find the answer to traversal efficiency from the source code:

To prepare

Here is jdk1.8 as a test, find the test source from GitHub: github.com/zjhpure/sou…

Open with IDEA, as shown in the figure.

Ordinary for loops for sequential lists

Find the arrayListForTime method as shown below:

As can be seen from the code, the code between the start time and the end time is the time consumed by the for loop. Each iteration calls the GET method, so as long as we know how much time is consumed by the GET method, we can know how much time is consumed by the for loop. We first look at the source of the get method through the jump function of IDEA, but after the jump, an interface is displayed, as shown in the figure below:

How to do? How do I find the source code for the GET method?

You need to debug through the break point to find the source of the GET method. After running, the program will jump to the get method and stop. During debugging, execute Force Step Into and jump Into the source code of the GET method. Then you can see the source code of the GET method, as shown below:

The get method looks like this:

If you encounter the interface while debugging the source code of the get method, you can see the source code of the method by continuing to execute Force Step Into. There will be a lot of look at the source code behind the operation, will not be detailed to write how to find the source code method, also according to the above operation can be found.

Going back to the source code, there are two parts, executing the rangeCheck method first and then the elementData method. The rangeCheck method checks whether the input subscript exceeds the size of the array.

/** * Checks if the given index is in range. If not, throws an appropriate * runtime exception. This method does *not* check if the index is * negative: It is always used immediately prior to an array access, * which throws an ArrayIndexOutOfBoundsException if index is negative. */ private void rangeCheck(int index) { if (index  >= size) throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); }Copy the code

The elementData method of the second step is as follows:

// Positional Access Operations @SuppressWarnings("unchecked") E elementData(int index) { return (E) elementData[index];  }Copy the code

Here is the element directly from the elementData array, because it is a sequential list, the bottom is an array to save the elements, through idea click jump elementData can be seen that elementData is an array, as shown in the figure below:

Therefore, it can be seen that for the ordinary for loop of sequential list, the time complexity of each get method is O(1), and the time complexity of traversing n elements is O(n).

An enhanced for loop for sequential lists

For lists, the enhanced for loop converts to an iterator loop, which can be found in the Java class file compiled after the run. We first find the testForeachTransform method, as shown below:

Click run to see the compiled class file. We can see that the enhanced for loop is converted to a normal for loop for primitive arrays and the enhanced for loop to an iterator for lists, as shown below:

So the enhanced for loop for analyzing lists is essentially an iterator loop for analyzing lists, so let’s examine the iterator loop for sequential lists

We find the arrayListIteratorTime method as shown below:

Each iteration calls the next method, so knowing how much time the next method takes tells you how much time the iterator loop takes. Iterator (iterator) iterator (arrayList) iterator (arrayList) iterator (arrayList) iterator (arrayList) iterator (arrayList) iterator (arrayList) iterator (arrayList) iterator (arrayList)

    /**
     * Returns an iterator over the elements in this list in proper sequence.
     *
     * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
     *
     * @return an iterator over the elements in this list in proper sequence
     */
    public Iterator<E> iterator() {
        return new Itr();
    }
Copy the code

In iterator, create an Itr object. The Itr class is an inner class.

/** * An optimized version of AbstractList.Itr */ private class Itr implements Iterator<E> { int cursor; // index of next element to return int lastRet = -1; // index of last element returned; -1 if no such int expectedModCount = modCount; Itr() {} public boolean hasNext() { return cursor ! = size; } @SuppressWarnings("unchecked") public E next() { checkForComodification(); int i = cursor; if (i >= size) throw new NoSuchElementException(); Object[] elementData = ArrayList.this.elementData; if (i >= elementData.length) throw new ConcurrentModificationException(); cursor = i + 1; return (E) elementData[lastRet = i]; } public void remove() { if (lastRet < 0) throw new IllegalStateException(); checkForComodification(); try { ArrayList.this.remove(lastRet); cursor = lastRet; lastRet = -1; expectedModCount = modCount; } catch (IndexOutOfBoundsException ex) { throw new ConcurrentModificationException(); } } @Override @SuppressWarnings("unchecked") public void forEachRemaining(Consumer<? super E> consumer) { Objects.requireNonNull(consumer); final int size = ArrayList.this.size; int i = cursor; if (i >= size) { return; } final Object[] elementData = ArrayList.this.elementData; if (i >= elementData.length) { throw new ConcurrentModificationException(); } while (i ! = size && modCount == expectedModCount) { consumer.accept((E) elementData[i++]); } // update once at end of iteration to reduce heap write traffic cursor = i; lastRet = i - 1; checkForComodification(); } final void checkForComodification() { if (modCount ! = expectedModCount) throw new ConcurrentModificationException(); }}Copy the code

As you can see, the Itr class has a next method and a hasNext method. HasNext is called in the condition of the while loop. HasNext is used to determine whether the cursor is equal to the size of the list.

Next method (); next method ();

ArrayList. This. ElementData is to obtain a list of the array, and then the cursor = I + 1, is the cursor plus one, the last element from an array elementData, return element.

So in summary, for an enhanced for loop (iterator loop) with sequential lists, the time complexity of each next method is O(1) and the time complexity of traversing n elements is O(n).

Generic for loops for chained lists

Find the linkedListForTime method as shown below:

Reading the source code in the same way, we can see that each iteration calls the GET method, so if we know how long the GET method takes, we can know how long the for loop takes. Find the source of the get method as follows:

/** * Returns the element at the specified position in this list. * * @param index index of the element to return * @return the element at the specified position in this list * @throws IndexOutOfBoundsException {@inheritDoc} */ public E  get(int index) { checkElementIndex(index); return node(index).item; }Copy the code

Call the checkElementIndex method and return node(index).item. The checkElementIndex method looks like this:

private void checkElementIndex(int index) { if (! isElementIndex(index)) throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); }Copy the code

IsElementIndex (); isElementIndex ();

    /**
     * Tells if the argument is the index of an existing element.
     */
    private boolean isElementIndex(int index) {
        return index >= 0 && index < size;
    }
Copy the code

The rangeCheck method is the same as the rangeCheck method used to check the validity of an element’s subscript. Node (index).item = node(index).item

/** * Returns the (non-null) Node at the specified element index. */ Node<E> node(int index) { // assert isElementIndex(index); if (index < (size >> 1)) { Node<E> x = first; for (int i = 0; i < index; i++) x = x.next; return x; } else { Node<E> x = last; for (int i = size - 1; i > index; i--) x = x.prev; return x; }}Copy the code

Because it’s a linked list, the bottom layer holds the elements in a linked list. The whole process can be seen from the source of the node method, determines the subscript position is in the middle or in the middle on the left on the right, and if the middle on the left, then from the start node to search one by one, if it is in the middle on the right, then from the front end node to search one by one, so the chain list to find an element of time complexity is O (n).

So you can see that a chained list is much less efficient at finding an element than a sequential list, because it takes O(1) to find an element, and O(n) to find an element in a chained list.

Because there are n elements to iterate over, the time for a chained list to iterate over n elements is O(n^2), and the ordinary for loop for a chained list is slow, an order of magnitude slower than the ordinary for loop for a sequential list.

But the chain list to find an element is not every n times to find, according to the source here can be found, according to the index in the middle right or left to judge from the start node or from the end node forward, combined, looking for the time is 1,2,3,4… N / 2 n / 2… 4,3,2,1.

Total degree :(1+n/2)n/2=3/4n^2, three quarters of n squared, which is also an order of magnitude of n squared.

So it can be seen that for an ordinary for loop with a chained list, the time complexity of each get method is O(n), and the time complexity of traversing n elements is O(n^2).

An enhanced for loop for a chained list

For chained lists, the enhanced for loop is also converted to an iterator loop.

So the enhanced for loop for parsing lists is essentially the iterator loop for parsing lists. Let’s look at the iterator loop for chained lists

We find the linkedListIteratorTime method as shown below:

Each iteration calls the next method, so knowing how much time the next method takes tells you how much time the iterator loop takes. Iterator () : iterator () : iterator () : iterator () : iterator () : iterator () : iterator () : iterator () : iterator () : iterator () : iterator ();

    /**
     * Returns an iterator over the elements in this list (in proper
     * sequence).<p>
     *
     * This implementation merely returns a list iterator over the list.
     *
     * @return an iterator over the elements in this list (in proper sequence)
     */
    public Iterator<E> iterator() {
        return listIterator();
    }
Copy the code

In iterator method, call listIterator method, source code as follows:

    /**
     * {@inheritDoc}
     *
     * <p>This implementation returns {@code listIterator(0)}.
     *
     * @see #listIterator(int)
     */
    public ListIterator<E> listIterator() {
        return listIterator(0);
    }
Copy the code

ListIterator (0); listIterator(0); listIterator(0); listIterator(0); listIterator(0);

    /**
     * Returns a list-iterator of the elements in this list (in proper
     * sequence), starting at the specified position in the list.
     * Obeys the general contract of {@code List.listIterator(int)}.<p>
     *
     * The list-iterator is <i>fail-fast</i>: if the list is structurally
     * modified at any time after the Iterator is created, in any way except
     * through the list-iterator's own {@code remove} or {@code add}
     * methods, the list-iterator will throw a
     * {@code ConcurrentModificationException}.  Thus, in the face of
     * concurrent modification, the iterator fails quickly and cleanly, rather
     * than risking arbitrary, non-deterministic behavior at an undetermined
     * time in the future.
     *
     * @param index index of the first element to be returned from the
     *              list-iterator (by a call to {@code next})
     * @return a ListIterator of the elements in this list (in proper
     *         sequence), starting at the specified position in the list
     * @throws IndexOutOfBoundsException {@inheritDoc}
     * @see List#listIterator(int)
     */
    public ListIterator<E> listIterator(int index) {
        checkPositionIndex(index);
        return new ListItr(index);
    }
Copy the code

ListItr = ListItr; ListItr = ListItr; ListItr = ListItr; ListItr = ListItr;

private void checkPositionIndex(int index) { if (! isPositionIndex(index)) throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); }Copy the code

In checkPositionIndex method, call isPositionIndex method, source code as follows:

    /**
     * Tells if the argument is the index of a valid position for an
     * iterator or an add operation.
     */
    private boolean isPositionIndex(int index) {
        return index >= 0 && index <= size;
    }
Copy the code

IsPositionIndex = rangeCheck; isPositionIndex = rangeCheck; isPositionIndex = rangeCheck; ListItr class ListItr class ListItr class ListItr

private class ListItr implements ListIterator<E> { private Node<E> lastReturned; private Node<E> next; private int nextIndex; private int expectedModCount = modCount; ListItr(int index) { // assert isPositionIndex(index); next = (index == size) ? null : node(index); nextIndex = index; } public boolean hasNext() { return nextIndex < size; } public E next() { checkForComodification(); if (! hasNext()) throw new NoSuchElementException(); lastReturned = next; next = next.next; nextIndex++; return lastReturned.item; } public boolean hasPrevious() { return nextIndex > 0; } public E previous() { checkForComodification(); if (! hasPrevious()) throw new NoSuchElementException(); lastReturned = next = (next == null) ? last : next.prev; nextIndex--; return lastReturned.item; } public int nextIndex() { return nextIndex; } public int previousIndex() { return nextIndex - 1; } public void remove() { checkForComodification(); if (lastReturned == null) throw new IllegalStateException(); Node<E> lastNext = lastReturned.next; unlink(lastReturned); if (next == lastReturned) next = lastNext; else nextIndex--; lastReturned = null; expectedModCount++; } public void set(E e) { if (lastReturned == null) throw new IllegalStateException(); checkForComodification(); lastReturned.item = e; } public void add(E e) { checkForComodification(); lastReturned = null; if (next == null) linkLast(e); else linkBefore(e, next); nextIndex++; expectedModCount++; } public void forEachRemaining(Consumer<? super E> action) { Objects.requireNonNull(action); while (modCount == expectedModCount && nextIndex < size) { action.accept(next.item); lastReturned = next; next = next.next; nextIndex++; } checkForComodification(); } final void checkForComodification() { if (modCount ! = expectedModCount) throw new ConcurrentModificationException(); }}Copy the code

ListItr has a next method and a hasNext method. HasNext is called in the condition of the while loop. HasNext is used to determine whether the cursor nextIndex equals the length of the list. The while loop ends.

The next method is iterated back through the list, as shown in the following figure:

So the iterator iterates over n elements in O(n), which is an order of magnitude faster than the normal for loop over n elements in O(n^2).

Why is it suddenly an order of magnitude lower?

We carefully compared the source code can be seen above, in fact the iterator loop traverse just change the train of thought, for ordinary for loop chain list, each time to find an element should be from the beginning of the node list or the end node to the middle one by one to find, so look for an element, the time complexity is O (n). But let’s think about it, because we’re going to go through all the elements, can we not go back and find them every time? Certainly, because each element holds the address of the next node, so you can change a way of thinking, starting from the start node traversal, find the next node through the node, to find and down through the next node to a node, until the end node, every time you don’t need to find an element need to look for a time.

Iterator loop traversal is the idea above. A simple change in thinking can reduce time by an order of magnitude.

For chained lists, the time complexity of each get method is O(n) using the ordinary for loop, but the time complexity of each next method is O(1) using the iterator loop, so the time can be significantly reduced. The chained list takes O(1) for each next method because it moves the pointer back one bit, rather than looking all the way to the end like the get method in a normal for loop.

So to sum up, for an enhanced for loop (iterator loop) with a chained list, the time complexity of each next method is O(1), and the time complexity of traversing n elements is O(n).

conclusion

According to the test results:

  • ArrayList traverses 100,000 pieces of data through a normal for loop, taking 2ms and time complexity O(n).
  • ArrayList traverses 100,000 pieces of data through an enhanced for loop, taking 2ms and time complexity O(n).
  • The ArrayList iterates over 100,000 pieces of data in 2ms, O(n) time.
  • LinkedList traverses 100,000 pieces of data through a normal for loop, taking 4390ms and O(n^2), which is significantly slower.
  • LinkedList traverses 100,000 pieces of data through an enhanced for loop, taking 3ms and time complexity O(n).
  • LinkedList iterates through 100,000 pieces of data in 1ms, O(n) time.

Conclusion:

  • For sequential lists, use a plain for loop to iterate over the data
  • For chained lists, use an enhanced for loop to iterate over the data

Public account: Pure programming said (chunjie_tech)

Original link:Mp.weixin.qq.com/s/A8Izwja_Z…