A Stream in Java 8 works like an enhanced Iterator over a collection. Using lambda expressions, it can perform convenient and efficient aggregate operations, or bulk data operations, on the collection's elements. Like an Iterator, a Stream is one-way and irreversible: the data can be traversed only once and is then exhausted, like water that flows past you and never comes back. The functional style decouples implementation details from business logic, much like SQL statements that say “what to do” instead of “how to do it”, so programmers can focus on business logic and write code that is easy to understand and maintain.
1. Basic Concepts
1.1 Classification of Stream operations
Stream operations are officially classified into two categories: intermediate operations and terminal operations. An intermediate operation only records the operation: it returns a new stream and performs no computation. The terminal operation is what actually triggers the computation.
Intermediate operations are further divided into stateless and stateful operations. Stateless means that processing an element is not affected by previously processed elements. Stateful means that the operation depends on elements seen earlier and, in cases such as sorting, can continue only after all elements have been received.
Terminal operations are divided into short-circuiting and non-short-circuiting operations. The former can produce the final result as soon as some elements satisfy the condition; the latter can produce the final result only after all elements have been processed.
The operation classification details are as follows: (Image credit: Geek Time – Lecture 6 of Java Performance Tuning in Action)
Intermediate operations are also commonly called lazy operations. It is this laziness, combined with a processing pipeline made up of a data source, intermediate operations, and a terminal operation, that makes a Stream efficient.
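The laziness and short-circuiting described above can be observed directly. The following is a minimal sketch, not taken from the original article: the class and method names are hypothetical, and it assumes a sequential stream created with Stream.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class LazyPipelineDemo {

    // Builds a pipeline with an intermediate operation only and returns how
    // many elements that operation has touched.
    static int touchedWithoutTerminalOperation() {
        List<Integer> touched = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4).peek(touched::add);
        // No terminal operation has been invoked on 'pipeline', so nothing ran.
        return touched.size();
    }

    // Records which elements a short-circuiting terminal operation pulls
    // through the pipeline before it can return a result.
    static List<Integer> touchedByFindFirst() {
        List<Integer> touched = new ArrayList<>();
        Optional<Integer> first = Stream.of(1, 2, 3, 4)
                .peek(touched::add)
                .filter(n -> n > 1)
                .findFirst();
        System.out.println("findFirst result: " + first.get());
        return touched;
    }

    // Records which elements a non-short-circuiting terminal operation pulls.
    static List<Integer> touchedByForEach() {
        List<Integer> touched = new ArrayList<>();
        Stream.of(1, 2, 3, 4)
                .peek(touched::add)
                .forEach(n -> { });
        return touched;
    }

    public static void main(String[] args) {
        System.out.println(touchedWithoutTerminalOperation());
        System.out.println(touchedByFindFirst());
        System.out.println(touchedByForEach());
    }
}
```

In this sketch the first method reports 0 touched elements, findFirst() stops after pulling 1 and 2, and forEach() pulls all four elements.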
1.2 Class relationships in the Stream architecture
2. BaseStream interface
BaseStream is the parent interface of all streams. A stream is a sequence of elements that supports sequential and parallel aggregate operations, and BaseStream defines the behavior shared by all stream types (including stream operations, stream pipelines, parallel execution, and so on).
2.1 Interface declaration
public interface BaseStream<T, S extends BaseStream<T, S>>
        extends AutoCloseable {
    // method declarations omitted, see section 2.2
}
<T>: the type of the stream elements.
<S>: the type of the stream itself, which must also be a stream (S extends BaseStream<T, S>).
AutoCloseable: because BaseStream extends AutoCloseable, a stream can be declared in a try-with-resources statement, and its close() method then runs automatically when the block exits to release resources.
2.2 Method Description
No. | Method | Parameters | Return type | Description | Notes |
---|---|---|---|---|---|
1 | unordered | none | S | Returns an equivalent stream that is unordered, removing the ordering constraint so that subsequent operations can use optimizations that do not have to preserve order | |
2 | spliterator | none | Spliterator<T> | Returns a spliterator (splittable iterator) for the elements of the current stream | @NotNull |
3 | sequential | none | S | Returns an equivalent stream that is sequential | @NotNull |
4 | parallel | none | S | Returns an equivalent stream that is parallel | @NotNull |
5 | onClose | Runnable closeHandler | S | Returns an equivalent stream with an additional close handler. Close handlers run when close() is called, in the order they were added. All handlers run even if an earlier one throws; the first exception thrown is relayed to the caller of close(). | |
6 | iterator | none | Iterator<T> | Returns an iterator for the elements of the current stream | @NotNull |
7 | isParallel | none | boolean | Returns whether the stream would execute in parallel if a terminal operation were executed; it should be called before the terminal operation | |
8 | close | none | void | Closes the stream, invoking all close handlers of the stream pipeline | @Override, from AutoCloseable |
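A few of the methods in the table can be exercised as follows. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the stream is closed through try-with-resources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class BaseStreamDemo {

    // Registers two close handlers, closes the stream via try-with-resources,
    // and returns the order in which the handlers ran.
    static List<String> closeHandlerOrder() {
        List<String> log = new ArrayList<>();
        try (Stream<String> s = Stream.of("a", "b")
                .onClose(() -> log.add("first"))
                .onClose(() -> log.add("second"))) {
            s.forEach(x -> { });
        }
        return log;
    }

    // parallel() and sequential() switch the execution mode of the pipeline;
    // isParallel() reports the current mode before any terminal operation.
    static boolean[] modeFlags() {
        Stream<Integer> s = Stream.of(1, 2, 3);
        boolean initially = s.isParallel();
        Stream<Integer> p = s.parallel();
        boolean afterParallel = p.isParallel();
        boolean afterSequential = p.sequential().isParallel();
        return new boolean[] { initially, afterParallel, afterSequential };
    }

    public static void main(String[] args) {
        System.out.println(closeHandlerOrder());
        boolean[] flags = modeFlags();
        System.out.println(flags[0] + " " + flags[1] + " " + flags[2]);
    }
}
```

Here the handlers run in the order they were registered ([first, second]), and the mode flags read false, true, false.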
2.3 Main Inheritance relationship
BaseStream’s main inheritance relationships are as follows: BaseStream has four main sub-interfaces (Stream, IntStream, LongStream, and DoubleStream) and one abstract class implementation, AbstractPipeline. This is a simple use of the strategy pattern: each subtype overrides methods according to the type of data it handles and its own special requirements. The element type handled by each kind of stream is identified by the StreamShape enum:
StreamShape | Corresponding stream and element type |
---|---|
REFERENCE | Corresponds to Stream; elements are object references |
INT_VALUE | Corresponds to IntStream; elements are of type int |
LONG_VALUE | Corresponds to LongStream; elements are of type long |
DOUBLE_VALUE | Corresponds to DoubleStream; elements are of type double |
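StreamShape itself is internal to the JDK and not visible to application code, but the four stream kinds it distinguishes are part of the public API. The sketch below, with hypothetical class and method names, shows one stream of each kind and a conversion between them.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;
import java.util.stream.LongStream;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class StreamKindsDemo {

    // Stream<String> (object references) converted to an IntStream.
    static int sumOfLengths() {
        return Stream.of("a", "bb", "ccc").mapToInt(String::length).sum();
    }

    // LongStream working directly on primitive long values.
    static long sumOneToFive() {
        return LongStream.rangeClosed(1, 5).sum();
    }

    // DoubleStream working directly on primitive double values.
    static double average() {
        return DoubleStream.of(1.0, 2.0, 3.0).average().orElse(Double.NaN);
    }

    // IntStream converted back to a Stream<Integer> with boxed().
    static List<Integer> boxedRange() {
        return IntStream.range(0, 3).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfLengths());
        System.out.println(sumOneToFive());
        System.out.println(average());
        System.out.println(boxedRange());
    }
}
```

The primitive variants avoid boxing each element, which is the practical reason they exist alongside Stream.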
3. Stream interface
3.1 Overview (Interface Documents)
What is a stream?
A stream is a sequence of elements that supports sequential and parallel aggregate operations. To perform a computation, a stream pipeline is built. A stream pipeline consists of a source (which may be an array, a collection, a generator function, an I/O channel, etc.), zero or more intermediate operations, and a terminal operation.
What are the features of a stream?
- Streams are lazy: computation on the source data is performed only when the terminal operation is initiated. Intermediate operations alone do not trigger evaluation, and source elements are consumed only as needed.
- Although streams and collections have some superficial similarities, they have different goals. Collections are primarily concerned with managing and accessing their elements. A stream, by contrast, provides no means to directly access or manipulate its elements; it is concerned with declaratively describing the aggregate operations to be performed on them. However, if the provided stream operations do not offer the desired functionality, the iterator() and spliterator() operations can be used to perform a controlled traversal.
- If a stream pipeline modifies its source data while it is executing, unexpected errors may occur.
- Most stream operations accept parameters that are instances of functional interfaces such as Function, usually written as lambda expressions or method references. These parameters must be non-null.
- A stream can be operated on only once, from source to terminal operation. It cannot be reused, and an IllegalStateException is thrown if reuse is detected. (Since some stream operations may return their receiver rather than a new stream object, it may not be possible to detect reuse in all cases.)
- Stream has a close() method and implements AutoCloseable (inherited from BaseStream), but most stream instances do not need to be closed after use. In general, only streams whose source is an I/O channel need to be closed. Most streams are backed by collections, arrays, or generating functions, which require no special resource management.
- A stream pipeline can execute either sequentially or in parallel; the sequential() and parallel() methods select the execution mode.
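Two of the features listed above, single use and the choice between sequential and parallel execution, can be checked with a short sketch. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class StreamReuseDemo {

    // Runs two terminal operations on the same stream object and reports
    // whether the second one was rejected.
    static boolean secondUseFails() {
        Stream<Integer> s = Stream.of(1, 2, 3);
        s.forEach(x -> { });          // first terminal operation consumes the stream
        try {
            s.forEach(x -> { });      // second use of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Sums the same data sequentially or in parallel; only the execution
    // mode differs, the result is the same.
    static int sum(boolean parallel) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        Stream<Integer> s = parallel ? data.stream().parallel() : data.stream().sequential();
        return s.mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());
        System.out.println(sum(false) + " " + sum(true));
    }
}
```

A collection can hand out a fresh stream on every call to stream(), so the usual fix for reuse is to create a new stream from the source instead of keeping the old one.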
3.2 Some common methods
filter()
Type: intermediate, stateless operation. Filters the stream and keeps the elements that satisfy the given Predicate.
Parameter: Predicate<? super T> predicate: the condition each element is tested against.
Returns: Stream<T>: a new stream containing the elements that match the predicate.
Example: keep the elements less than 3
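// Requires java.util.List, java.util.stream.Collectors and java.util.stream.Stream
public static void main(String[] args) {
    // Keep only the elements that are less than 3
    List<Integer> list = Stream.of(1, 2, 3, 4).filter(p -> p < 3).collect(Collectors.toList());
    System.out.println(list);
}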
Output: [1, 2]
map()
Type: intermediate, stateless operation. Applies the given Function to every element and returns a stream consisting of the results.
Parameter: Function<? super T, ? extends R> mapper: the function applied to each element.
Returns: Stream<R>: a new stream of the mapped results.
Example: add 1 to each element
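// Requires java.util.List, java.util.stream.Collectors and java.util.stream.Stream
public static void main(String[] args) {
    // Add 1 to every element
    List<Integer> list = Stream.of(1, 2, 3, 4).map(p -> p + 1).collect(Collectors.toList());
    System.out.println(list);
}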
Output: [2, 3, 4, 5]
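Because filter() and map() each return a stream, they are usually chained, and map() may change the element type. The following sketch uses hypothetical names and sample words that do not come from the original article.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; all names and sample data are illustrative.
class FilterMapDemo {

    // filter() keeps words longer than 3 characters, map() transforms them.
    static List<String> longWordsUpperCased() {
        return Stream.of("stream", "map", "filter", "peek")
                .filter(w -> w.length() > 3)
                .map(w -> w.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // map() can change the element type: Stream<String> becomes Stream<Integer>.
    static List<Integer> wordLengths() {
        return Stream.of("stream", "map", "filter", "peek")
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(longWordsUpperCased());
        System.out.println(wordLengths());
    }
}
```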
distinct()
Type: intermediate, stateful operation. Removes duplicate elements from the stream (compared with equals()) and returns a stream of the distinct elements. For ordered streams, the selection of distinct elements is stable: among duplicates, the element encountered first is kept. For unordered streams, no stability guarantee is made. In a parallel pipeline, preserving stability is relatively expensive, because the operation must act as a full barrier with substantial buffering overhead.
In general, business code does not need this stability. If so, using an unordered source (such as generate()) or removing the ordering constraint with unordered() can make distinct() considerably more efficient in parallel pipelines. If consistency with encounter order is required and distinct() performs poorly in a parallel pipeline, switching to sequential execution with sequential() may improve performance.
Parameter: none (the operation works on the current stream).
Returns: Stream<T>: a new stream with duplicates removed.
Example: remove duplicate elements
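// Requires java.util.List, java.util.stream.Collectors and java.util.stream.Stream
public static void main(String[] args) {
    // Remove the duplicated 4
    List<Integer> list = Stream.of(1, 2, 4, 4).distinct().collect(Collectors.toList());
    System.out.println(list);
}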
Output: [1, 2, 4]
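The stability rule for ordered streams (the first occurrence is kept) and the use of equals() can be seen in a small sketch. Class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class DistinctDemo {

    // On an ordered, sequential stream the first occurrence of each value is
    // kept and the encounter order of the survivors is preserved.
    static List<String> distinctInEncounterOrder() {
        return Stream.of("b", "a", "b", "c", "a")
                .distinct()
                .collect(Collectors.toList());
    }

    // Two different String objects with equal content count as duplicates,
    // because the comparison uses equals() rather than object identity.
    static int distinctByEquals() {
        return Stream.of(new String("x"), new String("x"))
                .distinct()
                .collect(Collectors.toList())
                .size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInEncounterOrder());
        System.out.println(distinctByEquals());
    }
}
```

For custom element classes this means equals() and hashCode() need to be implemented consistently, otherwise distinct() cannot recognize duplicates.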
sorted()
Type: intermediate, stateful operation. Sorts the elements of the stream in natural order (using compareTo()); if the elements are not Comparable, a ClassCastException may be thrown when the terminal operation executes. For ordered streams, the sort is stable; for unordered streams, no stability guarantee is made.
Note: an overload, sorted(Comparator<? super T> comparator), sorts according to the rules of the given Comparator.
Parameter: none.
Returns: Stream<T>: a new stream of the sorted elements.
Example: sort the elements
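// Requires java.util.List, java.util.stream.Collectors and java.util.stream.Stream
public static void main(String[] args) {
    // Sort the elements in natural (ascending) order
    List<Integer> list = Stream.of(1, 3, 4, 2).sorted().collect(Collectors.toList());
    System.out.println(list);
}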
Output: [1, 2, 3, 4]
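The Comparator overload and the ClassCastException case mentioned for sorted() can be sketched as follows. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class SortedDemo {

    // sorted(Comparator) replaces natural order with the given rule,
    // here descending order.
    static List<Integer> descending() {
        return Stream.of(1, 3, 4, 2)
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
    }

    // Plain Object instances are not Comparable, so natural-order sorting
    // fails once the terminal operation actually runs the sort.
    static boolean naturalSortOfNonComparableFails() {
        try {
            Stream.of(new Object(), new Object())
                    .sorted()
                    .collect(Collectors.toList());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(descending());
        System.out.println(naturalSortOfNonComparableFails());
    }
}
```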
peek()
Type: intermediate, stateless operation. Performs the given action on each element as it flows through the pipeline (printing it, for example) and returns a stream of the same elements. Unlike map(), the result of the action is not put back into the stream, so peek() never produces a stream of a new element type. It is mainly used for debugging, to print intermediate data.
Note that if the elements are mutable entity objects, the action can still change their property values by calling their setters. In that sense peek() modifies the existing elements in place, whereas map() returns a new result for each element.
Parameter: Consumer<? super T> action: the action to perform on each element.
Returns: Stream<T>: a stream of the same elements.
Note: for the difference between map() and peek(), see www.cnblogs.com/flydean/p/j…
Example: print the value of each element
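// Requires java.util.List, java.util.stream.Collectors and java.util.stream.Stream
public static void main(String[] args) {
    // Print every element as it passes through the pipeline
    List<Integer> list = Stream.of(1, 2, 3, 4).peek(p -> System.out.println("Output: p = " + p)).collect(Collectors.toList());
    System.out.println(list);
}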
Output:
Output: p = 1
Output: p = 2
Output: p = 3
Output: p = 4
[1, 2, 3, 4]
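The point about peek() changing mutable elements through their setters can be illustrated with a small sketch. The User class and all other names are hypothetical, and side effects like this are generally best kept for debugging or demonstration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical demo class; all names are illustrative.
class PeekMutationDemo {

    // Hypothetical mutable entity with a getter and a setter.
    static final class User {
        private String name;
        User(String name) { this.name = name; }
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    // peek() changes each element in place through its setter; map() then
    // produces a new value (the name) for each element.
    static String demo() {
        List<User> users = Arrays.asList(new User("ann"), new User("bob"));
        List<String> names = users.stream()
                .peek(u -> u.setName(u.getName().toUpperCase(Locale.ROOT)))
                .map(User::getName)
                .collect(Collectors.toList());
        // The source list holds the same objects, so it sees the change too.
        return names + " / " + users.get(0).getName();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch both the collected names and the first object in the source list show the upper-cased value, because peek() worked on the original objects.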
The Stream interface also provides methods such as limit (truncates the stream to a maximum length), min (returns the minimum element), and max (returns the maximum element). They work in a similar way.
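A brief sketch of these three methods follows; the class and method names are hypothetical. Note that min() and max() take a Comparator and return an Optional, because the stream may be empty.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class LimitMinMaxDemo {

    // limit(n) keeps at most the first n elements in encounter order.
    static List<Integer> firstThree() {
        return Stream.of(5, 3, 8, 1).limit(3).collect(Collectors.toList());
    }

    // min() returns the smallest element according to the comparator.
    static Optional<Integer> smallest() {
        return Stream.of(5, 3, 8, 1).min(Comparator.naturalOrder());
    }

    // max() returns the largest element according to the comparator.
    static Optional<Integer> largest() {
        return Stream.of(5, 3, 8, 1).max(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        System.out.println(firstThree());
        System.out.println(smallest().get());
        System.out.println(largest().get());
    }
}
```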
4. Extensions
Extension 1: Stateless and stateful streams
The book Java 8 in Action explains stateless and stateful stream operations as follows:
Operations like map and filter take each element from the input stream and produce zero or one result in the output stream. These operations are therefore generally stateless: they have no internal state (assuming that the user-supplied lambda or method reference has no internal mutable state).
But operations like reduce, sum, and max need internal state to accumulate the result. In this case the internal state is small; in our example it consists of an int or a double. The internal state is of bounded size no matter how many elements are in the stream being processed. By contrast, some operations (such as sorted or distinct) at first look like filter or map: they all take a stream and produce another stream (an intermediate operation). There is a crucial difference, however. Both sorting and removing duplicates from a stream require knowing the previous history to do their job. For example, sorting requires all elements to be buffered before a single item can be added to the output stream, so the storage requirement of the operation is unbounded. This can be a problem if the stream is large or infinite. (What should reversing the stream of all prime numbers do? It should return the largest prime number, which mathematics tells us does not exist.) We call these operations stateful operations.
To put it simply:
Stateless: each element taken from the input stream is processed independently and produces zero or one result in the output stream. No data is stored to maintain internal state, so a stream of any size can be processed.
Stateful: processing an element depends on elements seen earlier, so internal state is required. For operations such as reduce, sum, and max, the state is bounded (for example, the current maximum or minimum) no matter how many elements the stream contains. For operations such as sorted and distinct, the state is unbounded, because the elements seen so far must be buffered.
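The difference shows up in the order in which elements move through a pipeline. The sketch below uses hypothetical names and assumes a sequential stream; it logs when an element enters the pipeline and when it reaches the terminal operation, with and without a sorted() step in between.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class StatefulTraceDemo {

    // Logs "in:n" when an element enters the pipeline and "out:n" when it
    // reaches the terminal operation.
    static List<String> trace(boolean withSorted) {
        List<String> log = new ArrayList<>();
        Stream<Integer> s = Stream.of(3, 1, 2).peek(n -> log.add("in:" + n));
        if (withSorted) {
            s = s.sorted();   // stateful: buffers every element before emitting any
        }
        s.forEach(n -> log.add("out:" + n));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false));
        System.out.println(trace(true));
    }
}
```

Without sorted(), each element flows straight through (in:3, out:3, in:1, out:1, ...). With sorted(), all three "in" entries appear before the first "out" entry, because the operation has to see every element first.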
Extension 2: Predicate interface
Predicate<T> is a functional interface that represents a predicate (a boolean-valued function): given an argument of type T, it returns a boolean result. As with Function, the concrete behavior of a Predicate is determined by the lambda expression passed in. Its single abstract method is:
boolean test(T t);
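A Predicate can be called directly through test() or passed to filter(), and its default methods and(), or(), and negate() combine conditions. The following is a minimal sketch with hypothetical names.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; all names are illustrative.
class PredicateDemo {

    static final Predicate<Integer> LESS_THAN_THREE = p -> p < 3;
    static final Predicate<Integer> EVEN = p -> p % 2 == 0;

    // Applies the given condition to the fixed sample 1, 2, 3, 4.
    static List<Integer> keep(Predicate<Integer> condition) {
        return Stream.of(1, 2, 3, 4).filter(condition).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(LESS_THAN_THREE.test(2));           // direct call
        System.out.println(keep(LESS_THAN_THREE.and(EVEN)));   // both conditions
        System.out.println(keep(LESS_THAN_THREE.or(EVEN)));    // either condition
        System.out.println(keep(LESS_THAN_THREE.negate()));    // inverted condition
    }
}
```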
Extension 3: Sort stability
Suppose a sequence to be sorted contains several elements that compare as equal. If the relative order of these elements is unchanged after sorting, the algorithm is stable; otherwise, it is unstable.
For example, suppose node1 equals node2 and node1 precedes node2 in the original sequence. If node1 still precedes node2 after sorting, no matter how many times the sort is run, the algorithm is stable; otherwise it is unstable.
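Stability is easiest to see when the sort key is only part of the element. The sketch below, with hypothetical names and sample words, sorts strings by length on an ordered stream, so words of equal length compare as equal.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; all names and sample data are illustrative.
class StableSortDemo {

    // Sorts by length only. "pear", "kiwi" and "plum" have the same length,
    // as do "fig" and "yam", so a stable sort keeps each group in its
    // original relative order.
    static List<String> sortByLength() {
        return Stream.of("pear", "fig", "kiwi", "plum", "yam")
                .sorted(Comparator.comparingInt(String::length))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortByLength());
    }
}
```

The result is [fig, yam, pear, kiwi, plum]: within each length group the words appear in the same order as in the input.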