Hello everyone, I am Tong Yan Wuji, a programmer who does not always stick to his day job. I believe that "practice brings real knowledge, and life is simpler for it", and I yearn for freedom.

The bad code

Java 8 introduced the Stream API, yet in everyday business development many developers still reach for a nested for loop by reflex.

The code in question is a typical nested for loop: if the List contains duplicate elements, it throws a business exception.
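The original code is not reproduced here, so the following is only a minimal sketch of that kind of duplicate check, written both ways. The class and method names (`DuplicateCheck`, `hasDuplicatesNested`, `hasDuplicatesStream`) are hypothetical, the elements are assumed to be non-null with a sensible `equals`/`hashCode`, and `IllegalStateException` merely stands in for the business exception.

```java
import java.util.Arrays;
import java.util.List;

class DuplicateCheck {
    // Nested-loop version: compares every pair of elements.
    static <T> boolean hasDuplicatesNested(List<T> items) {
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                if (items.get(i).equals(items.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    // Stream version: if removing duplicates shrinks the list, it had duplicates.
    static <T> boolean hasDuplicatesStream(List<T> items) {
        return items.stream().distinct().count() != items.size();
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("A01", "B02", "A01");
        try {
            if (hasDuplicatesStream(ids)) {
                // Stand-in for the business exception mentioned in the article.
                throw new IllegalStateException("duplicate element in list");
            }
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```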

The business scenario is not complicated and a Stream would handle it, so why do people still write nested for loops? Is it habit, or is it about performance?

Let's run an experiment and find out.

Proving ground

First, set up a scenario: a school has students and classes, and we match students against classes to check whether each student really belongs to a class.

  1. Define the student and class objects (Student and NoClass)

  2. Generate 100,000+ students and classes. Since performance is what we suspect, a small data set will not do

  3. Start with the nested for loop

    // Nested for loop approach
    private static void doubleForMethod(List<Student> studentList, List<NoClass> noClassList) {
        // Match each student to a class: if the class IDs are equal, the student belongs to that class
        for (int i = 0; i < studentList.size(); i++) {
            Student student = studentList.get(i);
            for (int j = 0; j < noClassList.size(); j++) {
                NoClass noClass = noClassList.get(j);
                if (student.getClassesId().equals(noClass.getClassId())) {
                    // System.out.println("Student " + student.getStuId() + " belongs to a class");
                }
            }
        }
    }

  4. Then the Stream approach

    // Stream approach (needs java.util.List, java.util.Map and java.util.stream.Collectors)
    private static void streamMethod(List<Student> studentList, List<NoClass> noClassList) {
        // Convert the class list to a map keyed by class ID; toMap requires the IDs to be unique
        Map<String, NoClass> noClassMap = noClassList.stream().collect(Collectors.toMap(t -> t.getClassId(), t -> t));
        // Match each student to a class: if the class ID is in the map, the student belongs to that class
        studentList.stream().forEach(h -> {
            if (noClassMap.containsKey(h.getClassesId())) {
                // System.out.println("Student " + h.getStuId() + " belongs to a class");
            }
        });
    }

  5. Run both and compare the elapsed time

    The end result:

    Nested for loop: 80438 ms

    Stream approach: 80 ms

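For the same business scenario the two approaches produce the same result, yet the difference in elapsed time is striking. Note that the Stream version first indexes the classes in a map, so each student costs one hash lookup instead of a pass over every class; much of the gap comes from that.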
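To reproduce the comparison, something like the following harness could be used. It is only a sketch: the article does not show its `Student` and `NoClass` classes, so the minimal versions here are assumptions built from the getters that appear in the code, and the data size is reduced to 20,000 so the nested loop finishes quickly. The methods count matches instead of printing, which makes it easy to confirm both approaches agree. Timings from a single run like this are rough and will vary by machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MatchBenchmark {
    // Hypothetical minimal model classes; the article does not show its own.
    static class Student {
        private final String stuId;
        private final String classesId;
        Student(String stuId, String classesId) { this.stuId = stuId; this.classesId = classesId; }
        String getStuId() { return stuId; }
        String getClassesId() { return classesId; }
    }

    static class NoClass {
        private final String classId;
        NoClass(String classId) { this.classId = classId; }
        String getClassId() { return classId; }
    }

    // Nested loops: every student is compared against every class.
    static long countWithNestedLoops(List<Student> students, List<NoClass> classes) {
        long matches = 0;
        for (Student s : students) {
            for (NoClass c : classes) {
                if (s.getClassesId().equals(c.getClassId())) {
                    matches++;
                }
            }
        }
        return matches;
    }

    // Stream plus map: index the classes once, then one hash lookup per student.
    // Assumes class IDs are unique, as Collectors.toMap rejects duplicate keys.
    static long countWithStream(List<Student> students, List<NoClass> classes) {
        Map<String, NoClass> byId = classes.stream()
                .collect(Collectors.toMap(NoClass::getClassId, Function.identity()));
        return students.stream().filter(s -> byId.containsKey(s.getClassesId())).count();
    }

    public static void main(String[] args) {
        int n = 20_000; // smaller than the article's 100,000+ on purpose
        List<Student> students = new ArrayList<>();
        List<NoClass> classes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            students.add(new Student("S" + i, "C" + i));
            classes.add(new NoClass("C" + i));
        }
        long t0 = System.nanoTime();
        long nested = countWithNestedLoops(students, classes);
        long t1 = System.nanoTime();
        long streamed = countWithStream(students, classes);
        long t2 = System.nanoTime();
        System.out.println("nested loops: " + nested + " matches, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("stream + map: " + streamed + " matches, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```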

What is Stream doing behind the scenes?

To be honest, I wasn't sure either, so let's find out.

Starting from how Stream is implemented at the lowest level is making things hard for yourself. Did you study the internals of for (int i = 0; i … before you first used it?

The answer is certainly no. Mastering a piece of knowledge goes from shallow to deep: first know its characteristics, then explore the reasons behind them. If you insist on reading the Stream source first, fine. I look forward to it and will quietly watch.

Classification of Stream operations

Before digging into how Stream works, we need to know how its operations are classified, because that classification is one of the reasons Stream can iterate over collections efficiently.

Stream operations are officially divided into two categories: intermediate operations and terminal operations. An intermediate operation only records the operation: it returns another stream and performs no computation. The terminal operation is what actually triggers the computation.
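A small sketch can make this visible. Here the lambdas append to a log so we can see when they run; the class name `LazyDemo` and the log format are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LazyDemo {
    // Returns the order in which things happen when a pipeline is built and then run.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        // Intermediate operations: nothing is executed on this line.
        Stream<Integer> pipeline = Arrays.asList(1, 2, 3).stream()
                .filter(n -> { log.add("filter " + n); return n % 2 == 1; })
                .map(n -> { log.add("map " + n); return n * 10; });
        log.add("pipeline built");
        // Terminal operation: only now do filter and map run, element by element.
        pipeline.forEach(n -> log.add("forEach " + n));
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first log entry is "pipeline built": the filter and map lambdas do not run until `forEach` is called.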

Intermediate operations are further divided into stateless and stateful operations. In a stateless operation, processing an element is not affected by the elements before it; a stateful operation can continue only after it has received all elements.
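As an illustration, the sketch below logs elements as they enter the pipeline and as they leave it, once with only the stateless `peek` and once with the stateful `sorted` added. The class name `StatefulDemo` and the "in"/"out" labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StatefulDemo {
    static List<String> trace(boolean withSort) {
        List<String> log = new ArrayList<>();
        // peek is stateless: it handles each element on its own.
        Stream<Integer> stream = Stream.of(3, 1, 2).peek(n -> log.add("in " + n));
        if (withSort) {
            // sorted is stateful: it must see every element before emitting any.
            stream = stream.sorted();
        }
        stream.forEach(n -> log.add("out " + n));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // in/out interleaved per element
        System.out.println(trace(true));  // all "in" entries first, then sorted "out" entries
    }
}
```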

Terminal operations are further divided into short-circuiting and non-short-circuiting operations. The former can produce the final result as soon as some elements satisfy the condition, while the latter must process every element before the result is available.
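A rough way to observe the difference is to count how many elements a sequential stream actually visits. In this sketch (names are illustrative), `anyMatch` is the short-circuiting operation and `forEach` the non-short-circuiting one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ShortCircuitDemo {
    // Number of elements visited before anyMatch finds the target and stops.
    static int visitedByAnyMatch(List<Integer> numbers, int target) {
        AtomicInteger visited = new AtomicInteger();
        numbers.stream()
               .peek(n -> visited.incrementAndGet())
               .anyMatch(n -> n == target);
        return visited.get();
    }

    // forEach cannot stop early, so it visits every element.
    static int visitedByForEach(List<Integer> numbers) {
        AtomicInteger visited = new AtomicInteger();
        numbers.stream()
               .peek(n -> visited.incrementAndGet())
               .forEach(n -> { });
        return visited.get();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println("anyMatch visited " + visitedByAnyMatch(numbers, 2));
        System.out.println("forEach visited " + visitedByForEach(numbers));
    }
}
```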

Intermediate operations are therefore often called lazy operations. It is this laziness, combined with a terminal operation and a data source in one processing pipeline, that makes Stream efficient.

Characteristics of Stream

  1. A stream takes data from a source at one end and applies the operations to the elements in sequence along the pipeline. Once elements have passed through the pipeline, that stream cannot be operated on again; to process the data again, obtain a new stream from the data source.

  2. Processing a Collection conventionally means traversing it with an Iterator, which is external iteration.

    With a Stream, you only declare how elements should be processed and the Stream object carries out the traversal itself. This is internal iteration. When iterating over large amounts of data, internal iteration can be more efficient than external iteration.
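Both characteristics can be sketched in a few lines. The class and method names below are made up for illustration: `secondTraversalFails` shows that a consumed stream cannot be reused, and the two `upper...` methods do the same work with external and internal iteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamTraits {
    // A stream can be traversed only once; reusing it throws IllegalStateException.
    static boolean secondTraversalFails(List<String> source) {
        Stream<String> stream = source.stream();
        stream.forEach(s -> { });          // first traversal consumes the stream
        try {
            stream.forEach(s -> { });      // second traversal on the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // External iteration: the caller drives the traversal with an Iterator.
    static List<String> upperExternal(List<String> names) {
        List<String> result = new ArrayList<>();
        Iterator<String> it = names.iterator();
        while (it.hasNext()) {
            result.add(it.next().toUpperCase());
        }
        return result;
    }

    // Internal iteration: the caller declares what to do, the stream does the traversal.
    static List<String> upperInternal(List<String> names) {
        return names.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob");
        System.out.println(secondTraversalFails(names));
        System.out.println(upperExternal(names));
        System.out.println(upperInternal(names));
    }
}
```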

Stream performance

Can Stream completely replace for, with better performance? Not necessarily.

According to official efficiency data (in each comparison "<" means less time, so the fastest approach is on the left):

  1. Multi-core CPU server, int array of length 100:

    Regular iteration < Stream parallel iteration < Stream serial iteration

  2. Multi-core CPU server, int array of length 1.00E+8:

    Stream parallel iteration < regular iteration < Stream serial iteration

  3. Multi-core CPU server, filtering and grouping an object array of length 1.00E+8:

    Stream parallel iteration < regular iteration < Stream serial iteration

  4. Single-core CPU server, filtering and grouping an object array of length 1.00E+8:

    Regular iteration < Stream serial iteration < Stream parallel iteration
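For readers who want to try the three styles themselves, here is a rough sketch that finds the maximum of an int array with a regular loop, a serial stream, and a parallel stream. It is not the benchmark behind the figures above: the task, the array size, and the class name `IterationTiming` are assumptions, and single-shot `System.nanoTime()` timings are noisy, so treat the output as indicative only. The array is assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.Random;

class IterationTiming {
    // Regular iteration.
    static int maxLoop(int[] data) {
        int max = Integer.MIN_VALUE;
        for (int value : data) {
            if (value > max) {
                max = value;
            }
        }
        return max;
    }

    // Stream serial iteration.
    static int maxSerial(int[] data) {
        return Arrays.stream(data).max().getAsInt();
    }

    // Stream parallel iteration.
    static int maxParallel(int[] data) {
        return Arrays.stream(data).parallel().max().getAsInt();
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(5_000_000).toArray();

        long t0 = System.nanoTime();
        int a = maxLoop(data);
        long t1 = System.nanoTime();
        int b = maxSerial(data);
        long t2 = System.nanoTime();
        int c = maxParallel(data);
        long t3 = System.nanoTime();

        System.out.println("regular loop:    " + a + " in " + (t1 - t0) / 1_000 + " us");
        System.out.println("serial stream:   " + b + " in " + (t2 - t1) / 1_000 + " us");
        System.out.println("parallel stream: " + c + " in " + (t3 - t2) / 1_000 + " us");
    }
}
```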

Use Stream instead

According to the official performance figures, a Stream does not necessarily improve traversal performance. Whether it does depends on the data volume and the actual application scenario.

Even so, I recommend using Stream more in day-to-day business development. Efficiency is a factor to consider when writing code, but it is not the only one, and as hardware improves, execution efficiency improves with it. For the people who write code, conciseness should be the guiding principle: giving up a little efficiency in exchange for highly readable code is, I think, well worth it.