What is a Collector

A Collector is a mutable reduction operation that accumulates input elements into a mutable result container. After all elements have been processed, it can optionally transform the accumulated result into a final representation. Collectors can run either sequentially or in parallel.

The Collector interface has three type parameters: T is the type of the input elements, A is the type of the mutable accumulation container, and R is the type of the final result.

The collect method

collect is a terminal operation that gathers the elements of a stream into a result. It takes a Collector, or the supplier, accumulator, and combiner functions directly.

The Stream interface declares two overloads:

<R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner);
<R, A> R collect(Collector<? super T, A, R> collector);
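
As a minimal sketch of the two overloads, the example below collects the same stream once with explicit functions and once with a Collector. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectOverloadsDemo {
    // Overload 1: pass the supplier, accumulator, and combiner directly.
    static ArrayList<String> viaFunctions() {
        return Stream.of("a", "b", "c")
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // Overload 2: pass a Collector that bundles the equivalent functions.
    static List<String> viaCollector() {
        return Stream.of("a", "b", "c").collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(viaFunctions()); // prints [a, b, c]
        System.out.println(viaCollector()); // prints [a, b, c]
    }
}
```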

The role of Collector

Collector is an interface. Its companion utility class, Collectors, provides many factory methods that return Collector instances.

The role of Collectors

Collectors is a utility class that provides implementations of many common collectors.

Collectors.toList()

public static <T> Collector<T, ?, List<T>> toList() {
    return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
            (left, right) -> { left.addAll(right); return left; },
            CH_ID);
}

toMap and toConcurrentMap

The toMap operation organizes the input elements into a Map. toMap has three overloads:

1. toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper): keyMapper and valueMapper provide the keys and values of the resulting Map respectively.

2. toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction): mergeFunction combines the values that map to the same key.

3. toMap(Function<? super T, ? extends K> keyMapper, Function<? super T, ? extends U> valueMapper, BinaryOperator<U> mergeFunction, Supplier<M> mapSupplier): mapSupplier specifies the Map type of the result.
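
The following sketch exercises each toMap overload on a small word list, plus the two-argument toConcurrentMap. The class name, method names, and sample words are assumptions chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToMapDemo {
    // 1. keyMapper + valueMapper; a duplicate key would throw IllegalStateException.
    static Map<String, Integer> lengthByWord() {
        return Stream.of("java", "go", "rust")
                .collect(Collectors.toMap(w -> w, String::length));
    }

    // 2. mergeFunction combines values that share a key.
    static Map<Integer, String> wordsByLength() {
        return Stream.of("java", "go", "rust")
                .collect(Collectors.toMap(String::length, w -> w, (a, b) -> a + "," + b));
    }

    // 3. mapSupplier chooses the Map implementation (a sorted TreeMap here).
    static TreeMap<Integer, String> sortedWordsByLength() {
        return Stream.of("java", "go", "rust")
                .collect(Collectors.toMap(String::length, w -> w,
                        (a, b) -> a + "," + b, TreeMap::new));
    }

    // toConcurrentMap takes the same kind of arguments but returns a ConcurrentMap.
    static ConcurrentMap<String, Integer> concurrentLengthByWord() {
        return Stream.of("java", "go", "rust").parallel()
                .collect(Collectors.toConcurrentMap(w -> w, String::length));
    }

    public static void main(String[] args) {
        System.out.println(lengthByWord().get("rust"));   // prints 4
        System.out.println(wordsByLength().get(4));       // prints java,rust
        System.out.println(sortedWordsByLength());        // prints {2=go, 4=java,rust}
        System.out.println(concurrentLengthByWord().get("go")); // prints 2
    }
}
```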

The four arguments in the toList() implementation shown earlier are:

ArrayList::new (the supplier: creates an ArrayList as the accumulation container)

List::add (the accumulator: each stream element is added directly to the container)

(left, right) -> { left.addAll(right); return left; } (the combiner: merges subtask results with addAll, appending the later subtask's result to the earlier one's)

CH_ID (an unmodifiable set of characteristics that contains only IDENTITY_FINISH)
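
CollectorImpl is internal to the JDK, but the same shape can be sketched with the public Collector.of factory. The method myToList below is a hypothetical stand-in, not the JDK's implementation; the three-function overload of Collector.of adds IDENTITY_FINISH automatically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class ToListSketch {
    // Hypothetical re-creation of toList() using only public API.
    static <T> Collector<T, ?, List<T>> myToList() {
        return Collector.<T, List<T>>of(
                ArrayList::new,                                         // supplier
                List::add,                                              // accumulator
                (left, right) -> { left.addAll(right); return left; }); // combiner
    }

    // Runs the collector on a parallel stream so the combiner is exercised.
    static List<Integer> collectParallel() {
        return IntStream.range(0, 1000).boxed().parallel().collect(myToList());
    }

    public static void main(String[] args) {
        List<Integer> result = collectParallel();
        System.out.println(result.size());                 // prints 1000
        System.out.println(myToList().characteristics());  // includes IDENTITY_FINISH
    }
}
```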

Collector Working Principle

A Collector is defined by four functions that work together to perform the aggregation:

(1) supplier: creates a new result container

(2) accumulator: folds an input element into the result container

(3) combiner: merges two result containers (used with parallel streams to merge the containers produced by multiple threads)

(4) finisher: converts the result container into the final representation

Here is how a collector behaves on a sequential stream:

The supplier first provides a result container, the accumulator adds elements to it, and finally the finisher converts the container into the final result. If the container type is the same as the final result type, the finisher is optional.

The combiner, by contrast, is only relevant to parallel streams; it is not invoked on a sequential stream. If the work is split across three threads, each thread produces its own result container. The combiner then merges those three containers, two at a time, until a single container remains.
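
To make the four functions concrete, here is a sketch of a hand-written averaging collector in which T, A, and R are all different types. The collector and its names are illustrative assumptions; in practice Collectors.averagingInt already covers this case.

```java
import java.util.stream.Collector;
import java.util.stream.IntStream;

class AverageCollectorDemo {
    // T = Integer (input), A = long[] (container holding {sum, count}), R = Double (result).
    static Collector<Integer, long[], Double> averaging() {
        return Collector.of(
                () -> new long[2],                                    // supplier
                (acc, x) -> { acc[0] += x; acc[1]++; },               // accumulator
                (a, b) -> { a[0] += b[0]; a[1] += b[1]; return a; },  // combiner
                acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]); // finisher
    }

    // Sequential: one container, the combiner is never called.
    static double sequentialAverage() {
        return IntStream.rangeClosed(1, 10).boxed().collect(averaging());
    }

    // Parallel: several containers are created and merged by the combiner.
    static double parallelAverage() {
        return IntStream.rangeClosed(1, 10).boxed().parallel().collect(averaging());
    }

    public static void main(String[] args) {
        System.out.println(sequentialAverage()); // prints 5.5
        System.out.println(parallelAverage());   // prints 5.5
    }
}
```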

In addition to these four functions, a Collector has a characteristics() method that returns the set of characteristics describing it. The Collector.Characteristics enum has three constants:

(1) CONCURRENT: indicates that a single result container is shared by all threads, even in a parallel stream. The function returned by combiner() is invoked only when the stream is parallel and the collector does not have this characteristic. Because multiple threads may call the accumulator on the same container at the same time, the container must be thread-safe. Stream.collect only performs such a concurrent reduction when the stream is also unordered or the collector is UNORDERED.

(2) UNORDERED: indicates that the collector does not commit to preserving the encounter order of the input elements

(3) IDENTITY_FINISH: indicates that the intermediate container type is the same as the final result type, so the finisher function is not called
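
The sketch below declares characteristics on a custom collector and reads them back. toConcurrentSet is a hypothetical collector written for this example; it relies on ConcurrentHashMap.newKeySet() for a thread-safe container.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class CharacteristicsDemo {
    // Hypothetical collector: gathers distinct elements into one shared, thread-safe set.
    static <T> Collector<T, ?, Set<T>> toConcurrentSet() {
        return Collector.<T, Set<T>>of(
                ConcurrentHashMap::newKeySet,                          // thread-safe container
                Set::add,
                (left, right) -> { left.addAll(right); return left; }, // only used for non-concurrent reduction
                Collector.Characteristics.CONCURRENT,
                Collector.Characteristics.UNORDERED);
    }

    // Collects the remainders 0..9 from a parallel stream.
    static Set<Integer> distinctRemainders() {
        return IntStream.range(0, 1000).parallel().boxed()
                .map(i -> i % 10)
                .collect(toConcurrentSet());
    }

    public static void main(String[] args) {
        // The declared characteristics, plus IDENTITY_FINISH added by Collector.of
        System.out.println(toConcurrentSet().characteristics());
        System.out.println(distinctRemainders().size()); // prints 10
    }
}
```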

Collector joining function

The joining function is Collectors.joining. It has three overloads:

Collectors.joining()
Collectors.joining("param")
Collectors.joining("param1", "param2", "param3")

The implementation of one of them

public static Collector<CharSequence, ?, String> joining() {
    return new CollectorImpl<CharSequence, StringBuilder, String>(
            StringBuilder::new, StringBuilder::append,
            (r1, r2) -> { r1.append(r2); return r1; },
            StringBuilder::toString, CH_NOID);
}

Note: joining requires a stream whose elements are CharSequence values such as String. The three-argument overload takes a delimiter that is placed between elements, a prefix, and a suffix.

String result = Stream.of("springboot", "mysql", "html5",
"css3").collect(Collectors.joining(",", "[", "]"));
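
For comparison, this short sketch runs all three joining overloads on the same input so the effect of the delimiter, prefix, and suffix is visible. The class and method names are illustrative only.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JoiningDemo {
    // No arguments: elements are concatenated directly.
    static String plain() {
        return Stream.of("a", "b", "c").collect(Collectors.joining());
    }

    // One argument: the delimiter placed between elements.
    static String delimited() {
        return Stream.of("a", "b", "c").collect(Collectors.joining(","));
    }

    // Three arguments: delimiter, prefix, and suffix.
    static String wrapped() {
        return Stream.of("a", "b", "c").collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(plain());     // prints abc
        System.out.println(delimited()); // prints a,b,c
        System.out.println(wrapped());   // prints [a,b,c]
    }
}
```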

Collector partitioningBy partitioning

Partitioning is a special case of grouping: it divides the input elements into two groups, producing a Map whose keys are true and false. Collectors provides two overloaded partitioningBy() methods:

partitioningBy(Predicate<? super T> predicate): the predicate decides which partition each element falls into

partitioningBy(Predicate<? super T> predicate, Collector<? super T, A, D> downstream): downstream is the collector that produces the values of the resulting Map

public static <T> Collector<T, ?, Map<Boolean, List<T>>> partitioningBy(Predicate<? super T> predicate) {
    return partitioningBy(predicate, toList());
}

Practice: partition the strings in the list. Strings longer than 4 characters go into one group and the rest into the other.

List<String> list = Arrays.asList("java", "springboot", "HTML5", "nodejs", "CSS3");
// Key true holds the strings longer than 4 characters, key false holds the rest
Map<Boolean, List<String>> result =
        list.stream().collect(Collectors.partitioningBy(obj -> obj.length() > 4));
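
The two-argument overload can be sketched on the same word list by swapping the default toList() downstream for counting() or joining(). The class and method names are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDownstreamDemo {
    static final List<String> WORDS =
            Arrays.asList("java", "springboot", "HTML5", "nodejs", "CSS3");

    // Downstream counting(): each partition holds the number of matching strings.
    static Map<Boolean, Long> countByLength() {
        return WORDS.stream()
                .collect(Collectors.partitioningBy(w -> w.length() > 4, Collectors.counting()));
    }

    // Downstream joining(): each partition holds one concatenated string.
    static Map<Boolean, String> joinByLength() {
        return WORDS.stream()
                .collect(Collectors.partitioningBy(w -> w.length() > 4, Collectors.joining("|")));
    }

    public static void main(String[] args) {
        System.out.println(countByLength().get(true));  // prints 3
        System.out.println(countByLength().get(false)); // prints 2
        System.out.println(joinByLength().get(true));   // prints springboot|HTML5|nodejs
        System.out.println(joinByLength().get(false));  // prints java|CSS3
    }
}
```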

Collector groupingBy grouping

Grouping uses Collectors.groupingBy()

public static <T, K> Collector<T, ?, Map<K, List<T>>> groupingBy(Function<? super T, ? extends K> classifier) {
    return groupingBy(classifier, toList());
}

Practice: group the students according to the province they live in

// Requires java.util.Arrays, java.util.List, java.util.Map and java.util.stream.Collectors
List<Student> students = Arrays.asList(
        new Student("Guangdong", 23), new Student("Guangdong", 24), new Student("Guangdong", 23),
        new Student("Beijing", 22), new Student("Beijing", 20), new Student("Beijing", 20),
        new Student("Hainan", 25));

// Group the students by province
Map<String, List<Student>> listMap =
        students.stream().collect(Collectors.groupingBy(obj -> obj.getProvince()));

// Print each province followed by the ages of its students
listMap.forEach((key, value) -> {
    System.out.println("========");
    System.out.println(key);
    value.forEach(obj -> {
        System.out.println(obj.getAge());
    });
});

class Student {
    private String province;
    private int age;

    public Student(String province, int age) {
        this.age = age;
        this.province = province;
    }

    public String getProvince() { return province; }
    public void setProvince(String province) { this.province = province; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}
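
groupingBy also has overloads that accept a map factory and a downstream collector. The sketch below reuses the same province and age data with a simplified, hypothetical Student class and a TreeMap so the output order is deterministic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingDownstreamDemo {
    // Simplified stand-in for the Student class used in the practice example.
    static final class Student {
        private final String province;
        private final int age;
        Student(String province, int age) { this.province = province; this.age = age; }
        String getProvince() { return province; }
        int getAge() { return age; }
    }

    static final List<Student> STUDENTS = Arrays.asList(
            new Student("Guangdong", 23), new Student("Guangdong", 24), new Student("Guangdong", 23),
            new Student("Beijing", 22), new Student("Beijing", 20), new Student("Beijing", 20),
            new Student("Hainan", 25));

    // Count the students per province, keys sorted by the TreeMap.
    static Map<String, Long> countByProvince() {
        return STUDENTS.stream().collect(
                Collectors.groupingBy(Student::getProvince, TreeMap::new, Collectors.counting()));
    }

    // Average age per province.
    static Map<String, Double> averageAgeByProvince() {
        return STUDENTS.stream().collect(
                Collectors.groupingBy(Student::getProvince, TreeMap::new,
                        Collectors.averagingInt(Student::getAge)));
    }

    public static void main(String[] args) {
        System.out.println(countByProvince());                    // prints {Beijing=3, Guangdong=3, Hainan=1}
        System.out.println(averageAgeByProvince().get("Hainan")); // prints 25.0
    }
}
```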

On performance

The collectors provided by Collectors meet most development needs, and reduce offers a wide range of customization, but sometimes a custom collector is still required. Keep in mind that a custom collector that depends on processing elements strictly in order cannot be run in parallel. In that case, the combiner can throw UnsupportedOperationException instead of returning a result, to signal that the collector does not support parallel execution.
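
A minimal sketch of such a sequential-only collector follows. The numbered() collector is hypothetical: it numbers elements by their position in the container, so merging partial containers would give wrong numbers, and the combiner refuses instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class SequentialOnlyCollectorDemo {
    // Hypothetical collector: prefixes each element with its 1-based position.
    static Collector<String, List<String>, String> numbered() {
        return Collector.of(
                ArrayList::new,
                (list, s) -> list.add((list.size() + 1) + "." + s),
                (left, right) -> {
                    // Partial lists each start numbering at 1, so they cannot be merged.
                    throw new UnsupportedOperationException(
                            "numbered() does not support parallel streams");
                },
                list -> String.join(" ", list));
    }

    static String numberSequentially() {
        return Stream.of("a", "b", "c").collect(numbered());
    }

    public static void main(String[] args) {
        System.out.println(numberSequentially()); // prints 1.a 2.b 3.c

        try {
            // On a parallel stream the combiner is normally invoked and throws.
            String result = Stream.of("a", "b", "c", "d").parallel().collect(numbered());
            System.out.println("combiner was not needed: " + result);
        } catch (RuntimeException e) {
            System.out.println("parallel use rejected: " + e.getMessage());
        }
    }
}
```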