Stream API
This article continues the series on new Java 8 features with the Stream API. The Stream API enhances collection operations in Java and can be used for filtering, sorting, grouping, aggregation, and more. It works together with lambda expressions to improve code readability and coding efficiency. It also supports parallel operations, so we can take advantage of multi-core CPUs without spending a lot of effort on error-prone multithreaded code. With the Stream API and lambdas, developers can write high-performance concurrent data-processing code much more easily.
Introduction to the Stream API
The Stream API is a new set of APIs added in Java 8. It handles collection operations in a different way from the traditional one, known as “stream processing”. A stream expresses operations declaratively, much like a query against a relational database. For example, to get the names of all users older than 20 from a database, sorted by creation time, a single SQL statement is enough, whereas a traditional Java program for the same task is rather tedious. With a stream it becomes:
List<String> userNames =
users.stream()
.filter(user -> user.getAge() > 20)
.sorted(comparing(User::getCreationDate))
.map(User::getUserName)
.collect(toList());
In the era of big data, data is becoming more and more diverse, and we often face huge data sets that need complex processing (such as statistics and grouping). The traditional way of traversal (for-each) handles one element of the collection at a time, in order, which is inefficient for such workloads. You might think of parallel processing, but multithreaded code is hard to write, error-prone, and difficult to maintain. Since Java 8, the Stream API offers a way to address this problem.
Most of the arguments of the Stream API methods are functional interfaces of one kind or another.
If you are not familiar with functional interfaces, check out these two articles:
Java8 Lambda expressions do you understand?
Do you really understand Java8’s functional interfaces?
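To make the point about functional interfaces concrete, here is a minimal sketch in which the lambdas are pulled out into named variables of the matching interface types. The class name FunctionalArgsDemo and the sample names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FunctionalArgsDemo {

    // Each stream operation takes a functional interface, so the lambdas
    // can be stored in variables of the corresponding type.
    static List<String> longNamesUpperCased(List<String> names) {
        Predicate<String> longerThanThree = s -> s.length() > 3;     // argument of filter()
        Function<String, String> toUpper = String::toUpperCase;      // argument of map()
        Comparator<String> alphabetical = Comparator.naturalOrder(); // argument of sorted()

        return names.stream()
                .filter(longerThanThree)
                .map(toUpper)
                .sorted(alphabetical)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tom", "alice", "bob", "carol");
        System.out.println(longNamesUpperCased(names)); // prints [ALICE, CAROL]
    }
}
```

Writing the arguments this way is rarely necessary in practice, but it shows that a lambda passed to filter() is simply a Predicate, one passed to map() is a Function, and so on.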
The Stream API encapsulates iteration internally and decides how to iterate on your behalf. In parallel processing, the collection is split into multiple segments, each segment is processed by a different thread, and the partial results are finally combined into the output.
Note that “a stream can only be traversed once”: once a terminal operation has run, the stream is consumed. To iterate again, obtain a new stream from the data source (the collection). Traversing the same stream twice throws java.lang.IllegalStateException:
List<String> list = Arrays.asList("A", "B", "C", "D");
Stream<String> stream = list.stream();
stream.forEach(System.out::println);
stream.forEach(System.out::println); // throws java.lang.IllegalStateException because the stream has already been consumed
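The single-use rule, and the usual workaround of asking the source for a fresh stream, can be sketched as follows. The class and method names are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class StreamReuseDemo {

    // Returns true if a second terminal operation on the same stream object fails.
    static boolean secondTraversalFails(List<String> list) {
        Stream<String> stream = list.stream();
        stream.forEach(s -> { });      // first traversal consumes the stream
        try {
            stream.forEach(s -> { });  // second traversal on the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Requesting a new stream from the collection each time is fine.
    static long countTwice(List<String> list) {
        long first = list.stream().count();
        long second = list.stream().count();
        return first + second;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("A", "B", "C", "D");
        System.out.println(secondTraversalFails(list)); // prints true
        System.out.println(countTwice(list));           // prints 8
    }
}
```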
A stream pipeline usually consists of three parts:
- Data source: the source from which the stream is obtained, such as the users.stream() call in the user-filtering example at the beginning of this article.
- Intermediate processing: a series of operations on the elements of the stream, such as filtering (filter()), mapping (map()), and sorting (sorted()).
- Terminal processing: produces a result, which can be any value that is not a stream, such as a List. It may also return nothing: forEach(System.out::println) prints the elements to the console and returns no value.
Intermediate operations are also called lazy operations, and terminal operations are also called eager operations. Lazy operations do not process any elements until an eager operation is invoked on the stream. Each intermediate operation on a stream produces another stream, and chaining these operations creates a stream pipeline.
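A small sketch can make the lazy/eager distinction visible. It records when the filter actually runs; the class name and the log messages are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyEvaluationDemo {

    // Records which elements the filter looks at, and when.
    static List<String> trace() {
        List<String> log = new ArrayList<>();

        // Building the pipeline only describes the work; nothing runs yet.
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4)
                .filter(n -> {
                    log.add("filter " + n);
                    return n % 2 == 0;
                });
        log.add("pipeline built");

        // The terminal operation triggers the actual processing.
        List<Integer> evens = pipeline.collect(Collectors.toList());
        log.add("result " + evens);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first entry in the log is "pipeline built", followed by one "filter" entry per element and finally "result [2, 4]", which shows that filter() did no work until collect() was called.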
Create a stream
There are many ways to create a stream, which can be broken down as follows:
Create a stream from values
Create a stream using the static Stream.of() method, which takes a variable-length argument list:
Stream<String> stream = Stream.of("A", "B", "C", "D");
You can also create an empty stream using the static method Stream.empty():
Stream<String> stream = Stream.empty();
Create streams from arrays
Use the static method Arrays.stream(), which takes an array argument, to create a stream from an array:
String[] strs = {"A", "B", "C", "D"};
Stream<String> stream = Arrays.stream(strs);
Generate streams from files
Many static methods in the java.nio.file.Files class return streams. For example, Files.lines() takes a java.nio.file.Path object and returns a stream of the lines of the file as strings:
Stream<String> stream = Files.lines(Paths.get("text.txt"), Charset.defaultCharset());
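A stream returned by Files.lines() holds an open file, so it is usually wrapped in try-with-resources. The following is a minimal sketch under that assumption; the class name, the helper methods, and the temporary file standing in for text.txt are all illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

public class FileLinesDemo {

    // Counts the non-blank lines of a file; the stream (and the file) is
    // closed automatically when the try block exits.
    static long countNonBlankLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    // Writes three lines (one of them empty) to a temporary file and counts them.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("stream-demo", ".txt");
            try {
                Files.write(tmp, Arrays.asList("first", "", "second"), StandardCharsets.UTF_8);
                return countNonBlankLines(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```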
Create streams through functions
There are two static methods in java.util.stream.Stream that generate streams from functions: Stream<T> generate(Supplier<T> s) and Stream<T> iterate(final T seed, final UnaryOperator<T> f):
// iterate
Stream.iterate(0, n -> n + 2).limit(51).forEach(System.out::println);
// generate
Stream.generate(() -> "Hello Man!").limit(10).forEach(System.out::println);
The first statement prints all even numbers from 0 to 100, and the second prints “Hello Man!” 10 times. It is worth noting that the streams generated by both methods are infinite: they have no fixed size and can keep producing values indefinitely. In the code above we used limit() to avoid printing an infinite number of values.
Generally, iterate() is used to generate a sequence of values, such as the 10 consecutive dates starting from today:
Stream.iterate(LocalDate.now(), date -> date.plusDays(1)).limit(10).forEach(System.out::println);
The generate() method is often used to produce random values, such as 10 UUIDs:
Stream.generate(() -> UUID.randomUUID().toString()).limit(10).forEach(System.out::println);
Use the stream
The Stream interface contains a number of methods for operating on a stream:
- filter(): filters the elements of the stream
- map(): maps the elements of the stream to another type
- distinct(): removes duplicate elements from the stream
- sorted(): sorts the elements of the stream
- forEach(): performs an operation on each element of the stream
- peek(): has a similar effect to forEach(), except that it returns a new stream, while forEach() returns nothing
- limit(): keeps the first few elements of the stream
- skip(): skips the first few elements of the stream
- toArray(): converts the stream to an array
- reduce(): reduces the elements of the stream, combining them into a single new value
- collect(): summarizes the stream, for example by producing a List
- anyMatch(): checks whether elements of the stream match a condition; allMatch() and noneMatch() are similar
- findFirst(): finds the first element; findAny() is similar
- max(): finds the maximum value
- min(): finds the minimum value
- count(): counts the total number of elements
Here’s how to use each of these methods.
A simple example:
Stream.of(1, 8, 5, 2, 1, 0, 9, 2, 0, 4, 8)
.filter(n -> n > 2) // Filter the elements, keeping the elements greater than 2
.distinct() // Deduplication, similar to DISTINCT in SQL statements
.skip(1) // Skip the first element
.limit(2) // Return the first two elements, similar to SELECT TOP in SQL statements
.sorted() // Sort the results
.forEach(System.out::println);
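The matching, finding, and aggregate methods from the list above are not covered by the later sections, so here is a short sketch of them. The class name MatchFindDemo and the sample numbers are chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class MatchFindDemo {

    static final List<Integer> NUMBERS = Arrays.asList(1, 8, 5, 2, 9, 4);

    // anyMatch(): true if at least one element satisfies the predicate.
    static boolean anyAbove(int limit) {
        return NUMBERS.stream().anyMatch(n -> n > limit);
    }

    // allMatch(): true only if every element satisfies the predicate.
    static boolean allPositive() {
        return NUMBERS.stream().allMatch(n -> n > 0);
    }

    // noneMatch(): true only if no element satisfies the predicate.
    static boolean noneNegative() {
        return NUMBERS.stream().noneMatch(n -> n < 0);
    }

    // findFirst() returns an Optional, because the stream may be empty.
    static Optional<Integer> firstEven() {
        return NUMBERS.stream().filter(n -> n % 2 == 0).findFirst();
    }

    // max() and min() need a comparator and also return an Optional.
    static Optional<Integer> max() {
        return NUMBERS.stream().max(Integer::compare);
    }

    static Optional<Integer> min() {
        return NUMBERS.stream().min(Integer::compare);
    }

    public static void main(String[] args) {
        System.out.println(anyAbove(8));               // prints true
        System.out.println(allPositive());             // prints true
        System.out.println(noneNegative());            // prints true
        System.out.println(firstEven().get());         // prints 8
        System.out.println(max().get());               // prints 9
        System.out.println(min().get());               // prints 1
        System.out.println(NUMBERS.stream().count());  // prints 6
        Integer[] array = NUMBERS.stream().toArray(Integer[]::new);
        System.out.println(array.length);              // prints 6
    }
}
```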
Filter
List<Apple> filterList = appleList.stream().filter(a -> a.getName().equals("Banana")).collect(Collectors.toList());
Sum (reduction)
A reduction operation combines the elements of a stream into a single new value. Common reduction operations include summing and finding a maximum or minimum. The reduce() method is generally used for reductions, and it can be combined with the map() method to handle more complicated cases.
// Calculate the total amount
// map -> reduce
BigDecimal totalMoney = appleList.stream().map(Apple::getMoney).reduce(BigDecimal.ZERO, BigDecimal::add);
// Count the number
int sum = appleList.stream().mapToInt(Apple::getNum).sum();
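Finding a maximum or minimum with reduce(), as mentioned above, can be sketched like this. The example also contrasts the two common forms of reduce(): with an identity value it always returns a result, without one it returns an Optional. The class name ReduceDemo is hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ReduceDemo {

    // reduce(identity, accumulator): returns the identity for an empty stream.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // reduce(accumulator): returns an empty Optional for an empty stream.
    static Optional<Integer> max(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::max);
    }

    static Optional<Integer> min(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::min);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 8, 5);
        System.out.println(sum(numbers));        // prints 14
        System.out.println(max(numbers).get());  // prints 8
        System.out.println(min(numbers).get());  // prints 1
        System.out.println(max(Collections.<Integer>emptyList()).isPresent()); // prints false
    }
}
```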
Extract a Bean property
// Take an attribute of the Bean
stuList.stream()
.map(Student::getId).distinct()
.collect(Collectors.toList());
Deduplication
- General deduplication
List<Integer> distinctNumbers = numbers.stream().distinct().collect(Collectors.toList());
- Deduplication by condition
// Deduplicate by some property of the bean
// Define a filter first
public static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
Map<Object, Boolean> seen = new ConcurrentHashMap<>();
return object -> seen.putIfAbsent(keyExtractor.apply(object), Boolean.TRUE) == null;
}
List<User> distinctUsers = users.stream()
.filter(distinctByKey(User::getName))
.collect(Collectors.toList());
This deduplication can also be performed on multiple keys by concatenating them into a single key. The same idea is used for grouping below.
private String getGroupingByKey(Person p) {
    return p.getAge() + "-" + p.getName();
}
// or
List<User> unique = list.stream()
.collect(Collectors.collectingAndThen(
Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(User::getName))),
ArrayList::new));
// final version: deduplicate on a key concatenated from several fields
list.stream().collect(Collectors.collectingAndThen(
Collectors.toCollection(() -> new TreeSet<>(
Comparator.comparing((User o) -> o.getAge() + ";" + o.getName()))),
ArrayList::new)).forEach(System.out::println);
Sorting
list = list.stream().sorted(byNumber).collect(Collectors.toList());
Multi-attribute sorting: write the Comparator yourself
@Test
public void testSort_with_multipleComparator() throws Exception {
    ArrayList<Human> humans = Lists.newArrayList(new Human("tomy", 22), new Human("li", 25));
    // Compare by name first; if the names are equal, compare by age
    Comparator<Human> comparator = (h1, h2) -> {
        if (h1.getName().equals(h2.getName())) {
            return Integer.compare(h1.getAge(), h2.getAge());
        }
        return h1.getName().compareTo(h2.getName());
    };
    humans.sort(comparator.reversed());
    Assert.assertThat("tomy", equalTo(humans.get(0).getName()));
}
// General sort
// Define a comparator for sorting
Comparator<Rule> byNumber = Comparator.comparingInt(Rule::getNumber);
// Get the sorted list: filter first, then sort
List<Rule> rule = lstRule.stream().filter(s -> s.getCode() == 2).sorted(byNumber).collect(Collectors.toList());
// Joint sort
Comparator<Rule> byNumber = Comparator.comparingInt(Rule::getNumber);
Comparator<Rule> byCode = Comparator.comparingInt(Rule::getCode);
Comparator<Rule> byNumberAndCode = byNumber.thenComparing(byCode); // comparator for the joint sort
List<Rule> rule = lstRule.stream().filter(s -> s.getCode() == 2).sorted(byNumberAndCode).collect(Collectors.toList());
// Sorting when the sort key may be null
List<User> nList = list.stream()
.sorted(Comparator.comparing(User::getCode, Comparator.nullsFirst(String::compareTo)))
.collect(Collectors.toList());
List<User> list = minPriceList.stream()
.sorted(Comparator.comparing(l -> l.getCreateDate(), Comparator.nullsLast(Date::compareTo)))
.collect(Collectors.toList());
List<Food> list = new ArrayList<>();
list.add(new Food(3, "aa", 2));
list.add(new Food(3, "bb", null));
list.add(new Food(2, "cc", 1));
list.add(new Food(2, "dd", 15));
List<Food> sortedList = list.stream()
.sorted(Comparator.comparing(Food::getPrice, Comparator.nullsLast(Integer::compareTo)).reversed())
.sorted(Comparator.comparing(Food::getId, Comparator.nullsFirst(Integer::compareTo)))
.collect(Collectors.toList());
Grouping
Grouping is similar to MySQL's GROUP BY, but is often simpler to write than a complex SQL statement.
Auxiliary POJO
static class Person {
    private String name;
    private int age;
    private long salary;
    Person(String name, int age, long salary) {
        this.name = name;
        this.age = age;
        this.salary = salary;
    }
    // Getters used by the grouping examples below
    public String getName() { return name; }
    public int getAge() { return age; }
    @Override
    public String toString() {
        return String.format("Person{name='%s', age=%d, salary=%d}", name, age, salary);
    }
}
- Group by a single attribute
// The group structure is Map, key is the attribute of the group, value is the member of the group
Map<Integer, List<Person>> peopleByAge = people.stream().collect(Collectors.groupingBy(Person::getAge));
Map<Integer, List<Person>> peopleByAge = people.stream().collect(Collectors.groupingBy(Person::getAge, Collectors.toList()));
Map<Integer, List<Person>> peopleByAge = people.stream().collect(Collectors.groupingBy(p -> p.age, Collectors.mapping((Person p) -> p, Collectors.toList())));
The three statements above return the same result. The value type of the resulting Map can be changed through the second parameter of Collectors.groupingBy(); other collectors can be passed there to compute, for example, a sum, a maximum, an average, or a concatenated string.
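As a sketch of how the second parameter changes the map values, the following example groups by age with a few different downstream collectors. The Person class here is a minimal stand-in with assumed getters, and the sample data is invented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DownstreamCollectorsDemo {

    // Minimal stand-in for the article's Person; the getters are assumptions.
    static class Person {
        private final String name;
        private final int age;
        private final long salary;

        Person(String name, int age, long salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }

        String getName() { return name; }
        int getAge() { return age; }
        long getSalary() { return salary; }
    }

    static final List<Person> PEOPLE = Arrays.asList(
            new Person("Fred", 18, 1000),
            new Person("Anna", 18, 3000),
            new Person("Bob", 30, 2000));

    // Number of people per age.
    static Map<Integer, Long> countByAge() {
        return PEOPLE.stream()
                .collect(Collectors.groupingBy(Person::getAge, Collectors.counting()));
    }

    // Total salary per age.
    static Map<Integer, Long> salarySumByAge() {
        return PEOPLE.stream()
                .collect(Collectors.groupingBy(Person::getAge, Collectors.summingLong(Person::getSalary)));
    }

    // Average salary per age.
    static Map<Integer, Double> averageSalaryByAge() {
        return PEOPLE.stream()
                .collect(Collectors.groupingBy(Person::getAge, Collectors.averagingLong(Person::getSalary)));
    }

    // Names per age, joined into one string.
    static Map<Integer, String> namesByAge() {
        return PEOPLE.stream()
                .collect(Collectors.groupingBy(Person::getAge,
                        Collectors.mapping(Person::getName, Collectors.joining(", "))));
    }

    public static void main(String[] args) {
        System.out.println(countByAge().get(18));          // prints 2
        System.out.println(salarySumByAge().get(18));      // prints 4000
        System.out.println(averageSalaryByAge().get(18));  // prints 2000.0
        System.out.println(namesByAge().get(18));          // prints Fred, Anna
    }
}
```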
- Group by multiple attributes
// Option 1: nested grouping
Map<String, Map<Integer, List<Person>>> map = people.stream()
.collect(Collectors.groupingBy(Person::getName, Collectors.groupingBy(Person::getAge)));
map.get("Fred").get(18);
// Option 2: define a class that represents the grouping key
class Person {
    public static class NameAge {
        public NameAge(String name, int age) {
            ...
        }
        // Must implement equals() and hashCode()
    }
    public NameAge getNameAge() {
        return new NameAge(name, age);
    }
}
Map<NameAge, List<Person>> map = people.stream().collect(Collectors.groupingBy(Person::getNameAge));
map.get(new NameAge("Fred", 18));
If you would rather not define a grouping class, you can use a ready-made pair type such as Pair from Apache Commons, provided you already use such a library:
Map<Pair<String, Integer>, List<Person>> map = people.stream().collect(Collectors.groupingBy(p -> Pair.of(p.getName(), p.getAge())));
map.get(Pair.of("Fred", 18));
Final method: concatenate multiple fields into a new key and group by it with Java 8's groupingBy. I used the same idea for the deduplication above. Although this method looks simple and clumsy, it was the most effective way to solve my problem. When more than two fields are involved, the Pair approach above is awkward to use, and the grouped structure becomes complicated.
Map<String, List<Person>> peopleBySomeKey = people.stream()
.collect(Collectors.groupingBy(p -> getGroupingByKey(p), Collectors.mapping((Person p) -> p, Collectors.toList())));
// Helper that builds the composite grouping key
private String getGroupingByKey(Person p){
return p.getAge()+"-"+p.getName();
}
- Group and sum
// Group sum, key is the group attribute name
Map<String, Long> tt = orgHoldingDatas.stream()
.collect(Collectors.groupingBy(OrgHoldingData::getOrgTypeCode, Collectors.summingLong(OrgHoldingData::getHeldNum)));
- Group and merge
list.stream()
.sorted(Comparator.comparing(User::getAge))
.collect(Collectors.groupingBy(User::getId))
.forEach((k, v) -> {
Optional<User> csm = v.stream().reduce((v1, v2) -> {
v1.setName(v1.getName() + "、" + v2.getName());
list.remove(v2);
return v1;
});
});
// or
list.stream().collect(Collectors.groupingBy(User::getName))
.forEach((k, v) -> {
Optional<User> sum = v.stream().reduce((v1, v2) -> { // merge
v1.setNum(v1.getNum() + v2.getNum());
v1.setPct(v1.getPct() + v2.getPct());
return v1;
});
if (sum.isPresent()) {
items.add(sum.get());
}
});
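The snippets above merge entries by mutating the first element of each group. As an alternative sketch (a different technique, not the one used above), Collectors.toMap() with a merge function can combine entries that share a key without modifying the source objects. The Holding class and its fields are hypothetical stand-ins for the article's User.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupAndMergeDemo {

    // Hypothetical immutable value class standing in for the article's User.
    static class Holding {
        private final String name;
        private final int num;

        Holding(String name, int num) {
            this.name = name;
            this.num = num;
        }

        String getName() { return name; }
        int getNum() { return num; }
    }

    static final List<Holding> SAMPLE = Arrays.asList(
            new Holding("a", 1), new Holding("b", 5), new Holding("a", 2));

    // Sums num per name; the third argument merges values that share a key.
    static Map<String, Integer> totalNumByName(List<Holding> holdings) {
        return holdings.stream()
                .collect(Collectors.toMap(Holding::getName, Holding::getNum, Integer::sum));
    }

    // Merges whole objects per name, keeping first-seen order via LinkedHashMap.
    static List<Holding> mergeByName(List<Holding> holdings) {
        Map<String, Holding> merged = holdings.stream()
                .collect(Collectors.toMap(
                        Holding::getName,
                        h -> h,
                        (x, y) -> new Holding(x.getName(), x.getNum() + y.getNum()),
                        LinkedHashMap::new));
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        System.out.println(totalNumByName(SAMPLE).get("a")); // prints 3
        List<Holding> merged = mergeByName(SAMPLE);
        System.out.println(merged.size());                   // prints 2
        System.out.println(merged.get(0).getNum());          // prints 3
    }
}
```

Because the merge function creates a new object instead of calling setters, the original list is left untouched, which also makes this variant easier to reason about.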
- Custom grouping
Map<String, List<Fruit>> map = list.stream()
.collect(Collectors.groupingBy((Function<Fruit, String>) fruit -> {
String key;
if (fruit.getType().equals("1")) {
key = "Apple";
} else if (fruit.getType().equals("2")) {
key = "Banana";
} else {
key = "Other";
}
return key;
}, Collectors.toList()));
Partitioning
Partitioning is a special case of grouping that always yields exactly two groups: the elements that satisfy the condition and the elements that do not.
Map<Boolean, List<Student>> map = students.stream()
.collect(Collectors.partitioningBy(student -> student.getScore() > 90));
Making good use of flatMap
// split() turns each item into an array; flatMap() flattens the resulting arrays into a single stream of strings
List<String> reList = list.stream().map(item -> item.split(",")).flatMap(Arrays::stream).distinct()
.collect(Collectors.toList());
Reduction
As described above, a reduction combines the elements of a stream into a single new value, for example a sum, a maximum, or a minimum, usually with reduce() and often together with map(). This is somewhat like MapReduce in big data.
Stream statistics
- DoubleSummaryStatistics
- LongSummaryStatistics
- IntSummaryStatistics
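These classes collect the count, sum, minimum, maximum, and average of a numeric stream in one pass. A minimal sketch for the int variant follows; the class name and the sample numbers are made up, and the long and double variants work the same way through mapToLong() and mapToDouble().

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

public class SummaryStatisticsDemo {

    // Computes all summary values in a single traversal of the list.
    static IntSummaryStatistics describe(List<Integer> numbers) {
        return numbers.stream()
                .mapToInt(Integer::intValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = describe(Arrays.asList(1, 8, 5, 2));
        System.out.println(stats.getCount());    // prints 4
        System.out.println(stats.getSum());      // prints 16
        System.out.println(stats.getMin());      // prints 1
        System.out.println(stats.getMax());      // prints 8
        System.out.println(stats.getAverage());  // prints 4.0
    }
}
```

The same statistics can also be obtained inside collect() with Collectors.summarizingInt(), which is convenient as a downstream collector when grouping.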
Parallel streams
A parallel stream can be obtained with the parallelStream() method of a collection. Internally, Java splits the contents of the stream into several parts and hands them to multiple threads for parallel processing, spreading the work across the cores of a multi-core CPU.
With parallel streams, performance is probably the main concern. With suitable data structures and processing, parallel streams can indeed outperform ordinary for loops.
parallelStream() is built on the fork/join framework introduced in Java 7; by default, the number of threads is tied to the number of processor cores on the host.
stream() can simply be replaced with parallelStream(). However, because parallelStream() runs in parallel, verify two things before enabling it: that parallelism is actually worthwhile (parallel execution is not necessarily faster than sequential execution) and that the code is thread-safe. If these two points cannot be guaranteed, parallelism makes no sense, since correct results matter more than speed.
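As a small sketch of switching between sequential and parallel execution, the following example sums a range both ways and prints the parallelism of the common fork/join pool next to the number of available processors. The class name is hypothetical, and the example makes no claim about which variant is faster on a given machine.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class ParallelSumDemo {

    // Sequential sum of 1..n.
    static long sequentialSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Same computation on a parallel stream; the result must be identical.
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sequentialSum(1_000_000L)); // prints 500000500000
        System.out.println(parallelSum(1_000_000L));   // prints 500000500000

        // Parallel streams run on the common ForkJoinPool by default.
        System.out.println("common pool parallelism: " + ForkJoinPool.commonPool().getParallelism());
        System.out.println("available processors:    " + Runtime.getRuntime().availableProcessors());
    }
}
```

Summing is safe to parallelize because it is stateless and associative; whether it pays off should be measured for the actual workload.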
See also: A simple parallelStream [1]
Pitfalls
- For a stream created with Stream.of, the stream can only be operated on once; after that, an internal flag is set to true and any further operation on the stream reports an error. With a collection, such as list.stream(), this is not a problem, because every call to list.stream() returns a new stream, so the collection can be processed multiple times.
- With parallelStream, a thread-safety problem occurs when multiple threads write to a list concurrently: the resulting list may end up with fewer elements than expected, or an array index out of bounds exception may be thrown. Thread-safe collection classes are recommended.
Further reading on these parallelStream pitfalls: [2], [3]
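The thread-safety pitfall can be avoided in two ways, sketched below under the assumption that the goal is simply to gather results into a list: let collect() do the accumulation, or write into a synchronized list. The unsafe pattern is shown only as a comment. All names are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelCollectDemo {

    // Unsafe pattern (do not use): several threads call add() on a plain
    // ArrayList at the same time, so elements can be lost or an exception thrown.
    //   List<Integer> result = new ArrayList<>();
    //   IntStream.range(0, n).parallel().forEach(i -> result.add(i * i));

    // Safe: collect() accumulates into per-thread containers and merges them,
    // and for an ordered source the encounter order is preserved.
    static List<Integer> squares(int n) {
        return IntStream.range(0, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    // Also safe: a synchronized list accepts concurrent writes,
    // but the elements arrive in no particular order.
    static List<Integer> squaresSynchronized(int n) {
        List<Integer> result = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(i -> result.add(i * i));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(5));                        // prints [0, 1, 4, 9, 16]
        System.out.println(squares(10_000).size());            // prints 10000
        System.out.println(squaresSynchronized(10_000).size()); // prints 10000
    }
}
```

Of the two, collect() is usually the better fit, because it needs no shared mutable state at all.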
Performance testing
Java Stream API Performance Testing – CarpenterLee – Blogpark [4]
Reference articles
New Java8 features for Streaming data Processing [5]
Java 8 New Features (part 2) : Stream API[6]
References
[1] A simple parallelStream: https://blog.csdn.net/u011001723/article/details/52794455
[2] An incident caused by improper use of Java 8 parallelStream: https://my.oschina.net/7001/blog/1475500
[3] Notes on Java 8 parallelStream pitfalls (Alibaba Cloud community): https://yq.aliyun.com/articles/652718
[4] Java Stream API performance testing (CarpenterLee, cnblogs): https://www.cnblogs.com/CarpenterLee/p/6675568.html
[5] New Java 8 features: stream data processing: https://www.cnblogs.com/shenlanzhizun/p/6027042.html
[6] Java 8 new features (2): Stream API: https://blog.csdn.net/lw900925/article/details/78921657