Preface
Data deduplication is a common task in real projects and a frequent interview question: how many ways of deduplicating a list do you know?
In practice
Deduplicating with extra space
List<String> list = Arrays.asList("java", "html", "js", "sql", "java");

@Test
public void test8() {
    // Copy each element into a new list the first time it is seen
    List<String> newList = new ArrayList<>();
    for (String el : list) {
        if (!newList.contains(el)) {
            newList.add(el);
        }
    }
    System.out.println(newList);
}
Advantages: Simple to implement, and extra logic can be added inside the loop while checking for duplicates.
Disadvantages: Needs a second list, which costs extra space, and takes several lines of code.
Deduplicating automatically with a Set
List<String> list = Arrays.asList("java", "html", "js", "sql", "java");

@Test
public void test8() {
    // A Set never holds duplicates; note that HashSet does not guarantee iteration order
    Set<String> set = new HashSet<>(list);
    List<String> newList = new ArrayList<>(set);
    System.out.println(newList);
}
Advantages: Shorter, because it relies on the fact that a Set cannot contain duplicates.
Disadvantages: Uses extra space and requires converting from List to Set and back, which is less elegant.
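If the original element order matters, the HashSet version may reorder the result, because HashSet makes no ordering guarantee. A minimal sketch, assuming a hypothetical class OrderedDedup and helper method dedupe (neither appears in the original), swaps in LinkedHashSet, which keeps insertion order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class, not part of the original article
public class OrderedDedup {

    // Removes duplicates and keeps the first occurrence of each element in its original position
    static <T> List<T> dedupe(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("java", "html", "js", "sql", "java");
        System.out.println(dedupe(list)); // prints [java, html, js, sql]
    }
}
```

The trade-off is the same as with HashSet (an extra collection and a conversion back to List), but the output order is predictable.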
Java 8 one-step deduplication (distinct)
List<String> list2 = Arrays.asList("java", "html", "js", "sql", "java");

@Test
public void test9() {
    // Java 8 Stream API deduplication
    list2.stream()
            .distinct()
            .forEach(System.out::println);
}
Advantages: Uses the Java 8 Stream API; the pipeline reads much like SQL, and the code is elegant.
Disadvantages: distinct() offers no place to add extra operations while the duplicates are being removed.
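When extra work is needed during deduplication, one possible workaround is to replace distinct() with a filter backed by a Set. The sketch below is an assumption-laden illustration, not the article's method: the class StreamDedupWithHook and the method dedupe are hypothetical names, and the stateful predicate is only meant for sequential streams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original article
public class StreamDedupWithHook {

    // Returns the distinct elements in encounter order and records every
    // dropped duplicate in `duplicates`. Sequential streams only.
    static <T> List<T> dedupe(List<T> input, List<T> duplicates) {
        Set<T> seen = new HashSet<>();
        return input.stream()
                .filter(el -> {
                    if (seen.add(el)) {
                        return true;      // first occurrence: keep it
                    }
                    duplicates.add(el);   // extra operation: remember what was dropped
                    return false;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list2 = Arrays.asList("java", "html", "js", "sql", "java");
        List<String> duplicates = new ArrayList<>();
        System.out.println(dedupe(list2, duplicates)); // prints [java, html, js, sql]
        System.out.println(duplicates);                // prints [java]
    }
}
```

Set.add returns false when the element is already present, which is what lets the filter both test for and react to a duplicate in one step.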
Notes on deduplicating with distinct
If distinct is this easy to use, how does it decide that two elements are duplicates? For a reference type, does it compare by address or by some field?
-
Example: distinct fails to deduplicate
List<Employee> list = Arrays.asList(
        new Employee("Xiao Ming", 18),
        new Employee("Daisy", 38),
        new Employee("Zhang", 6),
        new Employee("Xiao Ming", 18));

class Employee {
    private String name;
    private Integer age;

    public Employee(String name, Integer age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public String toString() {
        return "Employee{" +
                "name='" + name + '\'' +
                ", age=" + age +
                '}';
    }
}

@Test
public void test9() {
    list.stream()
            .distinct()
            .forEach(System.out::println);
}
Console output:
Employee{name='Xiao Ming', age=18}
Employee{name='Daisy', age=38}
Employee{name='Zhang', age=6}
Employee{name='Xiao Ming', age=18}
Result: distinct did not remove the duplicate custom object. Does distinct not work on reference types? But String is a reference type too, and it was deduplicated correctly. So what is the difference between the custom Employee class and String? The answer is the hashCode() and equals() methods: String overrides both, while the custom Employee class does not and therefore inherits the identity-based versions from Object.
Note: Hash-based collections such as HashMap and HashSet, as well as the Java 8 stream distinct(), all rely on hashCode() and equals() to detect duplicates, so be sure to override both methods.
-
Revised test
class Employee {
    private String name;
    private Integer age;
    // ...

    // Override equals and hashCode (requires java.util.Objects)
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Employee employee = (Employee) o;
        return Objects.equals(name, employee.name)
                && Objects.equals(age, employee.age);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
    // ...
}

@Test
public void test9() {
    list.stream()
            .distinct()
            .forEach(System.out::println);
}
Console output:
Employee{name='Xiao Ming', age=18}
Employee{name='Daisy', age=38}
Employee{name='Zhang', age=6}
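Once hashCode() and equals() are overridden, the duplicate is removed.
Summary
Whether you deduplicate with a Set or with the Java 8 stream distinct(), duplicates are identified through the elements' hashCode() and equals() methods, so custom classes must override both with a suitable definition of equality.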
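The summary can be checked side by side with a small self-contained sketch. Everything named here (EqualsHashCodeDemo, PlainEmployee, ValueEmployee, setSize, distinctCount) is a hypothetical name introduced for illustration only; the two nested classes differ solely in whether they override equals() and hashCode().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical demo class, not part of the original article
public class EqualsHashCodeDemo {

    // No overrides: equality is object identity, inherited from Object
    static class PlainEmployee {
        final String name;
        final Integer age;
        PlainEmployee(String name, Integer age) { this.name = name; this.age = age; }
    }

    // Overrides both methods: equality is based on name and age
    static class ValueEmployee {
        final String name;
        final Integer age;
        ValueEmployee(String name, Integer age) { this.name = name; this.age = age; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            ValueEmployee other = (ValueEmployee) o;
            return Objects.equals(name, other.name) && Objects.equals(age, other.age);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }

    // Number of elements left after deduplicating through a HashSet
    static int setSize(List<?> items) {
        return new HashSet<Object>(items).size();
    }

    // Number of elements left after Stream.distinct()
    static long distinctCount(List<?> items) {
        return items.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<PlainEmployee> plain = Arrays.asList(
                new PlainEmployee("Xiao Ming", 18), new PlainEmployee("Xiao Ming", 18));
        List<ValueEmployee> value = Arrays.asList(
                new ValueEmployee("Xiao Ming", 18), new ValueEmployee("Xiao Ming", 18));

        System.out.println(setSize(plain) + " " + distinctCount(plain)); // prints 2 2
        System.out.println(setSize(value) + " " + distinctCount(value)); // prints 1 1
    }
}
```

Both mechanisms agree in each case: without the overrides neither removes the duplicate, and with them both do.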