Preface

Deduplicating data is a common task in real projects, and it also comes up in interviews: how many ways of deduplicating a collection do you know?

In practice

Deduplicating with extra space
List<String> list = Arrays.asList("java", "html", "js", "sql", "java");

@Test
public void test8() {
    List<String> newList = new ArrayList<>();
    for (String el : list) {
        // Add the element only if it has not been seen yet
        if (!newList.contains(el)) {
            newList.add(el);
        }
    }
    System.out.println(newList);
}

Advantages: simple to implement, and extra logic can be added inside the loop while each element is checked.

Disadvantages: needs a second list, which costs extra space; each contains() call scans that list, so the loop is O(n²) overall; and it takes several lines of code.
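
To make the "extra logic inside the loop" point concrete, here is a minimal sketch that records every skipped repeat while deduplicating. The class name LoopDedup and the duplicatesOut parameter are made up for illustration and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopDedup {

    // Returns the distinct elements of input in first-seen order.
    // Every skipped repeat is appended to duplicatesOut (a hypothetical extra step).
    static List<String> dedup(List<String> input, List<String> duplicatesOut) {
        List<String> result = new ArrayList<>();
        for (String el : input) {
            if (result.contains(el)) {   // linear scan of the result list
                duplicatesOut.add(el);
            } else {
                result.add(el);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("java", "html", "js", "sql", "java");
        List<String> duplicates = new ArrayList<>();
        System.out.println(dedup(list, duplicates)); // [java, html, js, sql]
        System.out.println(duplicates);              // [java]
    }
}
```

This kind of side bookkeeping is awkward to express with the Set and stream approaches, which is the main reason to keep the plain loop around.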

Automatic deduplication with a Set
List<String> list = Arrays.asList("java", "html", "js", "sql", "java");

@Test
public void test8() {
    // A Set never holds duplicates, so copying into it removes them
    Set<String> set = new HashSet<>(list);
    List<String> newList = new ArrayList<>(set);

    System.out.println(newList);
}

Advantages: less code, because a Set never holds duplicate elements.

Disadvantages: uses extra space, requires converting from List to Set and back, which is less elegant, and a HashSet does not guarantee that the original order is kept.
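
If the original order matters, one possible variation is to swap HashSet for LinkedHashSet, which iterates in insertion order. The sketch below is an illustration only; the class name OrderedSetDedup and the generic helper are assumptions, not code from the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedSetDedup {

    // Copies the input into a LinkedHashSet (drops repeats, keeps first-seen order)
    // and then back into a new ArrayList.
    static <T> List<T> dedup(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("java", "html", "js", "sql", "java");
        System.out.println(dedup(list)); // [java, html, js, sql]
    }
}
```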

Java 8 one-step deduplication with distinct()
List<String> list2 = Arrays.asList("java", "html", "js", "sql", "java");

@Test
public void test9() {
    // Deduplicate with the Java 8 Stream API
    list2.stream()
         .distinct()
         .forEach(System.out::println);
}

Advantages: uses the Java 8 Stream API, reads as a fluent pipeline (much like SQL), and keeps the code elegant.

Disadvantages: no extra logic can be run inside the deduplication step itself.
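
The example above only prints the elements. When the deduplicated values are needed as a list, the stream can be collected instead. This is a minimal sketch; the class name StreamDedup and the helper method are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamDedup {

    // distinct() keeps the first occurrence of each element of an ordered stream;
    // Collectors.toList() gathers the survivors into a new list.
    static List<String> dedup(List<String> input) {
        return input.stream()
                    .distinct()
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list2 = Arrays.asList("java", "html", "js", "sql", "java");
        System.out.println(dedup(list2)); // [java, html, js, sql]
    }
}
```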

Caveats when using distinct()

distinct() is easy to use, but how does it decide that two elements are duplicates? For reference types, does it compare by address or by some field?

  • Example where distinct() fails to deduplicate

    List<Employee> list = Arrays.asList(
                new Employee("Xiao Ming", 18),
                new Employee("Daisy", 38),
                new Employee("Zhang", 6),
                new Employee("Xiao Ming", 18));

    class Employee {
        private String name;
        private Integer age;

        public Employee(String name, Integer age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return "Employee{" +
                    "name='" + name + '\'' +
                    ", age=" + age +
                    '}';
        }
    }

    @Test
    public void test9() {
        list.stream()
          .distinct()
          .forEach(System.out::println);
    }

    Console output:

    Employee{name='Xiao Ming', age=18}
    Employee{name='Daisy', age=38}
    Employee{name='Zhang', age=6}
    Employee{name='Xiao Ming', age=18}

    Result: distinct() does not deduplicate the custom objects. Does distinct() not work on reference types? String is a reference type too, yet it is deduplicated correctly. So what is the difference between the custom Employee class and String? The answer is the hashCode() and equals() methods: String overrides both, but the custom Employee class does not, so it inherits the identity-based versions from Object.

    Note: hash-based Map and Set implementations (such as HashMap and HashSet), as well as Java 8's distinct(), rely on hashCode() and equals() to detect duplicates, so remember to override both methods.

  • Fix and retest

    class Employee {
        private String name;
        private Integer age;
        // ...

        // Override equals() and hashCode()
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Employee employee = (Employee) o;
            return Objects.equals(name, employee.name) &&
              Objects.equals(age, employee.age);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
        // ...
    }

    @Test
    public void test9() {
        list.stream()
          .distinct()
          .forEach(System.out::println);
    }

    Console output:

    Employee{name='Xiao Ming', age=18}
    Employee{name='Daisy', age=38}
    Employee{name='Zhang', age=6}

    Once hashCode() and equals() are overridden, distinct() removes the duplicate.
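
The same rule applies when deduplicating with a HashSet. As a rough sketch, the two classes below (PlainKey and ValueKey, both hypothetical names made up for this illustration) differ only in whether they override equals() and hashCode():

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {

    // Does not override equals()/hashCode(): two instances are equal only if
    // they are the same object.
    static class PlainKey {
        final String name;
        PlainKey(String name) { this.name = name; }
    }

    // Overrides both methods: two instances with the same name are equal.
    static class ValueKey {
        final String name;
        ValueKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(name, ((ValueKey) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Adds two separately created but field-identical PlainKey objects.
    static int plainSetSize() {
        Set<PlainKey> set = new HashSet<>();
        set.add(new PlainKey("Xiao Ming"));
        set.add(new PlainKey("Xiao Ming"));
        return set.size();
    }

    // Adds two separately created but field-identical ValueKey objects.
    static int valueSetSize() {
        Set<ValueKey> set = new HashSet<>();
        set.add(new ValueKey("Xiao Ming"));
        set.add(new ValueKey("Xiao Ming"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(plainSetSize()); // 2: both objects are kept
        System.out.println(valueSetSize()); // 1: the second object is a duplicate
    }
}
```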

Summary

Whether you deduplicate with a Set or with the Java 8 Stream distinct() method, equality is decided by the element's hashCode() and equals() implementations, so custom classes must override both.