I. Origin of the problem

In day-to-day development we often run into this situation: a collection contains many duplicate objects, the objects have no primary key, and the business requires us to filter out the duplicates according to some condition.

The brute-force approach is a two-level loop driven by the business rule: an element that is not yet in the new collection is added to it, and an element that is already there is skipped.

Create an entity object PenBean as follows:

public class PenBean {
    /** type */
    private String type;
    /** color */
    private String color;
    // ...
    public PenBean(String type, String color) {
        this.type = type;
        this.color = color;
    }
    @Override
    public String toString() {
        return "PenBean{" +
                "type='" + type + '\'' +
                ", color='" + color + '\'' +
                '}';
    }
}

Test demo as follows:

public static void main(String[] args) {
    List<PenBean> penBeanList = new ArrayList<PenBean>();
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("pencil", "white"));
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    List<PenBean> newPenBeanList = new ArrayList<PenBean>();
    for (PenBean penBean : penBeanList) {
        if (newPenBeanList.isEmpty()) {
            newPenBeanList.add(penBean);
        } else {
            boolean isSame = false;
            for (PenBean newPenBean : newPenBeanList) {
                // If the new list already holds an equal element, skip it
                if (penBean.getType().equals(newPenBean.getType())
                        && penBean.getColor().equals(newPenBean.getColor())) {
                    isSame = true;
                    break;
                }
            }
            if (!isSame) {
                newPenBeanList.add(penBean);
            }
        }
    }
    // Output the result
    System.out.println("=============== new data");
    for (PenBean penBean : newPenBeanList) {
        System.out.println(penBean.toString());
    }
}

Output result:

=============== new data
PenBean{type='pencil', color='black'}
PenBean{type='pencil', color='white'}
PenBean{type='neutral pen', color='white'}

The same approach also works when the data is held in an array: loop over the elements and collect only those that have not been seen before.

Is there a simpler way to write it?

Yes: the contains() method of List!

II. Using the List contains() method to remove duplicates

Before using contains(), we must override equals() on the PenBean class. Why do we do this? More on that later!

We’ll start by overriding equals() in the PenBean class as follows:

// requires: import java.util.Objects;
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    PenBean penBean = (PenBean) o;
    // Return true when both type and color are equal
    return Objects.equals(type, penBean.type) &&
            Objects.equals(color, penBean.color);
}

Modify the test demo as follows:

public static void main(String[] args) {
    List<PenBean> penBeanList = new ArrayList<PenBean>();
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("pencil", "white"));
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    List<PenBean> newPenBeanList = new ArrayList<PenBean>();
    // Use contains() to check whether an equal element is already present
    for (PenBean penBean : penBeanList) {
        if (!newPenBeanList.contains(penBean)) {
            newPenBeanList.add(penBean);
        }
    }
    // Output the result
    System.out.println("=============== new data");
    for (PenBean penBean : newPenBeanList) {
        System.out.println(penBean.toString());
    }
}

The following output is displayed:

=============== new data
PenBean{type='pencil', color='black'}
PenBean{type='pencil', color='white'}
PenBean{type='neutral pen', color='white'}

If PenBean does not override equals(), contains() returns false for every element! The new list then ends up identical to the source list, and the duplicates are not removed.

So how does contains() determine whether the list already holds an equal element?

Let's open the source code of ArrayList's contains() method:
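To see this effect concretely, here is a minimal sketch that compares two hypothetical classes: PlainPen, which does not override equals(), and EqualPen, which does. The class names and the ContainsDemo wrapper are illustrative assumptions and are not part of the original demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class ContainsDemo {
    /** Hypothetical class that does NOT override equals(). */
    static class PlainPen {
        final String type;
        final String color;
        PlainPen(String type, String color) { this.type = type; this.color = color; }
    }

    /** Hypothetical class that overrides equals() and hashCode(). */
    static class EqualPen {
        final String type;
        final String color;
        EqualPen(String type, String color) { this.type = type; this.color = color; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            EqualPen other = (EqualPen) o;
            return Objects.equals(type, other.type) && Objects.equals(color, other.color);
        }

        @Override
        public int hashCode() { return Objects.hash(type, color); }
    }

    static boolean plainContains() {
        List<PlainPen> list = new ArrayList<>();
        list.add(new PlainPen("pencil", "black"));
        // Object.equals() compares references, so a second instance is not found.
        return list.contains(new PlainPen("pencil", "black"));
    }

    static boolean equalContains() {
        List<EqualPen> list = new ArrayList<>();
        list.add(new EqualPen("pencil", "black"));
        // The overridden equals() compares field values, so the match is found.
        return list.contains(new EqualPen("pencil", "black"));
    }

    public static void main(String[] args) {
        System.out.println(plainContains()); // false
        System.out.println(equalContains()); // true
    }
}
```

The only difference between the two classes is the equals()/hashCode() pair, which is what turns the lookup from a reference comparison into a value comparison.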

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}

contains() delegates to indexOf(o). Let's look at the source of indexOf(o):

public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

At this point it is clear: if the object passed in is null, the for loop looks for a null element in the backing array and returns its index; if the object passed in is not null, the for loop uses the object's equals() method to look for an equal element and returns its index!

If a matching element is found, the returned index is greater than or equal to 0; otherwise indexOf() returns -1!

This is why an object must override equals() before List's contains() method can be used to remove duplicates!
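The return values described above can be checked with a short sketch. It uses String elements (which already implement equals()); the IndexOfDemo class and its sample() helper are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexOfDemo {
    /** Builds a small list that also contains a null element. */
    static List<String> sample() {
        List<String> list = new ArrayList<>();
        list.add("pencil");
        list.add(null); // ArrayList permits null elements
        list.add("neutral pen");
        return list;
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf("pencil"));   // 0  (found via equals())
        System.out.println(list.indexOf(null));       // 1  (found via == null)
        System.out.println(list.indexOf("brush"));    // -1 (not found)
        System.out.println(list.contains("brush"));   // false
        // A different String instance with the same characters is still found.
        System.out.println(list.contains(new String("pencil"))); // true
    }
}
```

Note that index 0 is a valid hit, which is why contains() tests for `>= 0` rather than `> 0`.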

III. Removing duplicates with Java 8 streams

Of course, some readers may think of the stream API introduced in JDK 1.8. With it, deduplicating the collection elements can be rewritten as follows:

public static void main(String[] args) {
    List<PenBean> penBeanList = new ArrayList<PenBean>();
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("pencil", "white"));
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    // Use the Java 8 stream API to remove duplicates
    List<PenBean> newPenBeanList = penBeanList.stream().distinct().collect(Collectors.toList());
    // Output the result
    System.out.println("========= new data ======");
    for (PenBean penBean : newPenBeanList) {
        System.out.println(penBean.toString());
    }
}

distinct() relies on hashCode() and equals() to identify distinct elements, so to use it the element class needs to override both the hashCode() and equals() methods!

Override hashCode() in the PenBean class as follows:

@Override
public int hashCode() {
    return Objects.hash(type, color);
}

When you run the test demo, the result is as follows:

========= new data ======
PenBean{type='pencil', color='black'}
PenBean{type='pencil', color='white'}
PenBean{type='neutral pen', color='white'}

This also removes the duplicate elements from the collection!

So why is no overriding needed when we use String objects as collection elements?

Because String, a built-in Java class, already overrides both methods. Its source code is as follows:
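As a self-contained illustration of distinct(), the sketch below uses a hypothetical Pen class (a stand-in for PenBean, with equals() and hashCode() already overridden) and a hypothetical dedupe() helper. For an ordered stream such as one created from a List, distinct() is documented to be stable, so the first occurrence of each element is kept in its original position.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctDemo {
    /** Hypothetical stand-in for PenBean. */
    static class Pen {
        final String type;
        final String color;
        Pen(String type, String color) { this.type = type; this.color = color; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Pen other = (Pen) o;
            return Objects.equals(type, other.type) && Objects.equals(color, other.color);
        }

        @Override
        public int hashCode() { return Objects.hash(type, color); }

        @Override
        public String toString() { return type + "/" + color; }
    }

    /** Removes duplicates while keeping the first occurrence of each pen. */
    static List<Pen> dedupe(List<Pen> pens) {
        return pens.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Pen> pens = Arrays.asList(
                new Pen("pencil", "black"),
                new Pen("pencil", "white"),
                new Pen("pencil", "black"),
                new Pen("neutral pen", "white"),
                new Pen("neutral pen", "white"));
        System.out.println(dedupe(pens));
        // prints [pencil/black, pencil/white, neutral pen/white]
    }
}
```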

public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
    @Override
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }
    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;
            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
}
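A small sketch can confirm this behaviour from the outside. The StringEqualityDemo class and its unique() helper are hypothetical names used only for illustration; the hash value shown follows from the `h = 31 * h + val[i]` loop in the source above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class StringEqualityDemo {
    /** Collects the given strings into a HashSet, dropping duplicates. */
    static Set<String> unique(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        String a = new String("pencil");
        String b = new String("pencil");
        System.out.println(a == b);                       // false: two distinct objects
        System.out.println(a.equals(b));                  // true: same characters
        System.out.println(a.hashCode() == b.hashCode()); // true: same content, same hash
        // 'a' is 97 and 'b' is 98, so the hash is 31 * 97 + 98
        System.out.println("ab".hashCode());              // 3105
        System.out.println(unique(a, b, "pen").size());   // 2
    }
}
```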

IV. Removing duplicates with HashSet

The sections above covered deduplication with a List. Some readers may wonder whether a HashSet can be used to remove duplicate elements instead.

Indeed, a HashSet by its nature does not hold duplicate elements!

The code in practice is as follows.

Create the PenBean class and override the equals() and hashCode() methods inherited from Object as follows:

import java.util.Objects;
public class PenBean {
    /** type */
    private String type;
    /** color */
    private String color;
    // ...
    public PenBean(String type, String color) {
        this.type = type;
        this.color = color;
    }
    @Override
    public String toString() {
        return "PenBean{" +
                "type='" + type + '\'' +
                ", color='" + color + '\'' +
                '}';
    }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        PenBean penBean = (PenBean) o;
        // Return true when both type and color are equal
        return Objects.equals(type, penBean.type) &&
                Objects.equals(color, penBean.color);
    }
    @Override
    public int hashCode() {
        return Objects.hash(type, color);
    }
}

Create a test demo as follows:

public static void main(String[] args) {
    List<PenBean> penBeanList = new ArrayList<PenBean>();
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("pencil", "white"));
    penBeanList.add(new PenBean("pencil", "black"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    penBeanList.add(new PenBean("neutral pen", "white"));
    List<PenBean> newPenBeanList = new ArrayList<PenBean>();
    // Use a Set to remove duplicates
    Set<PenBean> set = new HashSet<>(penBeanList);
    newPenBeanList.addAll(set);
    // Output the result
    System.out.println("========= new data ======");
    for (PenBean penBean : newPenBeanList) {
        System.out.println(penBean.toString());
    }
}

The following output is displayed:

========= new data ======
PenBean{type='pencil', color='white'}
PenBean{type='pencil', color='black'}
PenBean{type='neutral pen', color='white'}

Very well, the new collection is returned with no duplicate elements!

So how does a HashSet work?

Open the HashSet source and look at the constructor we passed in as follows:

public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}

Obviously, a HashMap object is created first, and then the addAll() method is called. Keep reading!

public boolean addAll(Collection<? extends E> c) {
    boolean modified = false;
    for (E e : c)
        if (add(e))
            modified = true;
    return modified;
}

The add() method is called as follows:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

The element is put into the HashMap as a key, and the value PRESENT is a constant created with new Object()!

private static final Object PRESENT = new Object();

It’s pretty clear at this point that adding an element to a HashSet is equivalent to putting a key into a HashMap:

Map<Object, Object> map = new HashMap<Object, Object>();
map.put(e, new Object()); // e is the element being inserted

The inserted element e is the key in the HashMap!

We know that HashMap uses hashCode() and equals() to decide whether an inserted key is the same as an existing key. Therefore, once PenBean overrides equals() and hashCode() so that pens with the same type and color are treated as the same key, the duplicates are removed!

Finally, we pass the deduplicated HashSet to ArrayList's addAll() method to obtain a list with no duplicate elements.
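The return values involved can be observed directly. In the minimal sketch below, the AddReturnDemo class, the addTwice() helper and the local `present` object (standing in for HashSet's private PRESENT constant) are hypothetical names for illustration only: HashMap.put() returns null for a new key and the previous value for an existing key, which is why add() reports false for a duplicate.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AddReturnDemo {
    /** Adds the same element twice and returns both results of add(). */
    static boolean[] addTwice(Set<String> set, String element) {
        boolean first = set.add(element);
        boolean second = set.add(element);
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        Object present = new Object(); // stands in for HashSet's PRESENT
        Map<String, Object> map = new HashMap<>();
        System.out.println(map.put("pencil", present) == null); // true: new key
        System.out.println(map.put("pencil", present) == null); // false: key already exists
        System.out.println(map.size());                         // 1

        Set<String> set = new HashSet<>();
        boolean[] results = addTwice(set, "pencil");
        System.out.println(results[0]); // true: element was added
        System.out.println(results[1]); // false: duplicate was rejected
        System.out.println(set.size()); // 1
    }
}
```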
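One caveat: HashSet makes no guarantee about iteration order, which is why the output above does not follow the order of the source list. If the original order matters, one option (not used in the demo above) is java.util.LinkedHashSet, which iterates in insertion order. The following is a minimal sketch under that assumption; the OrderedDedupDemo class and the dedupeKeepingOrder() helper are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class OrderedDedupDemo {
    /** Removes duplicates and keeps the first occurrence of each element in order. */
    static <T> List<T> dedupeKeepingOrder(List<T> source) {
        // LinkedHashSet rejects duplicates like HashSet but remembers insertion order.
        return new ArrayList<>(new LinkedHashSet<>(source));
    }

    public static void main(String[] args) {
        List<String> pens = Arrays.asList(
                "pencil", "neutral pen", "pencil", "brush", "neutral pen");
        System.out.println(dedupeKeepingOrder(pens));
        // prints [pencil, neutral pen, brush]
    }
}
```

As with HashSet, the element class still needs consistent equals() and hashCode() implementations for this to work on custom objects such as PenBean.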