
There are many ways to remove duplicates from a List. I have collected and analyzed several commonly used ones. If you spot any mistakes or omissions, corrections are welcome.

HashSet

This approach relies on the fact that a Set cannot contain duplicate elements. The original order is not retained. Objects created with new are not de-duplicated directly: two instances with the same field values count as distinct unless their class overrides equals() and hashCode().

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Test {
    public static void main(String[] args) {
        // 1. Construct a List
        List<Person> list = new ArrayList<>();
        Person p = new Person("Zhang", 20, "male");
        list.add(p);
        list.add(new Person("Xiao Li", 20, "female"));
        list.add(new Person("Xiao Li", 20, "female"));
        // 2. De-duplicate
        Set<Person> hashSet = new HashSet<>(list);
        List<Person> newList = new ArrayList<>(hashSet);
        // Xiao Li is not de-duplicated unless Person overrides equals() and hashCode()
        System.out.println(newList);
    }
}
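To make the point about objects created with new concrete, here is a minimal sketch of a Person class that overrides equals() and hashCode(). The article does not show its Person class, so the fields and the EqualsHashCodeDemo helper below are assumptions made for illustration. With these two methods in place, the HashSet treats the two "Xiao Li" entries as the same element.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical Person entity; the fields mirror the constructor calls used above.
class Person {
    private final String name;
    private final int age;
    private final String gender;

    Person(String name, int age, String gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }

    // Two Person objects are equal when all three fields match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return age == other.age
                && Objects.equals(name, other.name)
                && Objects.equals(gender, other.gender);
    }

    // Must agree with equals() so that equal objects share a hash bucket.
    @Override
    public int hashCode() {
        return Objects.hash(name, age, gender);
    }
}

class EqualsHashCodeDemo {
    // Number of distinct elements as seen by a HashSet.
    static int distinctCount(List<Person> list) {
        return new HashSet<>(list).size();
    }

    public static void main(String[] args) {
        List<Person> list = new ArrayList<>();
        list.add(new Person("Zhang", 20, "male"));
        list.add(new Person("Xiao Li", 20, "female"));
        list.add(new Person("Xiao Li", 20, "female"));
        System.out.println(distinctCount(list)); // prints 2
    }
}
```

Without the two overrides, Person inherits identity-based equals() and hashCode() from Object, and the same call would report 3.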

HashSet + ArrayList

Use a HashSet to check whether each element has already been seen, and add it to a new List only if it has not. This method preserves the original order after de-duplication. As with HashSet alone, objects created with new are de-duplicated only if their class overrides equals() and hashCode().

Set<Person> hashSet = new HashSet<>();
List<Person> newList = new ArrayList<>();
for (Iterator<Person> iter = list.iterator(); iter.hasNext();) {
    Person element = iter.next();
    // add() returns false if the element is already in the set
    if (hashSet.add(element)) {
        newList.add(element);
    }
}
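For a version that can be run as-is, the sketch below applies the same loop to a list of strings. It also shows java.util.LinkedHashSet, which is not part of the original article but gives the same order-preserving result in one line because it remembers insertion order. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class KeepOrderDedupDemo {
    // Same idea as the loop above: Set.add returns false for an element that
    // is already present, so only first occurrences reach the result list.
    static <T> List<T> withHashSet(List<T> list) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T element : list) {
            if (seen.add(element)) {
                result.add(element);
            }
        }
        return result;
    }

    // Alternative not covered in the article: LinkedHashSet keeps insertion order.
    static <T> List<T> withLinkedHashSet(List<T> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(withHashSet(list));       // prints [b, a, c]
        System.out.println(withLinkedHashSet(list)); // prints [b, a, c]
    }
}
```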

TreeSet

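This approach relies on the fact that a TreeSet cannot contain duplicate elements. The result is sorted rather than kept in the original order: natural ordering by default, or a custom order if a Comparator is supplied.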

Set<String> treeSet = new TreeSet<String>(list);
List<String> newList = new ArrayList<>(treeSet);
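The snippet above uses natural ordering. As a rough sketch of the custom sorting option, the example below passes a Comparator to the TreeSet constructor. The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {
    // Natural ordering: for strings this is lexicographic order.
    static List<String> dedupeNatural(List<String> list) {
        return new ArrayList<>(new TreeSet<>(list));
    }

    // Custom ordering: the comparator decides both the sort order and
    // which elements count as duplicates (compare(a, b) == 0).
    static List<String> dedupeSorted(List<String> list, Comparator<String> order) {
        Set<String> treeSet = new TreeSet<>(order);
        treeSet.addAll(list);
        return new ArrayList<>(treeSet);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(dedupeNatural(list));                           // prints [apple, fig, pear]
        System.out.println(dedupeSorted(list, Comparator.reverseOrder())); // prints [pear, fig, apple]
    }
}
```

Note that a TreeSet decides equality with the comparator, not with equals(). With String.CASE_INSENSITIVE_ORDER, for example, "Apple" and "apple" would be treated as duplicates and only the first one added would be kept.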

ArrayList

This approach uses two lists: iterate over the original List and add each element to a new List only if the new List does not already contain it. This preserves the original order. contains() relies on equals(), so objects created with new are de-duplicated only if their class overrides equals().

List<String> newList = new ArrayList<>();
for (int i = 0; i < list.size(); i++) {
    if (!newList.contains(list.get(i))) {
        newList.add(list.get(i));
    }
}

Java 8 Stream

De-duplicating with a stream preserves the original order. As with HashSet, objects created with new are de-duplicated only if their class overrides equals() and hashCode().

Note: a stream does not modify the original collection, so the result of the operation must be assigned to a new collection.

List<String> newList = list.stream().distinct().collect(Collectors.toList());
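As a small runnable illustration of the note above, the sketch below de-duplicates a list of strings and then prints the original list to show that it is unchanged. The class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamDistinctDemo {
    // distinct() keeps the first occurrence of each element, in encounter order.
    static <T> List<T> dedupe(List<T> list) {
        return list.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        List<String> newList = dedupe(list);
        System.out.println(newList); // prints [b, a, c]
        System.out.println(list);    // prints [b, a, b, c, a], the original is untouched
    }
}
```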

Custom method for de-duplicating entities by attribute

None of the methods above can de-duplicate based on an attribute of the entity, so that has to be implemented with a custom method. This one uses the stream's filter method and preserves the original order.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class Test {
    public static void main(String[] args) {
        // 1. Construct a List ...
        // 2. De-duplicate by name and age
        List<Person> newList = list.stream()
                .filter(distinctByKey(o -> o.getName() + ";" + o.getAge()))
                .collect(Collectors.toList());
        System.out.println(newList);
    }

    /**
     * Custom de-duplication predicate.
     * @param <T> type of the entity to de-duplicate
     * @param keyExtractor computes the de-duplication key, e.g. o -> o.getName() + ";" + o.getAge()
     * @return a predicate that is true only the first time a key is seen
     */
    private static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
}
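The predicate returned by distinctByKey carries state (the map of keys seen so far), which is easy to overlook. The sketch below repeats the helper so that it compiles on its own and applies it to strings, using the first letter as a stand-in key. The class name and the firstPerInitial method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {
    // Same helper as above, repeated so this example is self-contained.
    static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Keeps the first word for each initial letter, in the original order.
    static List<String> firstPerInitial(List<String> words) {
        return words.stream()
                .filter(distinctByKey(w -> w.charAt(0)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
        System.out.println(firstPerInitial(words)); // prints [apple, banana, cherry]

        // The predicate remembers every key it has seen, so reusing one
        // instance on a second stream filters out everything.
        Predicate<String> byInitial = distinctByKey(w -> w.charAt(0));
        System.out.println(words.stream().filter(byInitial).count()); // prints 3
        System.out.println(words.stream().filter(byInitial).count()); // prints 0
    }
}
```

Two practical consequences: call distinctByKey once per stream, and make sure the key is never null, because ConcurrentHashMap rejects null keys with a NullPointerException.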

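Stream + TreeSet for de-duplicating entities by attribute

Collecting the stream into a TreeSet de-duplicates by whatever the Comparator compares. The original order is not retained: the result is sorted by that Comparator, or by natural ordering if none is given.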

List<Person> newList = list.stream().collect(Collectors.collectingAndThen(
    Collectors.toCollection(() -> new TreeSet<>(
        Comparator.comparing(person -> person.getName() + ";" + person.getAge()))),
    ArrayList::new));
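To see what "does not retain the original order" means in practice, here is a small sketch that applies the same collector to strings, with string length standing in for the entity attribute. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class StreamTreeSetDemo {
    // De-duplicates by length: strings of equal length count as duplicates,
    // and the TreeSet keeps the first one it receives.
    static List<String> dedupeByLength(List<String> words) {
        return words.stream().collect(Collectors.collectingAndThen(
                Collectors.toCollection(
                        () -> new TreeSet<String>(Comparator.comparing(String::length))),
                ArrayList::new));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ccc", "a", "bb", "dd", "a");
        // Sorted by the key (length), not by position in the input;
        // "dd" is dropped because "bb" has the same length.
        System.out.println(dedupeByLength(words)); // prints [a, bb, ccc]
    }
}
```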

Conclusion

Stream is the least efficient, taking about five times as long as HashSet + ArrayList.

Suggestion: when performance is not critical, use the stream approach, since its code is the simplest. When performance matters, use HashSet + ArrayList, or ArrayList, which is slightly slower but still more efficient than HashSet alone. When a custom sort order is needed, use TreeSet.

             HashSet     HashSet + ArrayList   TreeSet            ArrayList        stream
Writing      simple      more complex          simple             more complex     simplest
Efficiency   lower       high                  lower              high             lowest
Order        unordered   original order        natural ordering   original order   original order
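Efficiency figures like these depend on list size, element type, and the proportion of duplicates, so it is worth measuring on your own data. The sketch below is a rough timing harness, not a rigorous benchmark (a tool such as JMH would be more reliable). The workload, class name, and method names are all assumptions made for illustration, and no particular result is implied.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class DedupTimingSketch {
    static List<Integer> hashSetPlusList(List<Integer> list) {
        Set<Integer> seen = new HashSet<>();
        List<Integer> result = new ArrayList<>();
        for (Integer element : list) {
            if (seen.add(element)) {
                result.add(element);
            }
        }
        return result;
    }

    static List<Integer> streamDistinct(List<Integer> list) {
        return list.stream().distinct().collect(Collectors.toList());
    }

    // Runs the task a few times to warm up the JIT, then times one more run.
    static long timeNanos(Supplier<List<Integer>> task) {
        for (int i = 0; i < 5; i++) {
            task.get();
        }
        long start = System.nanoTime();
        task.get();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 200,000 values drawn from 1,000 distinct ones.
        Random random = new Random(42);
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 200_000; i++) {
            list.add(random.nextInt(1_000));
        }
        System.out.println("HashSet + ArrayList: " + timeNanos(() -> hashSetPlusList(list)) + " ns");
        System.out.println("stream distinct:     " + timeNanos(() -> streamDistinct(list)) + " ns");
    }
}
```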