1. Overview

Some of our recent business features needed Fork/Join optimization. This article walks through how I applied that optimization in code, based on a real case.

What is Fork/Join?

The Fork/Join framework is a multithreaded task-execution framework whose thread pool, ForkJoinPool, implements the ExecutorService interface. It splits a large task into several smaller tasks that run concurrently, which makes full use of the available CPU cores and improves the application's execution efficiency.
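To make the split-and-combine idea concrete, here is a minimal sketch that sums an array with a RecursiveTask. The class name SumTask, the helper parallelSum, and the threshold of 1,000 are assumptions chosen for illustration; they are not part of the student-import code in this article.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class: sums numbers[start, end) by recursive splitting.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative value only

    private final int[] numbers;
    private final int start; // inclusive
    private final int end;   // exclusive

    SumTask(int[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        // Small enough: do the work directly in this thread.
        if (end - start <= THRESHOLD) {
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        // Otherwise split the range into two halves.
        int middle = start + (end - start) / 2;
        SumTask left = new SumTask(numbers, start, middle);
        SumTask right = new SumTask(numbers, middle, end);
        left.fork();                     // schedule the left half asynchronously
        long rightSum = right.compute(); // compute the right half in this thread
        return left.join() + rightSum;   // wait for the left half and combine
    }

    static long parallelSum(int[] numbers) {
        return ForkJoinPool.commonPool().invoke(new SumTask(numbers, 0, numbers.length));
    }

    public static void main(String[] args) {
        int[] numbers = new int[10_000];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = i + 1;
        }
        System.out.println(parallelSum(numbers)); // prints 50005000
    }
}
```

The task either does the work itself (small range) or splits, forks, and joins (large range). The student-import task later in this article follows the same pattern.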


2. Business logic

In the back office, students are imported in batches from Excel. Each imported row must be compared with the data already in the database: if the username already exists, the row is not imported. (PS: in real business the deduplication may apply to other kinds of data, but the logic is the same. This article uses student data as the example.)

Given:

100,000 rows in the MySQL student table

10,000 rows in the Excel file

3. Approach

3.1 First version

To reduce the pressure on the database, the comparison is done in memory:

The first step is to read all the database rows into Java memory (dbData).

The second step is to deduplicate the Excel data against itself.

The third step is to iterate over the Excel data and check whether each username already exists in dbData.
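The three steps can be sketched in miniature on plain username strings. This is only an illustration under assumptions: the class FirstVersionSketch and the method filterNew are hypothetical names, and a LinkedHashSet stands in for the distinctByKey helper used in the full code in section 4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class working on usernames only, not on Student objects.
class FirstVersionSketch {

    // Step 1 is assumed to have happened already: dbUsernames is in memory.
    static List<String> filterNew(List<String> dbUsernames, List<String> excelUsernames) {
        // Step 2: deduplicate the Excel side, keeping the first occurrence of each name.
        List<String> distinctExcel = new ArrayList<>(new LinkedHashSet<>(excelUsernames));

        // Step 3: keep only the usernames that are not in the database yet.
        // List.contains is a linear scan, the same cost profile as the first version.
        List<String> toImport = new ArrayList<>();
        for (String username : distinctExcel) {
            if (!dbUsernames.contains(username)) {
                toImport.add(username);
            }
        }
        return toImport;
    }

    public static void main(String[] args) {
        List<String> db = Arrays.asList("alice", "bob");
        List<String> excel = Arrays.asList("bob", "carol", "carol", "dave");
        System.out.println(filterNew(db, excel)); // prints [carol, dave]
    }
}
```

Because every Excel row triggers a scan over the whole database list, the work grows with the product of the two sizes, which is why step 3 dominates the running time.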

With 100,000 database rows and 10,000 Excel rows, the filtering takes 8049 ms.

3.2 Second version (adding Fork/Join)

Given the business logic, my idea is to optimize the third step above: split the filtering of the 10,000 Excel rows into two tasks of 5,000 rows each that run at the same time.

Other factors being equal, this should roughly double the speed of the third step.

Implementation

The first step is to write a task class that returns a value (a RecursiveTask) and defines:

THRESHOLD_NUM (the number of rows processed by a single task)

start, end (the index range of the Excel rows that the task handles)

The second step is to implement the splitting logic and the actual work in compute:

When the number of rows to process is less than or equal to THRESHOLD_NUM, the task performs the filtering directly.

When the number of rows to process is greater than THRESHOLD_NUM, the task is split into subtasks and their result sets are merged (PS: this is a recursive call).
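The split-and-merge rule for compute can be sketched on plain username strings with a tiny threshold, so the recursion is easy to trace by hand. The class RangeFilterTask, the method filter, and the configurable threshold are assumptions made for this sketch. The ranges are half-open (start inclusive, end exclusive), so the right half starts exactly at middle and no row is skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo task: filters excelUsernames[start, end) against dbUsernames.
class RangeFilterTask extends RecursiveTask<List<String>> {
    private final int threshold;
    private final int start; // inclusive
    private final int end;   // exclusive
    private final List<String> dbUsernames;
    private final List<String> excelUsernames;

    RangeFilterTask(int threshold, int start, int end,
                    List<String> dbUsernames, List<String> excelUsernames) {
        this.threshold = threshold;
        this.start = start;
        this.end = end;
        this.dbUsernames = dbUsernames;
        this.excelUsernames = excelUsernames;
    }

    @Override
    protected List<String> compute() {
        if (end - start <= threshold) {
            // At or below the threshold: filter this range directly.
            List<String> result = new ArrayList<>();
            for (String username : excelUsernames.subList(start, end)) {
                if (!dbUsernames.contains(username)) {
                    result.add(username);
                }
            }
            return result;
        }
        // Above the threshold: split into [start, middle) and [middle, end).
        int middle = start + (end - start) / 2;
        RangeFilterTask left = new RangeFilterTask(threshold, start, middle, dbUsernames, excelUsernames);
        RangeFilterTask right = new RangeFilterTask(threshold, middle, end, dbUsernames, excelUsernames);
        left.fork();
        List<String> rightResult = right.compute();
        // Merge the two result sets, left part first to preserve the input order.
        List<String> merged = new ArrayList<>(left.join());
        merged.addAll(rightResult);
        return merged;
    }

    static List<String> filter(List<String> db, List<String> excel, int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        return ForkJoinPool.commonPool()
                .invoke(new RangeFilterTask(threshold, 0, excel.size(), db, excel));
    }

    public static void main(String[] args) {
        List<String> db = Arrays.asList("u2", "u4");
        List<String> excel = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        System.out.println(filter(db, excel, 2)); // prints [u1, u3, u5]
    }
}
```

With a threshold of 2, the range [0, 5) splits into [0, 2) and [2, 5), and [2, 5) splits again into [2, 3) and [3, 5). Every index is visited exactly once.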

The third step is to invoke the task from the business layer.

With 100,000 database rows and 10,000 Excel rows, the filtering takes 4319 ms.

The speed of the filtering step nearly doubled.

4. Code

package com.itbbs.service;

import com.itbbs.ArrayListUtil;
import com.itbbs.DataUtil;
import com.itbbs.pojo.Student;
import com.itbbs.task.DistinctTask;

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

public class StudentService {

    /** First version: single-threaded filtering. */
    public void importStudent(List<Student> dbData, List<Student> excelData) {
        // Deduplicate the Excel data itself by username
        excelData = excelData.stream()
                .filter(ArrayListUtil.distinctByKey(Student::getUsername))
                .collect(Collectors.toList());
        List<Student> repetitionData = new ArrayList<>();
        long s = System.currentTimeMillis();
        // Iterate over all Excel data
        for (Student data : excelData) {
            String username = data.getUsername();
            // Check whether it exists in dbData
            if (!ArrayListUtil.isInclude(dbData, username)) {
                // If not present, add it
                repetitionData.add(data);
            }
        }
        long e = System.currentTimeMillis();
        System.out.println("Filter time :" + (e - s) + "ms");
        // repetitionData = repetitionData.stream()
        //         .sorted(Comparator.comparing(Student::getUsername)) // sort by username
        //         .collect(Collectors.toList());
        // repetitionData.forEach(p -> System.out.println(p.getUsername()));
    }

    /** Second version: filtering with Fork/Join. */
    public void importStudent2(List<Student> dbData, List<Student> excelData) {
        // Deduplicate the Excel data itself by username
        excelData = excelData.stream()
                .filter(ArrayListUtil.distinctByKey(Student::getUsername))
                .collect(Collectors.toList());
        long s = System.currentTimeMillis();
        ForkJoinPool fjp = new ForkJoinPool();
        DistinctTask task = new DistinctTask(0, excelData.size(), dbData, excelData);
        List<Student> repetitionData = fjp.invoke(task);
        long e = System.currentTimeMillis();
        System.out.println("Filter time :" + (e - s) + "ms");
        // repetitionData = repetitionData.stream()
        //         .sorted(Comparator.comparing(Student::getUsername)) // sort by username
        //         .collect(Collectors.toList());
        // repetitionData.forEach(p -> System.out.println(p.getUsername()));
    }

    public static void main(String[] args) {
        List<Student> dbData = DataUtil.getDbData();
        List<Student> excelData = DataUtil.getExcelData();
        new StudentService().importStudent(dbData, excelData);
        new StudentService().importStudent2(dbData, excelData);
    }
}

package com.itbbs.task;

import com.itbbs.ArrayListUtil;
import com.itbbs.pojo.Student;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveTask;

public class DistinctTask extends RecursiveTask<List<Student>> {

    // Number of rows processed by a single task
    private static final int THRESHOLD_NUM = 5000;

    // Range of Excel rows handled by this task: start inclusive, end exclusive
    private int start, end;
    private List<Student> dbData;
    private List<Student> excelData;

    public DistinctTask(int start, int end, List<Student> dbData, List<Student> excelData) {
        this.start = start;
        this.end = end;
        this.dbData = dbData;
        this.excelData = excelData;
    }

    @Override
    protected List<Student> compute() {
        int size = end - start;
        if (size <= THRESHOLD_NUM) {
            // Small enough: filter this range directly
            List<Student> repetitionData = new ArrayList<>();
            // Iterate over the Excel rows in [start, end)
            for (Student data : excelData.subList(start, end)) {
                String username = data.getUsername();
                // Check whether it exists in dbData
                if (!ArrayListUtil.isInclude(dbData, username)) {
                    // If not present, add it
                    repetitionData.add(data);
                }
            }
            return repetitionData;
        } else {
            // Too large: split the range into two halves
            int middle = (start + end) / 2;
            DistinctTask left = new DistinctTask(start, middle, dbData, excelData);
            // The right half starts at middle (not middle + 1) because end is exclusive
            DistinctTask right = new DistinctTask(middle, end, dbData, excelData);
            // Execute the subtasks
            left.fork();
            right.fork();
            List<Student> lResult = left.join();
            List<Student> rResult = right.join();
            // Merge the result sets
            lResult.addAll(rResult);
            return lResult;
        }
    }
}

package com.itbbs;

import com.itbbs.pojo.Student;

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/**
 * Data source utility class
 * TJX
 */
public class DataUtil {

    /**
     * Simulate reading the database data
     * @return
     */
    public static List<Student> getDbData(){
        List<Student> result = new ArrayList<Student>();
        Random random = new Random();
        for (int i = 0; i <100000 ; i++) {
            Student student = new Student();
            student.setUsername(random.nextInt(99)+"");
            result.add(student);
        }
        return result;
    }

    /**
     * Simulate reading the Excel data
     * @return
     */
    public static List<Student> getExcelData(){
        List<Student> result = new ArrayList<Student>();
        Random random = new Random();
        for (int i = 0; i <10000 ; i++) {
            Student student = new Student();
            student.setUsername(random.nextInt(100000)+"");
            result.add(student);
        }
        return result;
    }
}

package com.itbbs;

import com.itbbs.pojo.Student;

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

public class ArrayListUtil {

    /**
     * Deduplicate elements by key: the predicate accepts only the first element seen for each key
     * @param keyExtractor
     * @param <T>
     * @return
     */
    public static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> map = new ConcurrentHashMap<>();
        return t -> map.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    /**
     * Check whether the list contains a student whose username equals targetValue
     * @param list
     * @param targetValue
     * @return
     */
    public static boolean isInclude(List<Student> list, String targetValue){
        // Linear scan comparing usernames; a String never equals a Student object,
        // so the match must be done on the username field
        for (Student student : list) {
            if (targetValue.equals(student.getUsername())) {
                return true;
            }
        }
        return false;
    }
}

package com.itbbs.pojo;

public class Student {

    private String username;

    private String password;

    public String getUsername() {
        return username;
    }

    public void setUsername(String username) {
        this.username = username;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof Student) {
            Student student = (Student) obj;
            return (username.equals(student.username));
        }
        return super.equals(obj);
    }

    @Override
    public int hashCode() {
        Student student = (Student) this;
        return student.username.hashCode();
    }
}