Prerequisites for the management system (6): Java large data tree directory generation

This is the 4th day of my participation in the August More Text Challenge

preface

After importing all kinds of data with EasyExcel and generating corresponding relations into the library, the next problem is to find out their directory relations. And that’s where the surprises come in, so this article has emerged to document the development process.

The background,

The hierarchical data of a certain power station is shown in the figure. These directory relationships are stored in a relational directory, so the data in this directory table is about 36W.

Before the idea is, to find all the list of data, and then find out the parent directory node, incoming, and list in the parent directory node array traversal step by step, the HTML code is as follows, but in the face of the data directory is helpless, while using the recursive effect execution time is can’t imagine, there will be dozens of minutes of execution. Five levels of recursion, and 36W data to find the target value. The time complexity is 36 w to the fifth. That’s about 168 billion cycles.

private List<WiringPointVO> treePoint(List<WiringPointVO> wiringPointVOList) { List<WiringPointVO> wiringPointVOS = new ArrayList<>(); for (int i = 0; i < wiringPointVOList.size(); i++) { WiringPointVO wiringPointVO = wiringPointVOList.get(i); if (0 ! = wiringPointVO.getParentId().intValue()) { continue; } WiringPointVO wiringPointVO1 = new WiringPointVO(); BeanUtil.copyProperties(wiringPointVO, wiringPointVO1); wiringPointVO1.setChild(subPoint(wiringPointVO, wiringPointVOList)); wiringPointVOS.add(wiringPointVO1); } return wiringPointVOS; }Copy the code

 private List<WiringPointVO> subPoint(WiringPointVO parentEquipment, List<WiringPointVO> wiringPointVOList) {
        List<WiringPointVO> list = new ArrayList<>();
        for (int i = 0; i < wiringPointVOList.size(); i++) {
            WiringPointVO wiringPointVO = wiringPointVOList.get(i);
            if (wiringPointVO.getParentId().equals(parentEquipment.getId())) {
                WiringPointVO wiringPointVO1 = new WiringPointVO();
                BeanUtil.copyProperties(wiringPointVO, wiringPointVO1);
                wiringPointVO1.setChild(subPoint(wiringPointVO, wiringPointVOList));
                list.add(wiringPointVO1);
            }
        }
        return list;
    }
Copy the code

Second, the solution

In the face of this problem, first of all, we need to determine which step is time-consuming. In addition to SQL optimization of 36W data, the most critical thing is how to assemble the directory with large data volume.

Recursion makes the code more brief, but because there is no data processing, the problem will be the need to find several data values from 36W data, and repeated, so, the idea is to group the directory. According to the type of equipment, it can be divided into five groups: box transformer, convergence, inverter, branch and photovoltaic.

Although the five groups of contents include all incoming data, and we only need to search one incoming data, the granularity of these five groups of contents cannot be further divided.

Having identified these five sets of data, we abandoned recursion and manually assembled the catalog data.

The steps are as follows:

1. Find A box change out of 61 box changes according to the incoming line ID. At this time, child is not assigned
2. A box changes are traversed and B bus bus is found out from 522 bus bus. At this time, child is not assigned
3. Iterate over B converters and find C inverters out of 2014 inverters. At this time, child is not assigned
4. C inverters are traversed and D branches are found out from 15579 branches. At this time, child is not assigned
5. Traverse D branches, find E pv from 342738 pv, and assign the corresponding PV list to THE child of D
6. Assemble inverters, traverse C inverters, find the id value of parentId = C from D branches, and then assign a value to the child of C
7. Assemble the bus, traverse B bus, find the id value of parentId = B from C inverter and then assign the value of child of B
ParentId =A, parentId =A, parentId =A, parentId =A, parentId =A, parentId =A, parentId =A, parentId =A
9, return the list of boxes that were finally assembled, and assign the child to the incoming line to complete the assembly of the directory

Step 5 is the most time consuming, requiring D * 342738 cycles. The value of D is estimated to be about 7 billion cycles for 2000 lines. This step has been condensed to a minimum, so the loop can be multithreaded.

Three, specific code

3.1 Service Code

Private WiringPointVO subPoint2(WiringPointVO parentEquipment, List<WiringPointVO> wiringPointVOList) { Recursion is inefficient. List<WiringPointVO> xbBig = new ArrayList<>(); List<WiringPointVO> hlBig = new ArrayList<>(); List<WiringPointVO> nbBig = new ArrayList<>(); List<WiringPointVO> zlBig = new ArrayList<>(); List<WiringPointVO> gfBig = new ArrayList<>(); List<WiringPointVO> xbList = new ArrayList<>(20); List<WiringPointVO> hlList = new ArrayList<>(100); List<WiringPointVO> nbList = new ArrayList<>(500); List<WiringPointVO> zlList = new ArrayList<>(5000); Map<String, List<WiringPointVO>> collect = wiringPointVOList.stream().filter(i->i.getType()! =null).collect(Collectors.groupingBy(WiringPoint::getType)); for (String s : collect.keySet()) { if(s.equals(WmConst.DEV_TYPE_CONST_1.getName())){ xbBig=collect.get(s); } if(s.equals(WmConst.DEV_TYPE_CONST_2.getName())){ hlBig=collect.get(s); } if(s.equals(WmConst.DEV_TYPE_CONST_3.getName())){ nbBig=collect.get(s); } if(s.equals(WmConst.DEV_TYPE_CONST_4.getName())){ zlBig=collect.get(s); } if(s.equals(WmConst.DEV_TYPE_CONST_5.getName())){ gfBig=collect.get(s); } } Long start = System.currentTimeMillis(); ForEach (I ->{if(i.getParentid ().equals(parentEquipment.getid ()))){xblist.add (I); }}); Log.info (" Box loading completed, time {}...") ,System.currentTimeMillis()-start); WiringPointVO xb: WiringPointVO xb: WiringPointVO xb: xbList) { hlBig.forEach(i->{ if(i.getParentId().equals(xb.getId())){ hlList.add(i); }}); } log.info(" Bus box loading finished.. Take {}..." ,System.currentTimeMillis()-start); WiringPointVO hl: null for (WiringPointVO hl: hlList) { nbBig.forEach(i->{ if(i.getParentId().equals(hl.getId())){ nbList.add(i); }}); } log.info(" Inverter loading finished.. Take {}..." ,System.currentTimeMillis()-start); For (WiringPointVO nb: nbList) {zlbig.foreach (I ->{if(i.geparentid ().equals(nb.getid ()))){zllist.add (I); }}); } log.info(" Branch load completed.. Take {}..." ,System.currentTimeMillis()-start); WiringPointVO zL: zlList) {List<WiringPointVO> finalGfBig = gfBig; pool.execute(new Runnable() { @Override public void run() { ArrayList<WiringPointVO> wiringPointVOS = new ArrayList<>();  finalGfBig.forEach(i->{ if(i.getParentId().equals(zl.getId())){ wiringPointVOS.add(i); }}); zl.setChild(wiringPointVOS); }}); } log.info(" Pv loading finished.. Take {}..." ,System.currentTimeMillis()-start); // Map<Integer, List<WiringPointVO>> zlmap = zlList.stream().collect(Collectors.groupingBy(WiringPoint::getParentId)); nbList.forEach(i->{ if(ObjectUtil.isNotNull(zlmap.get(i.getId()))){ i.setChild(zlmap.get(i.getId())); }}); Log.info (" Inverter assembly time :{}...") ,System.currentTimeMillis()-start); // 2, Map<Integer, List<WiringPointVO>> nbmap = nbList.stream().collect(Collectors.groupingBy(WiringPoint::getParentId)); hlList.forEach(i->{ if(ObjectUtil.isNotNull(nbmap.get(i.getId()))){ i.setChild(nbmap.get(i.getId())); }}); Log.info (" Assembly time :{}...") ,System.currentTimeMillis()-start); // add Map<Integer List<WiringPointVO>> hlmap = hlList.stream().collect(Collectors.groupingBy(WiringPoint::getParentId)); xbList.forEach(i->{ if(ObjectUtil.isNotNull(hlmap.get(i.getId()))){ i.setChild(hlmap.get(i.getId())); }}); Log.info (" Last time :{}...") ,System.currentTimeMillis()-start); parentEquipment.setChild(xbList); return parentEquipment; }Copy the code

3.2 the thread pool

/** * @author xiao lei * @Date 2020/11/6 * @module */ public class CustomThreadPoolExecutor { private ThreadPoolExecutor  poolExecutor = null; /** * thread pool initialization method * <p> * corePoolSize corePoolSize ----1 * maximumPoolSize maximumPoolSize ----3 * keepAliveTime Maximum lifetime of idle threads in a thread pool that exceeds the number of corePoolSize ----30+ TimeUnit * TimeUnit keepAliveTime TimeUnit ---- timeunit. MINUTES * workQueue Blocking queue ----new ArrayBlockingQueue<Runnable>(5)====5 capacity blocking queue * threadFactory new threadFactory ----new CustomThreadFactory()==== CustomThreadFactory * RejectedExecutionHandler If the number of submitted tasks exceeds maxmumPoolSize+workQueue * = sleep(100) if the number of submitted tasks exceeds maxmumPoolSize+workQueue Public void init() {poolExecutor = new ThreadPoolExecutor(7, 7, 30); TimeUnit.MINUTES, new ArrayBlockingQueue<Runnable>(300), new CustomThreadFactory(), new CustomRejectedExecutionHandler()); } public void destory() { if (poolExecutor ! = null) { poolExecutor.shutdownNow(); } } public ExecutorService getCustomThreadPoolExecutor() { return this.poolExecutor; } private class CustomThreadFactory implements ThreadFactory { private AtomicInteger count = new AtomicInteger(0); @Override public Thread newThread(Runnable r) { Thread t = new Thread(r); String threadName = CustomThreadPoolExecutor.class.getSimpleName() + count.addAndGet(1); System.out.println(threadName); t.setName(threadName); return t; } } private class CustomRejectedExecutionHandler implements RejectedExecutionHandler { @Override public void RejectedExecution (Runnable r, ThreadPoolExecutor executor) {try { Change the offer from BlockingQueue to put blocking method executor.getQueue().put(r); } catch (InterruptedException e) { e.printStackTrace(); }}}}Copy the code

public class ThreadPoolConfig { public static ExecutorService initPool() { CustomThreadPoolExecutor exec = new CustomThreadPoolExecutor(); // initialize exec.init(); ExecutorService pool = exec.getCustomThreadPoolExecutor(); return pool; }}Copy the code

Thread pool core threads have to be set according to the number of computer cores, which are divided into IO and computational CPU cores.

The problem is computational. The normal IO thread is set to twice the number of cores. For computational threads, set the number of cores to -1.

From a few dozen minutes for the original code, to more than 20 seconds for a single thread, and 4-5 seconds for multiple threads. At this point, queries from the database alone have been optimized to a certain extent, without adding redis cache.

So that’s one way to solve this problem,

Prerequisites for the management system (6): Java large data tree directory generation

preface

The background,

Second, the solution

Three, specific code

3.1 Service Code

3.2 the thread pool

Related Posts

In the face of burst traffic: system and software layer traffic limiting methods

Spring Cloud: Micro-service Environment Construction (Alibaba)

SQLyog failed to connect to MySQL and 1045 error occurred