Preface
Another update: in a spare moment I am writing up a requirement I worked on earlier. Experts are welcome to offer advice, and please go easy on me 👌
The requirement was an Excel export. I thought it would be a simple export, but every record row in the file has a column containing an image URL. All the images for the exported rows need to be downloaded, and then the entire folder needs to be compressed.
The overall flow:
1. Generate the corresponding Excel file.
2. Create a date folder named after the current time (accurate to the millisecond).
3. Write the Excel file into that folder.
4. Download the images by URL into the folder using multiple threads.
5. Compress the entire folder into a .zip file.
6. The interface responds with a download URL, and the page downloads the file.
This post only describes the code for steps 4 and 5. There is nothing special about the other steps, so their code is not covered.
Implementation approach
The multithreaded implementation uses a thread pool together with CompletableFuture from the java.util.concurrent (JUC) package in JDK 1.8.
Step 1: Get the base value
// Number of threads
int threadNum = 10;
// Number of images each thread processes
int dataNum = imageInfoVos.size() / threadNum;
// Indices of the worker tasks (0 .. threadNum - 1)
List<Integer> threadS = new ArrayList<>();
for (int i = 0; i < threadNum; i++) {
    threadS.add(i);
}
First, we have the list of image URLs that need to be downloaded. With multithreaded downloading we must make sure that no image is downloaded by more than one thread. So we split the URL list into slices according to a fixed rule, and each thread downloads only its own slice.
// Split the URL list and download each slice asynchronously
threadS.stream().map(item -> CompletableFuture.runAsync(() -> {
    List<ImageInfoVo> theadItem = imageInfoVos.subList(
            dataNum * item,
            (item + 1) == threadNum
                    ? imageInfoVos.size()
                    : Math.min(dataNum * (item + 1), imageInfoVos.size()));
    threadDownPic(theadItem, item, dirName);
}, threadPoolTaskExecutor)).collect(Collectors.toList()).forEach(item -> {
    try {
        item.get();
    } catch (Exception e) {
        log.error("============ Multithreading down error MSG :{} =============", e.getMessage());
    }
});
Let me break it down.
CompletableFuture.runAsync runs one asynchronous task for each item in threadS.
With threadNum = 10, threadS holds 10 indices, so 10 tasks are submitted (to the thread pool, if one is supplied).
// Run one asynchronous task per item using CompletableFuture.runAsync
// With threadNum = 10 there are 10 items, so 10 tasks are submitted (to the thread pool, if one is supplied)
threadS.stream().map(item -> CompletableFuture.runAsync(() -> {
Rule: based on the item value, subList takes the slice of the URL list that this thread must download, from a start index to an end index.
Example: dataNum is the number of downloads each thread must complete. If dataNum is 100 and item = 0, the start index is dataNum * item = 0 and the end index is Math.min(dataNum * (item + 1), imageInfoVos.size()) = 100. The check (item + 1) == threadNum ? imageInfoVos.size() makes the last thread also take the leftover images that do not divide evenly.
Following this rule, each thread gets its own set of image URLs, and no URL is downloaded twice.
// Based on the item value, subList takes the slice of URLs this thread must download.
// Example: dataNum is the number of downloads each thread must complete.
// For item = 0 the start index is dataNum * item = 0 and the end index is
// Math.min(dataNum * (item + 1), imageInfoVos.size()).
// Following this rule, each thread gets its own image URLs with no overlap.
// (item + 1) == threadNum ? imageInfoVos.size() makes the last thread also handle the leftover images.
List<ImageInfoVo> theadItem = imageInfoVos.subList(
        dataNum * item,
        (item + 1) == threadNum
                ? imageInfoVos.size()
                : Math.min(dataNum * (item + 1), imageInfoVos.size()));
// theadItem: image URLs; item: index of this slice; dirName: target directory path
threadDownPic(theadItem, item, dirName);
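The slicing rule can be checked in isolation. The following is a minimal sketch, not the original code: the class name PartitionSketch, the helper sliceFor, and the placeholder URL strings are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionSketch {

    // Returns the slice of `all` that worker `index` (0-based) should handle
    // when the work is split across `threadNum` workers.
    static <T> List<T> sliceFor(List<T> all, int index, int threadNum) {
        int dataNum = all.size() / threadNum; // items per worker (integer division)
        int from = dataNum * index;
        // The last worker takes everything that is left, including the remainder.
        int to = (index + 1 == threadNum)
                ? all.size()
                : Math.min(dataNum * (index + 1), all.size());
        return all.subList(from, to);
    }

    public static void main(String[] args) {
        // 25 placeholder URLs split across 10 workers, so dataNum = 2.
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            urls.add("url-" + i);
        }
        for (int i = 0; i < 10; i++) {
            System.out.println("worker " + i + " -> " + sliceFor(urls, i, 10).size() + " urls");
        }
    }
}
```

With 25 URLs and 10 workers, workers 0 to 8 each receive 2 URLs and worker 9 receives the remaining 7, so the last slice can be noticeably larger than the others.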
Because the tasks run asynchronously, we have to wait here until every task in the thread pool has finished, and only then execute the compression step. If no thread pool is passed in explicitly, CompletableFuture defaults to ForkJoinPool.commonPool(), whose size depends on the number of CPU cores on the machine.
For example, if I do not specify a pool and the default pool has 7 threads, the first 7 tasks run first, and the remaining 3 tasks are picked up afterwards as threads become free.
}, threadPoolTaskExecutor)).collect(Collectors.toList()).forEach(item -> {
    try {
        item.get();
    } catch (Exception e) {
        log.error("============ Multithreading down error MSG :{} =============", e.getMessage());
    }
});
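As an alternative to calling get() on each future in turn, the wait can also be expressed with CompletableFuture.allOf. The following is only a sketch under assumptions: the class WaitAllSketch, the method runAll, and the counter that stands in for the real download work are hypothetical, and a plain fixed thread pool is used in place of threadPoolTaskExecutor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class WaitAllSketch {

    // Runs `taskCount` placeholder tasks on a fixed pool and returns how many completed.
    static int runAll(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger done = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                // Placeholder for the real download work of one slice.
                futures.add(CompletableFuture.runAsync(done::incrementAndGet, pool));
            }
            // Block until every task has finished; join() rethrows a failure
            // wrapped in CompletionException.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runAll(10, 7));
    }
}
```

With 10 tasks and a pool of 7 threads, the extra tasks wait in the pool's queue until a thread is free, and the compression step would only start after join() returns.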
Test results
The main code is written. But is it really efficient? Let me post a few test screenshots to illustrate.
In fact, this approach does not significantly improve efficiency. Of course, this was tested only in my local environment.
Efficiency here is determined by network speed, not by the local CPU or I/O. Suppose the bandwidth is 10M: a single thread downloading sequentially gets the full 10M, while 10 threads get perhaps 1M each, so the total time is about the same. The CPU and I/O savings from multithreading are negligible compared with the network, which is the real bottleneck.
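The bandwidth argument can be written out as a small calculation. This is a simplified model rather than a measurement: it assumes the link is shared evenly between threads, treats "10M" as 10 megabytes per second, ignores per-request latency, and uses made-up figures (100 images of 1 MB each). The class BandwidthSketch and the method estimateSeconds are hypothetical names.

```java
class BandwidthSketch {

    // Estimated seconds to download `imageCount` images of `imageMb` megabytes each
    // when `threads` threads share one link of `linkMbPerSec` megabytes per second.
    static double estimateSeconds(int imageCount, double imageMb, double linkMbPerSec, int threads) {
        double perThreadRate = linkMbPerSec / threads;          // each thread's share of the link
        double imagesPerThread = (double) imageCount / threads; // work assigned to each thread
        return imagesPerThread * imageMb / perThreadRate;       // all threads finish together
    }

    public static void main(String[] args) {
        System.out.println("1 thread:   " + estimateSeconds(100, 1.0, 10.0, 1) + " s");
        System.out.println("10 threads: " + estimateSeconds(100, 1.0, 10.0, 10) + " s");
    }
}
```

Under these assumptions both cases give the same estimate, which matches the observation that adding threads does not help once the link itself is saturated.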
One optimization point for this interface: improve the compression efficiency by compressing the file streams directly, without first saving the images locally.
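A rough sketch of that idea with java.util.zip follows. It is an illustration under assumptions, not the code from this project: the class ZipStreamSketch and the method zipStreams are hypothetical, and in-memory byte arrays stand in for the image streams that would really come from the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipStreamSketch {

    // Copies each named input stream straight into a zip archive written to `out`,
    // without creating intermediate files on disk.
    static void zipStreams(Map<String, InputStream> sources, OutputStream out) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            byte[] buffer = new byte[8192];
            for (Map.Entry<String, InputStream> source : sources.entrySet()) {
                zip.putNextEntry(new ZipEntry(source.getKey()));
                try (InputStream in = source.getValue()) {
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, read);
                    }
                }
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder image bytes; real streams would be opened from the image URLs.
        Map<String, InputStream> sources = new LinkedHashMap<>();
        sources.put("a.jpg", new ByteArrayInputStream(new byte[]{1, 2, 3}));
        sources.put("b.jpg", new ByteArrayInputStream(new byte[]{4, 5}));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        zipStreams(sources, out);
        System.out.println("zip size in bytes: " + out.size());
    }
}
```

A single ZipOutputStream accepts one entry at a time, so in this sketch the entries are written sequentially; combining it with parallel downloads would need some buffering or coordination between the download threads and the writer.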
Closing
Writing posts is not easy. Experts are welcome to discuss and point out any shortcomings ✌