The concurrent.futures module provides the ThreadPoolExecutor and ProcessPoolExecutor classes, which are a higher-level abstraction over threading and multiprocessing.

1 / ThreadPoolExecutor thread pool

When constructing a ThreadPoolExecutor instance, pass the max_workers argument to set the maximum number of threads that can run at the same time. Use the submit() method to hand a task (the function and its arguments) to the thread pool; it returns a handle to the task (a Future object, similar in spirit to a file handle). Note that submit() does not block; it returns immediately. The done() method tells you whether the task has finished, and the result() method returns the task's return value. Looking at the implementation shows that result() blocks until the task completes.
# -*- coding: utf-8 -*-
from concurrent.futures import ThreadPoolExecutor
import time

# The times parameter stands in for the network request time


def f(times):
    print("get page {}s finished".format(times)) 
    return times 

# Build a multi-threaded pool
executor = ThreadPoolExecutor(max_workers=2) 

# Submit the function to the thread pool with submit(), which returns immediately without blocking
task1 = executor.submit(f, 3)
task2 = executor.submit(f, 2)

# The done method is used to determine whether a task is complete
print(task1.done()) 

# The cancel method cancels a task; it only succeeds if the task has not started running yet
print(task2.cancel()) 

print(task1.done()) 

# The result method retrieves the return value of the task
print(task1.result()) 

# Typical output (f returns at once here, so the True/False values can vary with timing):
# get page 3s finished 
# get page 2s finished 
# True 
# False 
# True 
# 3
  
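The rule that a task can only be cancelled before it starts is easier to see when the pool has a single worker. The following is a minimal sketch, not part of the original example: the names slow_task, quick_task, started and release are hypothetical, and the two Event objects are only there to make the timing deterministic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

started = threading.Event()   # set once the first task is actually running
release = threading.Event()   # set by the main thread to let that task finish

def slow_task():
    started.set()
    release.wait(timeout=5)   # stay busy until the main thread releases us
    return "slow result"

def quick_task():
    return "quick result"

with ThreadPoolExecutor(max_workers=1) as executor:
    running = executor.submit(slow_task)   # occupies the only worker
    queued = executor.submit(quick_task)   # has to wait in the queue
    started.wait(timeout=5)                # make sure slow_task has begun

    running_cancelled = running.cancel()   # False: already running
    queued_cancelled = queued.cancel()     # True: never started
    done_before = running.done()           # False: still waiting on release
    release.set()
    result = running.result()              # blocks until slow_task returns
    done_after = running.done()            # True

print(running_cancelled, queued_cancelled, done_before, done_after, result)
# False True False True slow result
```

With max_workers=2, as in the example above, both tasks would start right away and neither cancel() call would succeed.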

2 / ProcessPoolExecutor process pool

There are two ways to submit tasks. <1> Synchronous submission: submit a task, wait for it to complete and get its result, and only then submit the next task. Advantage: the code is decoupled. Disadvantage: slow, because you must wait for each result before the next task can start.

  import datetime
  from concurrent.futures import ProcessPoolExecutor
  import os
  
  def f(name):
      print('%s %s is running'%(name,os.getpid()))
      #print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
      
  if __name__ == '__main__':
      process_pool = ProcessPoolExecutor(4)  # Set the number of processes in the process pool
      for i in range(10):  # Synchronous call: submit, then wait for the result
          # Pass (function, arguments); arguments may be positional or keyword
          obj = process_pool.submit(f,"Process PID:")
          
          # .result() blocks until the task finishes and returns the task's return value
          # if f() does not return anything, obj.result() is None
          res = obj.result()
          
      # Stop accepting new tasks and wait for the pool's tasks to finish before continuing the main program
      process_pool.shutdown(wait=True) 
      
      print("Main thread")
<2> Asynchronous submission: submit the tasks without waiting for their results. Advantage: fast. Disadvantage: the code is more tightly coupled.
    import os
    import random
    import time
    from concurrent.futures import ProcessPoolExecutor
    
    def f(name):
        print("%s %s is running" %(name,os.getpid())) 
        time.sleep(random.randint(1, 3))  # sleep 1 to 3 seconds to simulate work
    
    if __name__ == '__main__': 
        # Set the number of processes in the process pool
        process_pool = ProcessPoolExecutor(4)
        
        for i in range(10):  # Asynchronous submission: submit only, do not wait for the result
            process_pool.submit(f, 'Process pid:')
            # Pass (function, arguments); arguments may be positional or keyword
        
        # Stop accepting new tasks and wait for the pool's tasks to finish before continuing the main program
        process_pool.shutdown(wait=True)
        
        print('Main thread')
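The speed difference between the two submission modes can be measured directly. The sketch below is an illustration under stated assumptions rather than the original author's code: work, run_sync and run_async are hypothetical names, and it uses ThreadPoolExecutor instead of ProcessPoolExecutor only to stay lightweight, since both classes expose the same submit() and result() interface.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def work(n):
    time.sleep(0.2)   # stand-in for a slow task
    return n * n

def run_sync(executor, items):
    # Synchronous: wait for each result before submitting the next task
    return [executor.submit(work, n).result() for n in items]

def run_async(executor, items):
    # Asynchronous: submit everything first, collect the results afterwards
    futures = [executor.submit(work, n) for n in items]
    return [future.result() for future in futures]

items = [1, 2, 3, 4]
with ThreadPoolExecutor(max_workers=4) as executor:
    t0 = time.perf_counter()
    sync_results = run_sync(executor, items)
    sync_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    async_results = run_async(executor, items)
    async_time = time.perf_counter() - t0

print(sync_results, async_results)
print("sync %.2fs, async %.2fs" % (sync_time, async_time))
```

Both functions return the same list, but the synchronous version runs the four tasks one after another while the asynchronous version lets them overlap across the four workers.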

Conclusion

1. More threads is not always better: every extra thread adds CPU context switches, and each switch has to save the state of the thread being paused.
2. A process consumes more resources than a thread. A process is like a factory and its threads are the workers inside, who share the factory's resources. By default a process has only one main thread; the process provides the resources, the threads do the work, and creating more threads simply puts more workers on the job at the same time.
3. GIL (global interpreter lock): ensures that only one thread in a process is scheduled by the CPU at a time.
4. CPU-intensive work suits multiple processes; I/O-intensive work suits multiple threads, since whichever thread is free gets to run.
5. A thread is the smallest unit of work in the computer. A process has one main thread by default (to do the work), and multiple threads and coroutines can coexist in it.
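The rule of thumb about CPU-intensive and I/O-intensive work can be sketched as a small helper that picks the executor class by workload type. This is an illustration only: run_tasks, count_primes and fake_download are hypothetical names, and the "cpu"/"io" labels are an assumption made for this example.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def count_primes(limit):
    """CPU-bound: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def fake_download(seconds):
    """I/O-bound stand-in: a sleeping thread releases the GIL."""
    time.sleep(seconds)
    return seconds

def run_tasks(kind, fn, args):
    # Processes for CPU-bound work, threads for I/O-bound work
    pool_cls = ProcessPoolExecutor if kind == "cpu" else ThreadPoolExecutor
    with pool_cls(max_workers=4) as pool:
        return list(pool.map(fn, args))

if __name__ == "__main__":
    print(run_tasks("io", fake_download, [0.1, 0.1, 0.1]))
    print(run_tasks("cpu", count_primes, [10_000, 20_000]))
```

The function handed to the process pool is defined at module level so that it can be pickled and sent to the worker processes.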