1. Introduction to processes

This is the sixth day of my participation in Gwen Challenge

1.1 Concept of process

Process: informally, a running program or application. A process is the basic unit of resource allocation in an operating system.

In real life, a company can be understood as a process. The company provides office resources (computers, office desks and chairs, etc.), and the real work is done by employees, who can be understood as threads.

Note: a program has at least one process, and a process has at least one thread. Multiple processes can perform multiple tasks.

1.2 Process states

In practice, the number of tasks is often larger than the number of CPU cores, so some tasks are executing while others wait for the CPU. This gives rise to different states:

  • Ready state: all conditions for running are met and the process is waiting for the CPU to execute it
  • Running state: the CPU is executing the process
  • Wait state: the process is waiting for some condition to be satisfied; for example, a program that is sleeping is in the wait state
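As a rough illustration of the three states, the transitions between them can be modeled with a small enum. This is only a toy model, not how any operating system implements scheduling; the names ProcessState, ALLOWED and can_transition are made up for this sketch.

```python
from enum import Enum


class ProcessState(Enum):
    READY = "ready"      # runnable, waiting for a CPU
    RUNNING = "running"  # currently executing on a CPU
    WAITING = "waiting"  # blocked until some condition is met (e.g. a sleep ends)


# Allowed transitions in this simplified model
ALLOWED = {
    ProcessState.READY: {ProcessState.RUNNING},
    ProcessState.RUNNING: {ProcessState.READY, ProcessState.WAITING},
    ProcessState.WAITING: {ProcessState.READY},
}


def can_transition(src, dst):
    """Return True if the toy model allows moving from src to dst."""
    return dst in ALLOWED[src]


if __name__ == '__main__':
    # A waiting process must become ready again before it can run
    print(can_transition(ProcessState.WAITING, ProcessState.RUNNING))
    print(can_transition(ProcessState.WAITING, ProcessState.READY))
```

In this model a waiting process never jumps straight to running: once its condition is satisfied it becomes ready and waits for the CPU like any other ready process.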

Summary

A process has one thread by default and can create more threads. Threads are attached to a process: without a process there are no threads.
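A minimal sketch of the "default thread" idea, using only the standard os and threading modules. The helper name describe_current_process is made up for this example.

```python
import os
import threading


def describe_current_process():
    """Return the process ID and information about the threads alive in this process."""
    return {
        "pid": os.getpid(),
        "main_thread": threading.main_thread().name,
        "thread_count": threading.active_count(),
    }


if __name__ == '__main__':
    info = describe_current_process()
    # Even a script that never creates a thread has one: the main thread
    print(info)
```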

2. Use of processes

2.1 Using multiple processes to complete multiple tasks

2.1.1 Importing process modules

import multiprocessing
# or
from multiprocessing import Process

2.1.2 The syntax structure of the Process class is as follows:

"""
Process([group [, target [, name [, args [, kwargs]]]]])
group: specifies a process group; currently only None can be used
target: the name of the target task (function) to execute
name: the name of the process; usually does not need to be set
args: passes arguments to the task as a tuple
kwargs: passes arguments to the task as a dictionary
"""

Common methods of instance objects created by Process:

  • start(): starts the child process instance (creates the child process)
  • join([timeout]): waits for the child process to finish, or waits at most timeout seconds
  • terminate(): terminates the child process immediately, whether or not its task is complete

Common properties of instance objects created by Process:

  • name: the alias of the process. The default value is Process-N, where N is an integer incrementing from 1
  • pid: the PID (process ID) of the process
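The parameters, methods and properties listed above can be tried together in a short sketch. The task function greet and the process name "greeter" are made up for illustration.

```python
import multiprocessing
import os


def greet(name, punctuation="!"):
    """Task run in the child process; the printed line includes the child's PID."""
    message = f"Hello, {name}{punctuation}"
    print(message, "from PID", os.getpid())
    return message  # the return value of a Process target is discarded


if __name__ == '__main__':
    # args is a tuple of positional arguments, kwargs a dict of keyword arguments
    p = multiprocessing.Process(
        target=greet,
        name="greeter",               # optional; defaults to Process-N
        args=("world",),
        kwargs={"punctuation": "?"},
    )
    print(p.name, p.pid)              # pid is None until start() is called
    p.start()
    p.join(timeout=5)                 # wait at most 5 seconds for the child
    print(p.name, p.pid, p.exitcode)  # exitcode is 0 after a normal exit
```

Note the trailing comma in args=("world",): without it the parentheses would not create a tuple.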

2.1.3 Code for completing multiple tasks with multiple processes

import multiprocessing
import time


def run_proc():
    """Code to be executed by the child process."""
    while True:
        print("----2----")
        time.sleep(1)


if __name__ == '__main__':
    # Create a child process
    sub_process = multiprocessing.Process(target=run_proc)
    # Start the child process
    sub_process.start()
    while True:
        print("----1----")
        time.sleep(1)

Execution Result:

----1----
----2----
----1----
----2----
----1----
----2----

2.1.4 Obtaining the process ID (PID)

import multiprocessing
import time
import os

def work():
    # Get the current process object
    current_process = multiprocessing.current_process()
    print("work:", current_process)
    # Get the current process ID
    print("Work process id:", current_process.pid, os.getpid())
    # Get the parent process ID
    print("Work parent process number:", os.getppid())
    for i in range(10):
        print("At work...")
        time.sleep(0.2)
        # Extension: kill a process by its process ID
        # (this kills the child itself, so the loop body runs only once)
        os.kill(os.getpid(), 9)


if __name__ == '__main__':

    # Get the current process object
    current_process = multiprocessing.current_process()
    print("main:", current_process)
    # Get the current process ID
    print("Main process number:", current_process.pid)

    # Create a child process
    sub_process = multiprocessing.Process(target=work)
    # Start the process
    sub_process.start()


    # The main process prints its own messages
    for i in range(20):
        print("I execute in the main process...")
        time.sleep(0.2)

Execution Result:

main: <_MainProcess(MainProcess, started)>
Main process number: 9552
I execute in the main process...
I execute in the main process...
I execute in the main process...
I execute in the main process...
work: <Process(Process-1, started)>
Work process id: 5056 5056
Work parent process number: 9552
At work...
I execute in the main process...
I execute in the main process...
I execute in the main process...
I execute in the main process...

3. Points to note about processes

3.1 Processes do not share global variables

import multiprocessing
import time

# Define a global variable
my_list = list()


# Write data
def write_data():
    for i in range(5):
        my_list.append(i)
        time.sleep(0.2)
    print("write_data:", my_list)


# Read data
def read_data():
    print("read_data:", my_list)


if __name__ == '__main__':
    # Create a process that writes data
    write_process = multiprocessing.Process(target=write_data)
    # Create a process that reads data
    read_process = multiprocessing.Process(target=read_data)

    write_process.start()
    # The main process waits for the writing process to complete before continuing
    write_process.join()
    read_process.start()

Execution Result:

write_data: [0, 1, 2, 3, 4]
read_data: []

Note: a child process works on its own copy of the main process's data. Processes are independent of each other, so each one accesses its own copy of a global variable. Therefore, processes do not share global variables.

3.2 The main process will wait for all child processes to finish executing the program before exiting

import multiprocessing
import time

# Test whether the main process waits for the child process to finish before exiting
def work():
    for i in range(10):
        print("At work...")
        time.sleep(0.2)

if __name__ == '__main__':
    # Create a child process
    work_process = multiprocessing.Process(target=work)

    work_process.start()

    # Let the main process wait 1 second
    time.sleep(1)
    print("Main process execution is complete.")

    # Summary: the main process waits for all child processes to complete before exiting

Execution Result:

At work...
At work...
At work...
At work...
At work...
Main process execution is complete.
At work...
At work...
At work...
At work...

3.2.1 Code for destroying the child process

import multiprocessing
import time

# Test destroying the child process when the main process finishes
def work():
    for i in range(10):
        print("At work...")
        time.sleep(0.2)

if __name__ == '__main__':
    # Create a child process
    work_process = multiprocessing.Process(target=work)
    # Option 1: make the child a daemon of the main process; when the main process exits,
    # the child is destroyed immediately and stops executing its code
    # work_process.daemon = True
    work_process.start()

    # Let the main process wait 1 second
    time.sleep(1)
    print("Main process execution is complete.")
    # Option 2: destroy the child process before the main process exits
    work_process.terminate()
    # Summary: by default the main process waits for all child processes to complete
    # before exiting; daemon = True or terminate() ends the child early

Execution Result:

At work...
At work...
At work...
At work...
At work...
Main process execution is complete.

Summary

  • Processes do not share global variables
  • The main process will wait for all child processes to finish executing the program before exiting
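To observe what terminate() does to a child, the is_alive() method and the exitcode property of a Process can be printed before and after the call. This is a minimal sketch; the task signature work(steps, delay) is an assumption made for the example.

```python
import multiprocessing
import time


def work(steps, delay=0.2):
    """Simulated task: do `steps` units of work, pausing between them."""
    done = 0
    for _ in range(steps):
        time.sleep(delay)
        done += 1
    return done


if __name__ == '__main__':
    worker = multiprocessing.Process(target=work, args=(50,))
    worker.daemon = True      # must be set before start()
    worker.start()

    time.sleep(0.5)
    print("alive before terminate:", worker.is_alive())
    worker.terminate()        # ask the child to stop now
    worker.join()             # wait for the child to be cleaned up
    print("alive after terminate:", worker.is_alive())
    print("exit code:", worker.exitcode)  # non-zero because the child was terminated
```

Calling join() after terminate() gives the operating system time to clean up the child, so that is_alive() and exitcode reflect the final state.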

4. Inter-process communication: Queue

Objectives

  • Know how to put values into a message queue and get values out of it

4.1 Use of Queue

We can use the Queue class of the multiprocessing module to transfer data between processes. A Queue is itself a message queue. First, a small example demonstrates how a Queue works:

import multiprocessing
import time

if __name__ == '__main__':
    # Create a message queue; 3 is the maximum number of messages in the queue
    queue = multiprocessing.Queue(3)
    # Add data
    queue.put(1)
    queue.put("hello")
    queue.put([3, 5])
    # Summary: a queue can hold data of any (picklable) type
    # If the queue is full, put() waits until there is a free slot before adding the data
    # queue.put((5, 6))
    # If the queue is full, put_nowait() does not wait for a free slot; it raises an exception instead
    # queue.put_nowait((5, 6))
    # Suggestion: use put() to add data to the queue

    # Check whether the queue is full
    # print(queue.full())

    # Note: queue.empty() is unreliable for determining whether the queue is empty
    # Check whether the queue is empty
    # print(queue.empty())

    # Workarounds: 1. add a short delay; 2. check the number of items instead of calling empty()
    # time.sleep(0.01)
    if queue.qsize() == 0:
        print("Queue empty")
    else:
        print("Queue is not empty")

    # Get the number of items in the queue
    size = queue.qsize()
    print(size)

    # Fetch data
    value = queue.get()
    print(value)
    # Get the number of items in the queue
    size = queue.qsize()
    print(size)
    # Fetch data
    value = queue.get()
    print(value)
    # Fetch data
    value = queue.get()
    print(value)

    # Get the number of items in the queue
    size = queue.qsize()
    print(size)

    # If the queue is empty, get() waits until the queue has a value
    # value = queue.get()
    # print(value)
    # If the queue is empty, get_nowait() does not wait for a value; it raises an exception instead
    # Suggestion: use get() to take values from the queue
    # value = queue.get_nowait()
    # print(value)

Running results:

Queue is not empty
3
1
2
hello
[3, 5]
0
Notes

When a Queue object is initialized (for example, q = Queue()), if the maximum number of messages is not specified in the parentheses, or the number is negative, there is no upper limit on the number of messages the queue can hold (until memory runs out).

  • Queue.qsize(): returns the number of messages currently in the queue;
  • Queue.empty(): returns True if the queue is empty, False otherwise; note that this operation is unreliable;
  • Queue.full(): returns True if the queue is full, False otherwise;
  • Queue.get([block[, timeout]]): gets a message from the queue and removes it from the queue. block defaults to True.

1) If block keeps its default value and no timeout (in seconds) is set, then when the queue is empty the program blocks until a message can be read from the queue. If timeout is set, the program waits up to timeout seconds; if no message has been read by then, the queue.Empty exception is raised;

2) If block is False, the queue.Empty exception is raised immediately when the queue is empty;

  • Queue.get_nowait(): equivalent to Queue.get(False);
  • Queue.put(item[, block[, timeout]]): writes the item message to the queue. block defaults to True.

1) If block keeps its default value and no timeout (in seconds) is set, then when there is no space left to write, the program blocks until space becomes available in the queue. If timeout is set, the program waits up to timeout seconds; if there is still no space by then, the queue.Full exception is raised;

2) If block is False, the queue.Full exception is raised immediately when there is no space to write;

  • Queue.put_nowait(item): equivalent to Queue.put(item, False);
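The exceptions mentioned above live in the standard queue module, so that module has to be imported to catch them. Below is a minimal sketch of handling both cases; the helper names try_put and get_or_default are made up for this example.

```python
import multiprocessing
import queue  # provides the Empty and Full exception classes


def try_put(q, item):
    """Put without blocking; return False instead of raising when the queue is full."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False


def get_or_default(q, default=None, timeout=0.5):
    """Wait up to `timeout` seconds for an item; return `default` if none arrives."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return default


if __name__ == '__main__':
    q = multiprocessing.Queue(2)
    # The third put finds the queue full and is rejected
    print(try_put(q, "a"), try_put(q, "b"), try_put(q, "c"))
    # The third get finds the queue empty and falls back to the default
    print(get_or_default(q), get_or_default(q), get_or_default(q, default="nothing"))
```

Using get() with a timeout rather than get_nowait() gives a message that was just put a moment to become readable, which sidesteps the unreliability of empty() noted above.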

4.2 Inter-process communication with a message queue

Using a Queue, we create two child processes: one writes data to the Queue and the other reads data from it:

import multiprocessing
import time


# Write data
def write_data(queue):
    for i in range(10):
        if queue.full():
            print("The queue is full")
            break
        queue.put(i)
        time.sleep(0.2)
        print(i)


# Read data
def read_data(queue):
    while True:
        # Once all the data has been taken from the queue, break out of the loop
        if queue.qsize() == 0:
            print("Queue empty")
            break
        value = queue.get()
        print(value)


if __name__ == '__main__':
    # Create a message queue
    queue = multiprocessing.Queue(5)

    # Create a process that writes data
    write_process = multiprocessing.Process(target=write_data, args=(queue,))
    # Create a process that reads data
    read_process = multiprocessing.Process(target=read_data, args=(queue,))

    # Start the processes
    write_process.start()
    # The main process waits for the writing process to complete before continuing
    write_process.join()
    read_process.start()

Running results:

0
1
2
3
4
The queue is full
0
1
2
3
4
Queue empty

Summary

  • Use the get method to take values from the queue and the put method to add values to it
  • Checking whether a message queue is empty with empty() is unreliable; use a short delay together with the number of items (qsize) instead
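A different technique from the delay-and-qsize approach used in this article is to let the writer put a special end marker (a sentinel) on the queue, so the reader never has to ask whether the queue is empty. The sketch below assumes None is never a real data item; SENTINEL is a name chosen for this example.

```python
import multiprocessing

SENTINEL = None  # hypothetical end-of-data marker; must never be a real item


def write_data(q, items):
    """Producer: put every item on the queue, then the sentinel."""
    for item in items:
        q.put(item)
    q.put(SENTINEL)


def read_data(q):
    """Consumer: block on get() until the sentinel arrives."""
    received = []
    while True:
        value = q.get()
        if value is SENTINEL:
            break
        received.append(value)
    print("read_data:", received)
    return received


if __name__ == '__main__':
    q = multiprocessing.Queue(5)
    writer = multiprocessing.Process(target=write_data, args=(q, list(range(10))))
    reader = multiprocessing.Process(target=read_data, args=(q,))
    # Both run at the same time, so a queue of size 5 can carry 10 items
    writer.start()
    reader.start()
    writer.join()
    reader.join()
```

Because the reader blocks in get() rather than polling, it can run concurrently with the writer instead of starting only after the writer has finished.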

5. Process pool (Pool)

Objectives

  • Be able to use a process pool to complete multiple tasks

5.1 Concept of process Pools

A process pool holds processes. The pool creates processes automatically according to how tasks are being executed, creates as few as possible, and reuses the processes in the pool to complete multiple tasks.

If only a few child processes are needed, you can create them dynamically with Process from the multiprocessing module. If hundreds or even thousands of tasks are involved, creating the processes by hand is a huge amount of work; in that case, use the Pool class provided by the multiprocessing module.

When initializing the Pool, you can specify a maximum number of processes. When a new request is submitted to the Pool and the pool is not full, a new process is created to execute the request. If the number of processes in the pool has already reached the specified maximum, the request waits until a process in the pool finishes its task, and that process is then reused to perform the new task.
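The reuse of a small number of processes can be made visible by having each task report the PID of the worker that ran it. This sketch uses Pool.map, a Pool method not covered in this article, and a made-up task function square.

```python
import multiprocessing
import os


def square(n):
    """Task run in a worker; returns the result together with the worker's PID."""
    return n * n, os.getpid()


if __name__ == '__main__':
    # At most 3 worker processes handle all 10 tasks
    with multiprocessing.Pool(3) as pool:
        results = pool.map(square, range(10))

    values = [value for value, _ in results]
    worker_pids = {pid for _, pid in results}
    print(values)
    print("distinct workers used:", len(worker_pids))  # never more than 3
```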

5.2 Executing tasks synchronously in a process pool

Executing tasks synchronously in a process pool means that the next task starts only after the current task has finished; until then, the next task waits.

Example code for synchronous execution in a process pool

import multiprocessing
import time


# Copy task
def work():
    print("Copying...", multiprocessing.current_process().pid)
    time.sleep(0.5)

if __name__ == '__main__':
    # Create a process pool
    # 3: the maximum number of processes in the process pool
    pool = multiprocessing.Pool(3)
    # Simulate a large number of tasks and let the process pool execute them
    for i in range(5):
        # Have the process pool execute the work task on each iteration
        # Synchronous execution: a task starts only after the previous one has finished
        pool.apply(work)

Running results:

Copying... 100512
Copying... 68128
Copying... 98924
Copying... 100512
Copying... 68128

5.3 Executing tasks asynchronously in a process pool

Executing tasks asynchronously in a process pool means that the processes in the pool execute tasks at the same time, without waiting for one another.

Example code for asynchronous execution in a process pool

# Process pool: a pool that holds processes. It creates processes automatically according to how tasks are being executed, creates as few as possible, and reuses them to complete multiple tasks
import multiprocessing
import time


# Copy task
def work():
    print("Copying...", multiprocessing.current_process().pid)
    # Get the daemon status of the current process
    # Processes created by a process pool are daemon processes of the main process
    # print(multiprocessing.current_process().daemon)
    time.sleep(0.5)

if __name__ == '__main__':
    # Create a process pool
    # 3: the maximum number of processes in the process pool
    pool = multiprocessing.Pool(3)
    # Simulate a large number of tasks and let the process pool execute them
    for i in range(5):
        # Have the process pool execute the work task on each iteration
        # Synchronous execution: a task starts only after the previous one has finished
        # pool.apply(work)
        # Asynchronous execution: tasks do not wait for each other and run together
        pool.apply_async(work)

    # Close the process pool: no new tasks can be submitted
    pool.close()
    # The main process waits for the process pool to finish its tasks before exiting
    pool.join()

Execution Result:

Copying... 122872
Copying... 61772
Copying... 114636
Copying... 122872
Copying... 114636

Summary:

Common methods of multiprocessing.Pool:

  • apply(func[, args[, kwds]]): calls the function in blocking mode; args passes arguments to the function as a tuple and kwds passes arguments to the function as a dictionary
  • apply_async(func[, args[, kwds]]): calls the function in non-blocking mode; args passes arguments to the function as a tuple and kwds passes arguments to the function as a dictionary
  • close(): closes the Pool so that it no longer accepts new tasks;
  • terminate(): terminates immediately, whether or not the tasks are complete;
  • join(): blocks the main process until the child processes exit; must be called after close or terminate;
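The examples in 5.2 and 5.3 call a task without arguments and ignore its return value. The sketch below shows, under the assumption of a made-up task copy_file, how args and kwds are passed to apply_async and how the result is collected from the AsyncResult object it returns.

```python
import multiprocessing


def copy_file(name, suffix=".bak"):
    """Hypothetical task: pretend to copy a file and return the new name."""
    return name + suffix


if __name__ == '__main__':
    pool = multiprocessing.Pool(3)
    # apply_async returns immediately with an AsyncResult handle
    handles = [
        pool.apply_async(copy_file, args=(f"file{i}",), kwds={"suffix": ".copy"})
        for i in range(5)
    ]
    pool.close()   # no more tasks will be submitted
    pool.join()    # wait for the workers to finish
    # get() returns the task's return value (or re-raises its exception)
    print([h.get(timeout=5) for h in handles])
```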

6. Process/thread comparison

Objectives

  • Know the relationship between processes and threads and their advantages and disadvantages

6.1 Function comparison

  • Processes can complete multiple tasks; for example, several instances of QQ can run on one computer at the same time
  • Threads can complete multiple tasks; for example, several chat windows within one QQ instance

6.2 Definition comparison

  • A process is a basic unit of system resource allocation. The operating system allocates resources to a process when it is started.
  • A thread is a branch of execution in a running program and is the basic unit of CPU scheduling.
  • Summary: Process is the basic unit of operating system resource allocation, thread is the basic unit of CPU scheduling

6.3 Relationship Comparison

  • Threads are attached to a process; without a process there are no threads
  • A process provides one thread by default, and a process can create multiple threads

6.4 Differences

  • Processes do not share global variables
  • Global variables are shared between threads, but beware of contention for shared resources. Solutions: a mutex or thread synchronization
  • Creating a process is more expensive than creating a thread
  • Processes are the basic unit of operating system resource allocation and threads are the basic unit of CPU scheduling
  • Threads cannot execute independently and must depend on processes
  • Multi-process development is more stable than single-process multi-threaded development

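Advantages and disadvantages

Multiple processes:

  • Advantages: Can use multiple cores
  • Disadvantages: High resource overhead

Multithreading:

  • Advantages: Low resource overhead
  • Disadvantages: Cannot use multiple cores (in standard CPython, the global interpreter lock lets only one thread execute Python code at a time)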

Conclusion

This article is long, so a big thumbs up to everyone who read this far! My knowledge is limited and the article will inevitably contain mistakes; feedback and corrections are welcome.

If you find this article helpful, please like, comment, and bookmark it

Your support is my biggest motivation!!