Python Subprocess: Run External Commands

Although there are many libraries on PyPI, sometimes you need to run an external command in Python code. The built-in Python subprocess module makes this relatively easy. In this article, you will learn some basics about processes and subprocesses.

We’ll use the Python subprocess module to safely execute external commands, capture their output, and optionally feed them input via standard input. If you are familiar with the theory of processes and subprocesses, you can skip the first part.

Processes and child processes

A program executed on a computer is also called a process. But what exactly is a process? Let’s define it more formally.

Process: A process is an instance of a computer program executed by one or more threads.

A process can have multiple threads, which is called multithreading. Conversely, a computer can run multiple processes simultaneously. These processes can be different programs, but they can also be multiple instances of the same program. This is explained in great detail in our article on Python concurrency. The following image is also from that article:

If you want to run an external command, this means you need to create a new process from your Python process. Such processes are often referred to as child processes or sub-processes. Visually, this is what happens when one process produces two children:

What happens internally (inside the operating system kernel) on Unix-like systems is something called a fork. The process forks itself, meaning that a new copy of the process is created and started. This can be useful if you want to parallelize your code and take advantage of multiple CPUs on your machine. This is what we call multiprocessing.

However, we can use the same technique to start another process. First, the process forks itself, creating a copy. The copy replaces itself with another process: the one you want to execute.

We can go the low-level way and do much of this ourselves using the Python subprocess module, but fortunately Python also provides a wrapper that takes care of all the details, and does so safely. Thanks to the wrapper, you only need to call one function to run an external command. This wrapper is the run() function in the subprocess library, and that’s what we’ll use in this article.

I thought it would be nice for you to know what’s going on internally, but if you’re confused, rest assured that you don’t need this knowledge to do what you want: run external commands with the Python subprocess module.
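For the curious, the fork-then-replace sequence described above can be sketched with Python’s low-level os module. This is only an illustration for Unix-like systems (os.fork does not exist on Windows), the function name fork_and_exec is my own, and subprocess.run handles many more details than this:

```python
import os
import sys


def fork_and_exec(argv):
    """Run argv in a child process via fork + exec and return its exit code."""
    pid = os.fork()  # from here on there are two copies of this process
    if pid == 0:
        # Child: replace this copy with the program we actually want to run.
        try:
            os.execvp(argv[0], argv)  # only comes back (by raising) on failure
        finally:
            os._exit(127)  # exec failed; leave the child immediately
    # Parent: wait for the child to finish and decode its exit status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


# sys.executable is the Python interpreter that is running this script.
exit_code = fork_and_exec([sys.executable, "-c", "print('hello from the child')"])
print(f"child exited with code {exit_code}")
```

The parent and the child both continue after os.fork(); the only way to tell them apart is the return value, which is 0 in the child.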

Create a Python subprocess using subprocess.run

Enough theory, now it’s time to start writing some code to execute external commands.

First, you need to import the subprocess library. Since it is part of the Python 3 standard library, you do not need to install it separately. From this library, we will use the run function, which was added in Python 3.5. Make sure you have at least this Python version, but it’s best to run the latest version. If you need help, check out our detailed Python installation instructions.

Let’s start with a simple call to ls, listing the files and directories in the current directory:

>>> import subprocess
>>> subprocess.run(['ls', '-al'])

(a list of your directories will be printed)

In fact, we can call Python binaries from our Python code. Let’s get the python 3 version installed by default on the system:

>>> import subprocess
>>> result = subprocess.run(['python3', '--version'])
Python 3.8.5
>>> result
CompletedProcess(args=['python3', '--version'], returncode=0)

Line by line explanation:

  • We import the subprocess library
  • Run a subprocess, in this case the python3 binary, with one argument: --version
  • Look at the result variable, which is of type CompletedProcess

The process returned code 0, indicating that it executed successfully. Any other return code means there was some kind of error. What each return code means is defined by the program you are calling.

As you can see in the output, the Python binary prints its version number on standard output, which is usually your terminal. Your results may differ because your version of Python may be different. You might even get an error that looks like this: FileNotFoundError: [Errno 2] No such file or directory: 'python3'. In that case, make sure the python3 binary is installed on your system and is on your PATH.
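To make these two failure modes concrete, here is a small sketch. It uses sys.executable (the interpreter running the script) instead of python3, so it does not depend on how Python is named on your system. The command name no-such-command-xyz is made up and assumed not to exist. The check=True option of subprocess.run turns a non-zero return code into a CalledProcessError:

```python
import subprocess
import sys

# A child process that deliberately exits with return code 3.
failing_command = [sys.executable, "-c", "import sys; sys.exit(3)"]

result = subprocess.run(failing_command)
print(result.returncode)  # non-zero, so the command reported an error

# With check=True, a non-zero return code raises an exception instead.
try:
    subprocess.run(failing_command, check=True)
    caught_returncode = None
except subprocess.CalledProcessError as exc:
    caught_returncode = exc.returncode
    print(f"Command failed with return code {exc.returncode}")

# A binary that cannot be found raises FileNotFoundError before anything runs.
# 'no-such-command-xyz' is a made-up name, assumed not to be on your PATH.
try:
    subprocess.run(["no-such-command-xyz"])
    binary_missing = False
except FileNotFoundError:
    binary_missing = True
    print("Binary not found; check that it is installed and on your PATH")
```

Whether you inspect returncode yourself or let check=True raise is a matter of taste; the exception is harder to forget about.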

Capture the output of the Python subprocess

If you run an external command, you will most likely want to capture the output of that command. We can do this with the capture_output=True option:

>>> import subprocess
>>> result = subprocess.run(['python3', '--version'], capture_output=True, encoding='UTF-8')
>>> result
CompletedProcess(args=['python3', '--version'], returncode=0, stdout='Python 3.8.5\n', stderr='')

As you can see, Python does not print its version to our terminal this time. The subprocess.run command redirects the standard output and standard error streams, so they can be captured and stored in result for us. Looking at the result variable, we see that the Python version was captured from the standard output. Because there are no errors, stderr is empty.

I also added the encoding='UTF-8' option. If you don’t, subprocess.run treats the output as a stream of bytes because it has no information about the encoding. You can try it: stdout and stderr will then be bytes objects. Therefore, if you know that the output will be ASCII or UTF-8 text, you’d better specify it so that the function can decode the captured output accordingly.

Alternatively, you can use the option text=True without specifying an encoding. Python then captures the output as text, decoded with a default encoding. If you know the encoding, I recommend specifying it explicitly.
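The difference between the three variants is easy to see side by side. This sketch again uses sys.executable in place of python3, purely so it does not depend on the name of your Python binary:

```python
import subprocess
import sys

command = [sys.executable, "--version"]

# No encoding given: the captured output is raw bytes.
raw = subprocess.run(command, capture_output=True)
print(type(raw.stdout))

# Explicit encoding: the captured output is decoded into a str.
as_text = subprocess.run(command, capture_output=True, encoding="UTF-8")
print(type(as_text.stdout))

# text=True: also a str, decoded with a default encoding.
default_text = subprocess.run(command, capture_output=True, text=True)
print(type(default_text.stdout))
```

If you end up with bytes anyway, raw.stdout.decode("UTF-8") gives you the same string after the fact.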

Input data from standard input

If an external command expects data on standard input, we can easily provide it through the input option of Python’s subprocess.run function. Note that I will not discuss streaming data here. We will build on the previous example:

>>> import subprocess
>>> code = """
... for i in range(1, 3):
...   print(f"Hello world {i}")
... """
>>> result = subprocess.run(['python3'], input=code, capture_output=True, encoding='UTF-8')
>>> print(result.stdout)
Hello world 1
Hello world 2

We just used the python3 binary to execute some Python code. Totally useless, but (hopefully) very instructive!

The code variable is a multi-line Python string that we assign as input to the subprocess.run command using the input option.
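The same input option works for any program that reads from standard input. As a slightly different sketch, the child process below is a one-line Python program (my own stand-in for an external command) that reads everything from standard input and writes it back in upper case:

```python
import subprocess
import sys

# Stand-in for an external command: read stdin, write it back upper-cased.
child_program = "import sys; sys.stdout.write(sys.stdin.read().upper())"

result = subprocess.run(
    [sys.executable, "-c", child_program],
    input="hello from the parent\n",  # sent to the child's standard input
    capture_output=True,
    encoding="UTF-8",                 # input and output are handled as text
)
print(result.stdout)
```

Because encoding is set, input must be a string; without it, you would pass bytes instead.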

Running shell commands

If you want to execute shell commands on a Unix-like system, by which I mean anything you would normally type in a Bash-like shell, you need to be aware that these are often not external binaries that can be executed. For example, constructs like for and while loops, or pipes and other operators, are interpreted by the shell itself.

Python often provides alternatives in the form of built-in libraries, which you should prefer. But if you need to execute a shell command, for whatever reason, subprocess.run will happily do so when you use the shell=True option. It allows you to type commands as if you were typing in a Bash compatible shell:

>>> import subprocess
>>> result = subprocess.run(['ls -al | head -n 1'], shell=True)
total 396
>>> result
CompletedProcess(args=['ls -al | head -n 1'], returncode=0)

There is a caveat: using this method makes you vulnerable to command injection attacks (see the points to note below).
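As an illustration of preferring Python over shell features, the pipeline above could also be written without shell=True: run ls on its own and let Python pick the first line, which is the job head -n 1 did. This is only a sketch and assumes a Unix-like system where ls is available:

```python
import subprocess

# Run ls directly (no shell involved) and capture what it prints.
result = subprocess.run(["ls", "-al"], capture_output=True, text=True)

# Equivalent of `| head -n 1`: keep only the first line of the output.
lines = result.stdout.splitlines()
first_line = lines[0] if lines else ""
print(first_line)
```

The trade-off is that ls now runs to completion before Python looks at the output, which is fine for small outputs like this one.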

Points to note

Running external commands is not without risk. Please read this section very carefully.

os.system vs subprocess.run

You might see code examples that use os.system() to execute commands. However, the subprocess module is more powerful, and the official Python documentation recommends using it instead of os.system(). Another problem with os.system() is that it is more prone to command injection.

Command injection

A common attack, or exploit, is to inject extra commands to gain control over a computer system. For example, if you ask your user for input and use that input in a call to os.system() or subprocess.run(..., shell=True), you are at risk of a command injection attack.

For demonstration purposes, the following code allows us to run any shell command.

import subprocess
thedir = input()
result = subprocess.run([f'ls -al {thedir}'], shell=True)

Because we are using the user’s input directly, the user can run any command by simply appending it after a semicolon. For example, the following input lists the / directory and then echoes a text. Try it yourself.

/; echo "command injection worked!";

The solution is not to try to clean up user input. You might be tempted to start looking for semicolons and reject the input when you find them. Don’t do it; hackers can think of at least five other ways to append a command in this situation. It’s an uphill battle.

A better solution is not to use shell=True, and to pass the command and its arguments in a list as we did in the earlier examples. Input like this will fail in that case, because no shell is involved: the subprocess module hands the list items straight to the operating system, so a semicolon is never interpreted as the start of a new command.

Using the same input, but shell=False, you get the following result.

import subprocess
thedir = input()
>>> result = subprocess.run([f'ls -al {thedir}'], shell=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/subprocess.py", line 489, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.8/subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.8/subprocess.py", line 1702, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'ls -al /; echo "command injection worked!";'

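Without a shell, the entire string, including the injected part, is treated as the name of the program to run. No program with that name exists, so Python raises a FileNotFoundError and the injected echo command is never executed.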
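For comparison, here is a sketch of the same call with the command and each argument passed as separate list items, so the user’s input really does arrive at ls as a single argument. The function name list_directory is my own, for illustration. The -- argument is the usual Unix convention for "end of options" and keeps input that starts with a dash from being read as an option. The example assumes a Unix-like system with ls available:

```python
import subprocess


def list_directory(user_path):
    """List a directory without a shell; user_path is always one argument."""
    # Each list item is passed to ls as-is; no shell ever interprets the input.
    # '--' marks the end of options, so input like '-R' is treated as a path.
    return subprocess.run(
        ["ls", "-al", "--", user_path],
        capture_output=True,
        text=True,
    )


malicious_input = '/; echo "command injection worked!";'
result = list_directory(malicious_input)
print(result.returncode)  # non-zero: ls could not find such a path
print(result.stderr)      # the complaint from ls itself
```

This time the error comes from ls, which looks for a file with that odd name and does not find it; nothing is echoed on standard output.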

User input is always dangerous

In fact, using user input is always dangerous, and not just because of command injection. For example, suppose you allow the user to enter a file name. After that, you read the file and display it to the user. While this may seem harmless, a user could enter something like this: ../../../configuration/settings.yaml.

settings.yaml may contain your database password... Oh dear! You always need to sanitize and validate user input properly. How to do this correctly, however, is beyond the scope of this article.
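Without attempting a complete treatment, one common check for the file name example can be sketched with pathlib: resolve the requested path and refuse it if it ends up outside the directory you meant to serve. The function name resolve_inside and the directory /srv/app/uploads are hypothetical, and a real application will likely need more checks than this:

```python
from pathlib import Path


def resolve_inside(base_dir, user_supplied_name):
    """Return the resolved path, or raise ValueError if it escapes base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses '..' segments and follows symlinks.
    candidate = (base / user_supplied_name).resolve()
    if base not in candidate.parents:
        raise ValueError(f"{user_supplied_name!r} is outside {base}")
    return candidate


# '/srv/app/uploads' is a hypothetical directory; it need not exist for the check.
try:
    resolve_inside("/srv/app/uploads", "../../../configuration/settings.yaml")
    escaped = True
except ValueError as exc:
    escaped = False
    print(f"Rejected: {exc}")
```

An ordinary name such as report.txt passes the check and comes back as a full path inside the base directory.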

Keep learning

The following related resources will help you delve more deeply into this topic:

  • All the details about the subprocess library are in the official documentation
  • Our article on Python concurrency explains more about processes and threads
  • Our section on using Unix shells may come in handy
  • Learn some basic Unix commands