This article is participating in Python Theme Month. See the link for details

Today we’ll talk about how to optimize your Python code, get your code to run Faster, and explain how to optimize your program by comparing different ways to do the same thing.

Easy to learn Python

One convenience of Python’s faster is reflected in the hands-on block, which is easy to get started. Therefore, many junior high schools now open Python programming language.

The reason python is so popular is that it is easy to use, such as Python’s Hello World and Java’s Hello World. When choosing a language for the first time, based only on Java and Python’s Hello World code below, I have a feeling that most people will choose something as simple as Python.

public class HelloWord{
    public static void main(String[] args){
        System.out.println("Hello, world!")}}Copy the code
print("hello world!")
Copy the code

Who is using Python

As you can see from the chart above which companies are using Python, it’s obvious that many of them are big companies like Google, Firefox and YaHoo, so there are thousands of programmers and a growing need to support Python every day.

What are the basic rules for optimizing code

Rule # 1 Good optimization is Don’t optimize

It may not be necessary to waste time on code optimization, as long as you buy fast enough hardware, which may be more expensive than spending time on code optimization.

Is the time ripe for optimization

Often when deciding whether to optimize, we need to ask ourselves whether the work is done, whether the functionality is implemented, and whether the unit tests cover testable code. It’s usually after these two things are done that we consider code optimizations. Then source level optimization, which is what we are going to share today. Next comes optimization at the build, compile, and run levels, where we need to understand the compiler or interpreter to optimize our code to compile and run.

Choose the right tools to measure whether our optimizations are working

Sometimes we optimize code, such as changing an algorithm that works, or writing a line or two less. So before we start tuning, we need to choose a tool that can measure how well our code is optimized.

  • cProfile
  • pstats
  • RunSnakeRun,SnakeViz

Level of optimization

Optimization is not limited to the level of code optimization, there are many levels of code optimization, such as design level optimization, software design level or architecture level optimization, which is time-consuming, so it is not necessary and few people will choose design level optimization. There is also algorithmic optimization, which is why many large factories need interviews.

  • Optimization of design level:
  • Algorithm and data structure level
sum = 0
N = 1 _000_000
for x in range(1,N+1) :sum += x
print(sum)
Copy the code
500000500000
Copy the code
# algorithm optimize
print(N * (1 + N)/2)
Copy the code
    500000500000.0
Copy the code
  • The source code level
  • Build level
  • Compile level
  • Runtime layer

Code optimization

Set up the environment

My environment is running on the Jupyter Notebook and checking the performance of the code with %timeit. Is a built-in widget in the Python standard library that can quickly test the performance of small pieces of code. The introduction of %timeit is a numerical illustration of which implementation is less time-consuming and more efficient

%timeit
Copy the code

Create a function ultimate_answer_to_life

def ultimate_answer_to_life() :
    return 42
Copy the code

When calling a function, simply place %timeit before the function.

%timeit ultimate_answer_to_life()
Copy the code
36.6 ns ± 0.297 ns per loop (mean ± std.dev. of 7 runs, 10000000 loops each)Copy the code
  • 1 second (s) = 1000 ms (ms)
  • 1 ms = 1000 microseconds (US)
  • 1 microsecond (US) = 1000 nanoseconds (ns)
  • 1 nanosecond (ns) = 1000 picoseconds (ps)

Before we start, we need to understand these units of time, and how they translate to each other.

Count the number of elements in the collection

how_many = 0
ONE_MILLION_ELEMENTS = range(1 _000_000)
Copy the code
def count_element_in_a_list(how_many) :
    ONE_MILLION_ELEMENTS = range(1 _000_000)
    for element in ONE_MILLION_ELEMENTS:
        how_many += 1
# print(how_many)
Copy the code

This method increments the how_many variable by one by looping through the collection, although it does the trick of counting the number of elements in the collection. But that’s a clumsy way to do statistics.

%timeit count_element_in_a_list(how_many)
Copy the code
28.8 ms ± 202 µs per loop (mean ± std.dev. Of 7 runs, 10 loops each)Copy the code

The next step is to hand over the task of counting the elements in the set to Len, a built-in Python function. Efficiency has been dramatically improved from ms to ns.

%timeit len(ONE_MILLION_ELEMENTS)
Copy the code
39.5 ns ± 0.258 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each)Copy the code

Use built-in Python functions whenever possible, but not in all cases where built-in functions are the best choice. Take a look at the following examples.

Filter the collection

def filter_a_list() :
    output = []
    for element in ONE_MILLION_ELEMENTS:
        if element % 2:
            output.append(element)
Copy the code

Choosing all even numbers from a set,

%timeit filter_a_list()
Copy the code
482 ms ± 149 µs per loop (mean ± std.dev. Of 7 runs, 10 loops each)Copy the code

This time we use python’s built-in filter function to filter. In Python 2 it returns a list and in Python 3 it returns an iterator.

%timeit list(filter(lambda x: x % 2, ONE_MILLION_ELEMENTS))
Copy the code
62.1 ms ± 180 µs per loop (mean ± std.dev. Of 7 runs, 10 loops each)Copy the code

The filter function is slower than the for loop because the filter returns the value of the list function. The list function makes a small contribution to the elapsed time and reduces the performance a little bit. The best approach is the following one, which is theoretically 75% faster

%timeit [item for item in ONE_MILLION_ELEMENTS if item % 2 ]
Copy the code
34.6 ms ± 172 µs per loop (mean ± std.dev. Of 7 runs, 10 loops each)Copy the code

Permission 还是 forgiveness

There are two different ways to judge whether an object or dict has a property. If there is one, this will output it. It would have given what kind of Permission and forgiveness is. Using code to explain what is Permission and what is forgivess feels like a precaution and a remedy afterwards.

class Foo(object) :
    hello = 'world'
foo = Foo()
Copy the code
Permission
def hasattr_permissions_or_forgiveness() :
    if hasattr(foo, 'hello'):
        foo.hello
Copy the code
%timeit hasattr_permissions_or_forgiveness()
Copy the code
89.9 ns ± 0.229 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each)Copy the code

Permission is the ability to correctly approve the property, determine if it exists in the class, and then perform operations on the property.

Forgiveness
def try_catch_permission_or_forgiveness() :
    try:
        foo.hello
    except AttributeError:
        pass
Copy the code
%timeit try_catch_permission_or_forgiveness()
Copy the code
56.4 ns ± 0.123 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each) 56.4 ns ± 0.123 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each)Copy the code

Forgiveness does not test for this property first. Instead, exceptions are thrown when errors occur with the property. This approach is superior to Permission.

Forgiveness is about twice faster than permission

def hasattr_permissions() :
    if (hasattr(foo,'foo') and hasattr(foo,'bar') and hasattr(foo,'baz')):
        foo.foo
        foo.bar
        foo.baz
Copy the code
%timeit hasattr_permissions()
Copy the code
75.7 ns ± 0.182 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each)Copy the code
def try_catch_forgiveness() :
    try:
        foo.foo
        foo.bar
        foo.baz
    except AttributeError:
        pass
        
Copy the code
%timeit try_catch_forgiveness()
Copy the code
293 ns ± 1.88 ns per loop (mean ± std.dev. Of 7 runs, 1000000 loops each)Copy the code

When there are more category attributes to judge, the advantages of Forvieness become more obvious. However, this approach is suitable for when we have a high degree of confidence in this attribute of the category, while when we have a low degree of confidence, it is more efficient to have permission

Determines whether an element is a member of a collection

def check_number(number) :
    for element in ONE_MILLION_ELEMENTS:
        if element == number:
            return True
    return False
Copy the code
%timeit check_number(5000)
Copy the code
112 µs ± 75.3 ns per loop (mean ± std.dev. Of 7 runs, 10000 loops each)Copy the code
%timeit 5000 in ONE_MILLION_ELEMENTS
Copy the code
58.6 ns ± 0.255 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each)Copy the code
%timeit 100 in ONE_MILLION_ELEMENTS
Copy the code
50.3 ns ± 0.136 ns per loop (mean ± std.dev. Of 7 runs, 10000000 loops each)Copy the code

Obviously, it’s more straightforward to use element in collections as a way of determining whether or not an element is in a collection, rather than going through every element, but it’s very efficient if you’re trying to figure out where elements are at the top of the collection, but if you’re trying to figure out where elements are at the end of the collection, It also takes a long time.

If we have some knowledge of data structures, we know that data structures like set are relatively fast to search, so we can define sets using data structures that are optimized for search, such as set or dict, so that the search time is not too long even if the element is located later.

In [1]: MILLION_NUMBERS = 1 _000_000


In [3]: MILLION_SET = set(range(MILLION_NUMBERS))

In [4]: %timeit 100 in  MILLION_SET
47.3Ns -3.38Ns per loop (mean ± std.dev.of7 runs, 10000000 loops each)

In [5]: %timeit 999999 in  MILLION_SET
62.9Ns -2.12Ns per loop (mean ± std.dev.of7 runs, 10000000 loops each)

Copy the code

List to heavy

def remove_duplicates() :
    unique = []
    for element in range(10000) :if element not in unique:
            unique.append(element)
Copy the code
%timeit remove_duplicates()
Copy the code
380 ms ± 5.14 ms per loop (mean ± std.dev. Of 7 runs, 1 loop each)Copy the code
%timeit set(range(10000))
Copy the code
187 µs ± 1.76 µs per loop (mean ± std.dev. Of 7 runs, 10000 loops each)Copy the code

For list de-duplication, a relatively simple and intuitive way is to use set to wrap the collection, which is simple and reliable.

List ordering

In [7]: %timeit sorted(range(MILLION_NUMBERS))
45.4Ms + / -2Ms per loop (mean ± std.dev.of7 runs, 10 loops each)

In [9]: %timeit list(range(MILLION_NUMBERS)).sort()
45.1Ms + / -2.8Ms per loop (mean ± std.dev.of7 runs, 10 loops each)
Copy the code

It’s not that different, but if you take the list out of the way, it’s clear that calling sort for a collection is much faster than wrapping it with sorted.

1000 operations are better than calling a function 1000 times

In [10] :def square(num) :. :return num**2. : In [11]: %timeit [square(i) for i in range(1000)]
437(including s -28.1µs per loop (mean ± std.dev.of7 runs, 1000 loops each)

In [12] :def compute_squares() :. :return [x**2 for x in range(1000)]
In [15]: %timeit compute_squares()
352(including s -20.1µs per loop (mean ± std.dev.of7 runs, 1000 loops each)

Copy the code

Calling a function multiple times to perform the same operation, instead of executing the function once, it is faster to perform multiple operations in this function.

Checks if a variable is True

if variable == True # 35.8 ns

if variable is True # 28.7 ns

if variable # 20.6 ns
Copy the code

Checks if a variable is False

if variable == False # 35.1 ns

if variable is False # 26.9 ns

if not variable # 19.8 ns

Copy the code

You’ve probably found that if variable and if not variable are the best ways to determine whether a variable is True or False, because it doesn’t make any extra judgment, it’s True when the variable is True or 1 or when there’s an array of elements. So if variable is the fastest.

Determine the length of a set

if list(a_list) == 0:# 91.7 ns

if a_list == []: # 56.3 ns

if not a_list: # 32.4 ns
Copy the code

Define a function

In [19] :def greet(name) :. :print(f"hi {name}")
    ...: 

In [20]: greet = lambda name: print(f"hi {name}")
Copy the code
  • usedefTo define a function and passlambdaTo define a function, assign the anonymous function to a variablegreetThis way. Both methods take the same amount of time to create functions, which means that the code is compiled in the same machine code. Lambda is still popular, and can be used as a one-line function to look neat, as a callback function, or as an argument.

Create lists and dictionaries

In [23]: %timeit list(a)119Ns -9.59Ns per loop (mean ± std.dev.of7 runs, 10000000 loops each)

In [24]: %timeit []
33.3Ns -6Ns per loop (mean ± std.dev.of7 runs, 10000000 loops each)
Copy the code

Because list() creates a list this way, [] creates a list much faster than list() because [] doesn’t have to parse functions, which is why [] is faster than list().

readability

Although we think about performance all the time in coding, we think about readability in addition to performance.

In [27]: a=1

In [28]: b=2

In [29]: c=3

In [30]: d=5

In [31]: e=6

Copy the code

In [32]: a,b,c,d,e = 1.2.3.5.6
Copy the code

Although the latter is more efficient, it is not as desirable as readability.


In [34] :def squares(num_list) :
    ...:     output = []
    ...:     for ele innum_list: ... : output.append(ele*ele) ... :returnoutput ... : In [35] :def squares_faster(num_list) :. : output = [] ... : append = output.append ... :for ele innum_list: ... : append(ele*ele) ... :returnoutput ... :Copy the code

Although the second method is faster, not so much because of readability, it is not as readable as the previous one because append is not easy to understand. How to write Faster Python code today is to get you started. I hope we can do more work on it. If you like this article, please like it and share it with us.