Some advice for fellow Python developers | Python Theme Month

This article is participating in Python Theme Month. See the link to the event for more details.

Preface

I’ve been working as a Python developer for several years and have always wanted to write a summary of Python, but I never knew where to start. As my company’s projects keep moving to Java, I have a vague feeling that my Python career is coming to an end, so I’ll take a moment in this post to summarize some of what I’ve learned working with Python.

Main text

Let’s start with a Python 3 cheat sheet. I can’t remember where I saved it from.

About version selection

As of July 7, 2021, the latest Python version is Python 3.9.6. I recommend using the latest version directly. If you have a project on an earlier version, that’s not a big deal: the differences amount to a few syntax changes. To get started and learn Python, I recommend browsing the official website.

Python’s official website
Official Documentation for Python 3.9

Of course, if the official documentation doesn’t suit the reading habits of Chinese speakers, I recommend Liao Xuefeng’s Python tutorial.

It should be noted that on 1 January 2020 Python2 updates were officially discontinued.

About Pythonic

"Pythonic" is a term of praise for Python-style code: code that takes full advantage of the features of the Python language to be clean, readable, and often fast. It can also mean that an API conforms to the programming style of experienced Python developers.

Pythonic is how Python developers describe code that conforms to a particular style. This style is neither a strict specification nor a rule the compiler imposes on the developer. It is something the community has developed over years of working together in the Python language. Python developers don’t like complexity; they prefer intuitive, concise, readable code.
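
As a small illustration of the difference, here is the same task written in a C-like style and in a more Pythonic style. This is only a sketch; the data and variable names are made up for the example.

```python
# Hypothetical data, for illustration only
names = ["apple", "banana", "pear"]
prices = [3, 1, 2]

# C-like style: manual index bookkeeping
pairs_c_style = []
i = 0
while i < len(names):
    pairs_c_style.append((names[i], prices[i]))
    i += 1

# Pythonic style: let zip() pair the elements up
pairs_pythonic = list(zip(names, prices))

# Swapping two values without a temporary variable
a, b = 1, 2
a, b = b, a

print(pairs_pythonic)  # [('apple', 3), ('banana', 1), ('pear', 2)]
print(a, b)            # 2 1
```

Both versions produce the same list; the second simply states the intent more directly.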

About IPython

IPython is an interactive interpreter based on Python. IPython provides more powerful editing and interaction capabilities than the native Python Shell.

I recommend using IPython to try out code while learning Python: you can quickly verify results and test ideas. Jupyter is a web-based interactive development tool built on IPython.

IPython website
Jupyter website

If you’re a web developer, IPython can meet your everyday needs. If you’re a data scientist, use Jupyter directly; many data scientists use Jupyter as their preferred development tool.

About list/dictionary/set comprehensions

Once you are fluent with the various comprehensions, you can solve most map/reduce/filter problems.

# Filter the numbers from 0 to 9 that are divisible by 3
[i for i in range(10) if i % 3 == 0]  # [0, 3, 6, 9]

my_list = ['apple', 'banana', 'grapes', 'pear']

# Quickly build a dictionary of index -> element
{k: v for k, v in enumerate(my_list)}  # {0: 'apple', 1: 'banana', 2: 'grapes', 3: 'pear'}

# Quickly collect the distinct string lengths in a list
{len(s) for s in ['a', 'is', 'with', 'if', 'file', 'exception']}  # {1, 2, 4, 9}

# Generator expressions use the same syntax as list comprehensions, except that they return a generator object
(i for i in range(10))  # <generator object <genexpr> at 0x103d808e0>

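
To make the link to map/reduce/filter concrete, the sketch below writes the same computation both ways. The list `nums` is made up for the example. Comprehensions have no direct equivalent of `functools.reduce`, so a built-in such as `sum()` fed by a generator expression usually takes its place.

```python
from functools import reduce

nums = [1, 2, 3, 4, 5, 6]  # sample data, made up for this example

# map + filter: square the even numbers
squares_functional = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, nums)))

# The same thing as a list comprehension
squares_comprehension = [n * n for n in nums if n % 2 == 0]

# reduce: add everything up
total_reduce = reduce(lambda acc, n: acc + n, squares_functional, 0)

# A generator expression passed to sum() does the same job
total_sum = sum(n * n for n in nums if n % 2 == 0)

print(squares_comprehension)  # [4, 16, 36]
print(total_sum)              # 56
```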

For sequences with a large amount of data, prefer a generator expression over a list comprehension, and use its lazy evaluation to process the data piece by piece.

Generators are essentially iterators, but a special kind. A list stores all of its elements in memory at once, whereas a generator produces each element only when it is requested.

a = [i for i in range(10)] # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = (i for i in range(10)) # <generator object <genexpr> at 0x102a22888>

A generator does not support indexing. To retrieve elements, call the next function, which returns them one at a time in order.

# Generator gets data
next(b)

PS: A for loop accesses lists, generators, and other iterables through an iterator, so they all behave the same way in a for loop.
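
The difference between eager and lazy evaluation can be observed directly. The following is a rough sketch: `sys.getsizeof` reports only the size of the container object itself, and the range size is arbitrary, but it is enough to show that the generator does not grow with the data.

```python
import sys

big_list = [i for i in range(100_000)]  # all 100,000 integers exist in memory now
big_gen = (i for i in range(100_000))   # nothing has been computed yet

# The generator object stays small no matter how long the range is
print(sys.getsizeof(big_list) > sys.getsizeof(big_gen))  # True

small_gen = (i * 10 for i in range(3))
print(next(small_gen))  # 0
print(next(small_gen))  # 10
print(list(small_gen))  # [20]  (only the remaining element)

# A generator can be consumed only once; after that it is exhausted
print(next(small_gen, "exhausted"))  # exhausted
```

Because a generator is single-use, build a new one (or use a list) if the data needs to be traversed more than once.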

About getting an element index in a loop

I highly recommend enumerate, a Python built-in that wraps any iterable in an iterator that returns the element’s index plus the element itself on each step.

my_list = ['apple', 'banana', 'grapes', 'pear']

# Not recommended
for i in range(len(my_list)):
    print(i, my_list[i])

# Recommended
for counter, value in enumerate(my_list):
    print(counter, value)
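
enumerate also accepts an optional `start` argument, which is handy when the numbering should begin at 1 (for example, when printing line numbers). A short sketch using the same sample list:

```python
my_list = ['apple', 'banana', 'grapes', 'pear']

# Start counting from 1 instead of 0
for position, fruit in enumerate(my_list, start=1):
    print(position, fruit)

# enumerate returns a lazy iterator; wrap it in list() to see all the pairs
numbered = list(enumerate(my_list, start=1))
print(numbered)  # [(1, 'apple'), (2, 'banana'), (3, 'grapes'), (4, 'pear')]
```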

About dictionary merging

Suppose two dictionaries exist and we want to merge them into a new dictionary. There are two common ways:

Unpacking with the ** operator
The dictionary update method

a_dict = {"a": 1, "b": 1}
b_dict = {"b": 2, "c": 2}

# Unpacking with **
c_dict = {**a_dict, **b_dict}

# The dictionary update method
c_dict = {}
c_dict.update(a_dict)
c_dict.update(b_dict)

# Another way
import itertools
c_dict = dict(itertools.chain(a_dict.items(), b_dict.items()))

# In all of the merges above, the value from the later dictionary overrides the earlier one for duplicate keys
# With ChainMap below, the earlier dictionary takes precedence instead
from collections import ChainMap
c_dict = ChainMap(a_dict, b_dict)

Note: a ChainMap takes multiple dictionaries and logically transforms them into a single dictionary. However, the dictionaries aren’t really merged; the ChainMap class just creates an internal list of the dictionaries and redefines some common dictionary operations to traverse the list.
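
The practical consequence of "not really merged" can be seen in a short experiment. This sketch reuses the two sample dictionaries from above and compares a merged copy with a ChainMap view:

```python
from collections import ChainMap

a_dict = {"a": 1, "b": 1}
b_dict = {"b": 2, "c": 2}

merged = {**a_dict, **b_dict}       # a real copy; the later dictionary wins
chained = ChainMap(a_dict, b_dict)  # a view; the first dictionary wins

print(merged["b"])   # 2
print(chained["b"])  # 1

# Changing a source dictionary is visible through the ChainMap but not the copy
b_dict["c"] = 99
print(merged["c"])   # 2
print(chained["c"])  # 99

# dict() turns the view into an independent, ordinary dictionary
snapshot = dict(chained)
print(snapshot == {"a": 1, "b": 1, "c": 99})  # True
```

On Python 3.9 and later, `a_dict | b_dict` is another way to build a merged copy in which the later dictionary wins.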

About closures/decorators

An interesting closure exercise (one I was asked in an interview): implement addition so that it is called as add(1)(2), and keep track of how many times the function has been called. Note that:

For the inner function to rebind a local variable of the outer function, it must declare that variable with the nonlocal statement.

def add(a):
    count = 1

    def fun(b):
        nonlocal count
        print(f"func called count {count}")
        count += 1
        return a + b
    return fun

add(1)(2)  # 3
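
To see why nonlocal matters, the sketch below builds two counters, one with the declaration and one without. The function names are made up for this illustration.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # without this line, "count += 1" raises UnboundLocalError
        count += 1
        return count

    return increment


counter_a = make_counter()
counter_b = make_counter()

print(counter_a())  # 1
print(counter_a())  # 2
print(counter_b())  # 1  (each closure keeps its own count)


def make_broken_counter():
    count = 0

    def increment():
        count += 1  # the assignment makes count local to increment()
        return count

    return increment


try:
    make_broken_counter()()
except UnboundLocalError as exc:
    print("failed as expected:", type(exc).__name__)
```

Reading an outer variable works without nonlocal; the declaration is needed only when the inner function assigns to it.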

The next example implements a logging decorator that accepts optional parameters. Note the following:

The inner function needs to be decorated with @wraps(func) to preserve the metadata of the original function.

from functools import wraps, partial
import logging

def logged(func=None, *, level=logging.DEBUG, name=None, message=None):
    if func is None:
        return partial(logged, level=level, name=name, message=message)

    logname = name if name else func.__module__
    log = logging.getLogger(logname)
    logmsg = message if message else func.__name__

    @wraps(func)
    def wrapper(*args, **kwargs):
        log.log(level, logmsg)
        return func(*args, **kwargs)

    return wrapper

# Example use
@logged
def add(x, y):
    return x + y

@logged(level=logging.CRITICAL, name='example')
def spam():
    print('Spam!')
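
What @wraps actually preserves is easiest to see side by side. The following sketch uses two deliberately trivial decorators (their names are made up for the example) that differ only in whether they apply `functools.wraps`:

```python
from functools import wraps


def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wrapping_decorator(func):
    @wraps(func)  # copies __name__, __doc__ and other metadata onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def add_plain(x, y):
    """Add two numbers."""
    return x + y


@wrapping_decorator
def add_wrapped(x, y):
    """Add two numbers."""
    return x + y


print(add_plain.__name__, add_plain.__doc__)      # wrapper None
print(add_wrapped.__name__, add_wrapped.__doc__)  # add_wrapped Add two numbers.
print(add_wrapped.__wrapped__(1, 2))              # 3  (the undecorated function)
```

Without @wraps, tools that rely on function metadata (help(), debuggers, documentation generators) see the wrapper instead of the original function.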

About reading large files

A classic interview question tests how you read files in Python, for example:

How do you read an 8 GB file on a computer with less than 8 GB of memory?

For details of option 1, see Stack Overflow:

Implement the file-reading iterator manually and specify how much to read on each iteration.

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece. Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


with open('really_big_file.dat', "rb") as f:
    for piece in read_in_chunks(f):
        # <do something with piece>; process() is a placeholder for your own function
        process(piece)

For details of option 2, see Stack Overflow:

The with statement handles opening and closing the file automatically, even when an exception is raised. A file object is itself an iterator, so "for line in f" reads one line at a time through buffered I/O, and you don’t have to worry about the size of the file.

with open("file_name", "rb") as f:
    for line in f:
        # <do something with line>; process() is a placeholder for your own function
        process(line)

Summary: both methods use iterators to process the file lazily, one block at a time. PS: opening the file in binary mode generally makes reads faster, since no text decoding is performed.
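
As one concrete use of chunked reading, the sketch below hashes a file without loading it into memory. The function name `file_sha256`, the chunk sizes, and the small temporary file standing in for the "really big" one are all assumptions made for this example. It uses the two-argument form of `iter()`, which keeps calling a function until it returns a sentinel value.

```python
import hashlib
import os
import tempfile
from functools import partial


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file chunk by chunk so memory use stays near chunk_size."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size)
        # until it returns b"" at end of file
        for chunk in iter(partial(f.read, chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Build a small stand-in for the "really big" file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789" * 1000)
    sample_path = tmp.name

chunked = file_sha256(sample_path, chunk_size=64)
whole = hashlib.sha256(b"0123456789" * 1000).hexdigest()
print(chunked == whole)  # True

os.remove(sample_path)
```

The result is identical to hashing the whole content at once, but only one chunk is held in memory at any time.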

About some standard libraries

Here are some of the libraries that I think are interesting, and you can check the source code for more information.

collections provides additional data types:
1. namedtuple: generates a tuple subclass whose elements can be accessed by name
2. deque: a double-ended queue with fast appends and pops at both ends
3. Counter: a counter, used mainly for counting
4. OrderedDict: an ordered dictionary
5. defaultdict: a dictionary with default values

from collections import Counter, OrderedDict, namedtuple

# OrderedDict: a dictionary that remembers insertion order
o = OrderedDict()
o["x"] = 1
print(o)  # OrderedDict([('x', 1)])

# namedtuple: give names to the elements of a tuple
rectangle = namedtuple("rectangle", ["a", "b"])
r1 = rectangle(3, 4)
print(r1)  # rectangle(a=3, b=4)
print(r1.a)  # 3

#...

# Counter is useful for counting list elements; a similar use is word frequency statistics
wc = Counter("hello world")
print(wc)  # Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})
print(wc["h"])  # 1
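
The two types from the list that are not demonstrated above, deque and defaultdict, can be sketched in the same spirit. The sample values are made up for the illustration.

```python
from collections import defaultdict, deque

# deque: fast appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)      # deque([0, 1, 2, 3])
queue.append(4)          # deque([0, 1, 2, 3, 4])
first = queue.popleft()  # 0
last = queue.pop()       # 4
print(queue)             # deque([1, 2, 3])

# maxlen keeps only the most recent items
recent = deque(maxlen=2)
for n in range(5):
    recent.append(n)
print(recent)            # deque([3, 4], maxlen=2)

# defaultdict: missing keys get a default value from the factory
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
print(dict(groups))      # {'a': ['apple', 'avocado'], 'b': ['banana']}
```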

heapq: Python’s heap queue implementation

# Find the K largest or smallest elements of an array; with heapq this takes two lines
import heapq
nums = [14, 20, 5, 28, 1, 21, 6, 12, 27, 19]
# Top K elements
heapq.nlargest(3, nums)  # [28, 27, 21]
heapq.nsmallest(3, nums)  # [1, 5, 6]
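
Two further heapq features are worth a quick sketch: nlargest and nsmallest accept a `key` function, and the lower-level functions maintain a plain list as a min-heap. The records below are hypothetical.

```python
import heapq

# Hypothetical records, for illustration only
scores = [
    {"name": "ann", "score": 82},
    {"name": "bob", "score": 95},
    {"name": "cat", "score": 67},
]

# key= works the same way it does for sorted()
top_two = heapq.nlargest(2, scores, key=lambda s: s["score"])
print([s["name"] for s in top_two])  # ['bob', 'ann']

# The underlying heap: heapify in place, then pop the smallest item each time
nums = [14, 20, 5, 28, 1]
heapq.heapify(nums)
heapq.heappush(nums, 3)
print(heapq.heappop(nums))  # 1
print(heapq.heappop(nums))  # 3
print(nums[0])              # 5  (the smallest item is always at index 0)
```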

Some book recommendations

Since some of the books are not publicly available in electronic form, there are no links here.

Python3-Cookbook
Fluent Python
The Python 3 Standard Library by Example
Effective Python: 59 Specific Ways to Write Better Python

One more thing: read the official documentation.

In closing

Let’s finish with a Python poem.

import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
