Python is easy to learn. You’re probably reading this because your code works correctly and you want it to run faster. You like that you can change the code easily and iterate on your ideas. But the trade-off between ease of implementation and speed of execution is a well-known and regrettable phenomenon, and it is a problem that can be solved.

Some people want to make sequential code execute faster. Some people need to take advantage of multi-core architectures, clusters, or graphics processing units to solve their problems. Some people need scalable systems that can handle more or less work as funds and demand allow, without losing reliability. Some people realize that their programming techniques, often carried over from other languages, may not be as natural as they could be.

We’ll cover all of these topics in this article, giving sensible guidance for understanding bottlenecks and proposing more efficient and scalable solutions. We’ll also include war stories from those who went before you, so you don’t have to repeat their mistakes.

Python is well suited for rapid development, production deployment, and scalable systems. The Python ecosystem is full of people who can help you scale, freeing up more time for more challenging work.

Python 2.7

Python 2.7 is the dominant version of Python in scientific and engineering computing. Under *nix environments (usually Linux or Mac), 64-bit versions dominate; 64-bit gives you a much wider range of RAM addressing. *nix also lets you build applications whose behavior, deployment, and configuration are easily understood by others.

If you’re a Windows user, buckle up. Most of the code we show will work, but some things are operating-system specific, and you’ll have to explore solutions under Windows. The biggest difficulty Windows users face is module installation: searching sites like StackOverflow should help you find the answers you need. If you’re on Windows, running a virtual machine with Linux installed (for example, under VirtualBox) may help you experiment more freely.

Windows users should definitely look at the packaged solutions offered through Python distributions like Anaconda, Canopy, Python(x,y), or Sage. These distributions will also make life a lot easier for Linux and Mac users.

Migrate to Python 3

Python 3 is the future of Python, and everyone is migrating to it. Although Python 2.7 will be with us for many years to come (some installations still run Python 2.4, released in 2004), its retirement date has been set for 2020.

Upgrading to Python 3.3+ has been a headache for Python library developers, and people have been slow to port their code (with good reason), so the switch to Python 3 has been slow. This is mainly because converting an application from Python 2’s string and Unicode data types to Python 3’s Unicode strings and bytes is considered too complicated.

In general, when you need to reproduce results based on a set of trusted libraries, you don’t want to be on the cutting edge of dangerous technology. Developers of high-performance Python are more likely to use and trust Python 2.7 in the coming years.

Most of the code in this article runs in Python 3.3+ with only minor modifications (the most obvious change is that print goes from being a statement to being a function). In some places, we’ll specifically point out performance improvements in Python 3.3+. One area you might want to watch is that / between integers performs integer division in Python 2.7 but float (true) division in Python 3. Of course, as a good developer, your well-written unit tests should already be exercising your critical code paths, so if this affects your code, your unit tests should already be warning you.
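For example, a minimal illustration of the difference:

```python
print(5 / 2)   # Python 2.7 prints 2 (integer division); Python 3 prints 2.5
print(5 // 2)  # prints 2 in both versions (explicit floor division)
```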

SciPy and numpy have been compatible with Python 3 since 2010. matplotlib has been compatible since 2012, scikit-learn since 2013, NLTK since 2014, and Django since 2013. Migration notes for these libraries can be found in their respective repositories and newsgroups. If you also have older code to port to Python 3, it’s worth reviewing how these libraries migrated.

We encourage you to start new projects in Python 3.3+, but be wary of libraries that have only recently been ported and have few users, as tracking down bugs will be harder. It’s wise to make your code compatible with Python 3.3+ (learn about the __future__ imports) so future upgrades will be easier.
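For example, a couple of __future__ imports make Python 2.7 behave like Python 3 for the two changes most likely to bite:

```python
from __future__ import division        # / is true division, as in Python 3
from __future__ import print_function  # print() is a function, as in Python 3

print(5 / 2)  # 2.5, even under Python 2.7
```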

There are two good reference manuals: Porting Python 2 Code to Python 3 and Porting To Python 3: An In-depth Guide. Distributions like Anaconda or Canopy let you run Python 2 and Python 3 at the same time — this should make your migration a little easier.

After reading this article you will be able to answer the following questions

What are the elements of computer architecture?

What are the common computer architectures?

What is the abstract representation of computer architecture in Python?

What are the barriers to implementing high-performance Python code?

What are the types of performance problems?

Computer programming can be thought of as moving and transforming data in a particular way to get a result. However, these operations have a time cost. Thus, high-performance programming can be thought of as minimizing the cost of those operations by reducing overhead (such as writing more efficient code) or changing the way they are done (such as finding a more suitable algorithm).

The movement of data takes place on the actual hardware, and we can learn more about the hardware details by reducing the code overhead. This exercise may seem trivial, because Python does a lot to abstract away our direct manipulation of the hardware. However, by understanding how data moves at the hardware level and how Python moves data at the abstract level, you’ll learn something about writing high-performance Python programs.

1.1 Basic computer systems

The underlying components of a computer can be divided into three basic parts: the computing units, the storage units, and the connections between them. Each has a characteristic property that helps us understand it: a computing unit is characterized by how many computations it can perform per second, a storage unit by how much data it can hold and how fast we can read from and write to it, and a connection by how fast it can move data from one place to another.

With these basic units, we can describe a standard workstation at various levels of complexity. For example, a standard workstation can be thought of as having a central processing unit (CPU) as its computing unit, two separate storage units, random-access memory (RAM) and a hard drive (each with different capacities and read/write speeds), and finally a bus connecting all of this together. However, we can also drill down into the CPU and discover multiple storage units inside it: the L1, L2, and sometimes even L3 and L4 caches, which are small (from a few kilobytes to tens of megabytes) but very fast. These additional storage units connect to the CPU through a special bus called the back-side bus. Furthermore, new computer architectures often come with new configurations (for example, Intel’s Nehalem architecture replaced the front-side bus with Intel’s QuickPath Interconnect and restructured many connections). Finally, in this discussion we have neglected the network connection, which is a slow connection to potentially many other computing and storage units.

To help clarify these intricacies, let’s go through a brief description of the basic units.

1.1.1 Computing unit

The computing unit of a computer is its centerpiece: it has the ability to transform any set of bits it receives into another set of bits and to change the current state of a process. CPUs are the most common computing unit; however, graphics processing units (GPUs), originally designed to speed up computer graphics, are becoming more suitable for numerical computation because their intrinsically parallel mode of operation allows many calculations to run concurrently. Either way, a computing unit takes in a series of bits (for example, bits representing numbers) and outputs another set of bits (for example, bits representing the sum of those numbers). In addition to basic arithmetic operations on real numbers and bitwise operations on binary numbers, some computing units provide very specialized operations, such as the “fused multiply-add,” which takes three numbers A, B, and C and returns the value A * B + C.

The main properties of a computing unit are the number of operations it can perform per cycle and the number of cycles it can complete per second. The first is measured in instructions per cycle (IPC) [1], while the second is measured by its clock speed. These two measures always compete with each other when new computing units are designed. For example, Intel’s Core series has very high IPC but a lower clock speed, while the Pentium 4 chips were the opposite. GPUs, for their part, have high IPC and clock speeds, but they suffer from other problems that we’ll cover later.
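As a rough sketch of why the two measures trade off (the numbers here are assumed for illustration, not taken from real datasheets), attainable throughput is the product of the two:

```python
# throughput (operations/second) = IPC * clock speed
high_ipc_chip   = 4 * 2.0e9  # IPC of 4 at 2.0 GHz -> 8.0e9 ops/s
high_clock_chip = 1 * 3.8e9  # IPC of 1 at 3.8 GHz -> 3.8e9 ops/s
```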

Moreover, an increase in clock speed immediately speeds up every program running on that computing unit (because programs can perform more operations per second), while an increase in IPC has a marked effect through vector computation. Vector computation means supplying a CPU with multiple pieces of data at a time that can all be operated on simultaneously. This kind of CPU instruction is known as SIMD (single instruction, multiple data).

Overall, computing units have made rather limited progress over the past decade (Figure 1-1). Improvements in clock speed and IPC have stagnated as transistors approach their physical size limits. As a result, chipmakers have been relying on other means to achieve higher speeds, including hyperthreading, smarter out-of-order execution, and multi-core architectures.


Figure 1-1 Clock speed over time (data from CPU DB)

Hyperthreading presents a virtual second CPU to the host operating system (OS), and clever hardware logic tries to interleave two threads of instructions into the execution units of a single CPU. When the interleaving succeeds, it can give up to a 30% gain over a single thread. In general, this works well when the work of the two threads uses different types of execution units, for example when one operates on floating-point numbers and the other on integers.

Out-of-order execution allows the compiler to detect that some parts of a linear program sequence do not depend on the results of previous work, meaning that the two pieces of work can be executed in any order or simultaneously. As long as the results of both become available at the right point in time, the program continues to run correctly even if the computations happen in a different order than the program specifies. This lets some instructions execute while others are blocked (for example, waiting for a memory access), raising overall resource utilization.

Last but not least for the higher-level programmer is the prevalence of multi-core architectures. These architectures include multiple CPUs in the same unit, increasing total computing capability without running into the barriers to making each individual core faster. That’s why it’s hard today to find a computer with fewer than two cores, meaning two interconnected physical computing units. While this increases the total number of operations that can be performed per second, there are a number of complications to consider if you want to keep both units fully utilized.

Adding more cores to a CPU does not necessarily make a program run faster. This is governed by Amdahl’s law. Simply put, Amdahl’s law states that if a program designed to run on multiple cores has some execution path that must run on a single core, that path becomes the bottleneck, and the program ultimately cannot get faster by adding more cores.

For example, if we have a survey that requires 100 participants and takes 1 minute, we can complete the task in 100 minutes if we have only one questioner (who asks the first participant, waits for an answer, and then moves on to the next participant). The process of asking questions and waiting for answers is a sequential process. For a sequential process, we can only do one operation at a time, and subsequent operations must wait for previous operations to complete.

However, if we had two questioners, they could conduct interviews simultaneously and complete the task in 50 minutes. This is because the questioners don’t need to know anything about each other; there are no dependencies, so the whole task is easy to divide.

Adding more questioners speeds things up further, until we have 100 of them. At that point the whole process takes only 1 minute, limited by the time it takes one participant to answer the questions. Adding more questioners brings no further speedup, because the extra questioners have nothing to do: every participant is already being surveyed! At this point, the only way to reduce the overall time is to reduce the time an individual participant needs to complete the survey, which is the execution time of the sequential portion. Similarly, with CPUs we can add more cores until the task that must execute on a single core becomes the bottleneck. In other words, the bottleneck of any parallel computation ultimately falls on the sequential part of the task.
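A small sketch of the same idea as a formula (Amdahl’s law written out in Python, with an assumed 10% serial fraction):

```python
def amdahl_speedup(serial_fraction, n_cores):
    """Upper bound on speedup: the serial part cannot be
    parallelized, so it caps the overall gain."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# If 10% of a program must run serially, extra cores stop helping fast:
for cores in (1, 2, 4, 16, 256):
    print(cores, round(amdahl_speedup(0.10, cores), 2))
# -> 1.0, 1.82, 3.08, 6.4, 9.66 ... never more than 1 / 0.10 = 10x
```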

Furthermore, a major hurdle for Python in using multiple cores is Python’s global interpreter lock (GIL). The GIL ensures that a Python process can execute only one instruction at a time, regardless of how many cores it currently has available. This means that even when some Python code has access to multiple cores, only one core is running a Python instruction at any given moment. In the survey example, even if we had 100 questioners, only one could ask a question and receive an answer at a time, which rather defeats the purpose! This may look like a serious handicap, especially when the trend in computing is toward more units rather than faster ones. Fortunately, there are ways around the problem, such as the standard library’s multiprocessing module, technologies like numexpr and Cython, or distributed computing models.

1.1.2 Storage Unit

The memory unit of a computer is used to hold bits. These bits might represent variables in a program, or pixels in an image. The memory unit concept includes registers, RAM, and hard disks on the motherboard. The main difference between all these different types of storage is the speed at which they can read and write data. To complicate matters further, the speed at which data is read and written also depends on how the data is read and written.

For example, most storage units perform far better reading one large chunk of data than many small chunks (sequential versus random access). Think of the data in a storage unit as pages in a book: most units read and write faster when turning pages consecutively than when constantly jumping from one random page to another.

All storage units are affected to some extent, but different types of storage units are affected differently.

In addition to read/write speed, a storage unit also has a latency, which describes how long the device takes to find the data it needs. A spinning hard drive can have high latency because the disk must physically spin up to speed and the read head must move into position. RAM, by contrast, has low latency because everything is solid-state. Here is a brief description of the kinds of storage units commonly found in a standard workstation, in order of increasing read/write speed:

Spinning hard drive

Provides long-term storage that persists even when the computer is shut down. Read/write speeds are generally slow because the disk must physically spin and the head must move. Performance degrades under random access, but capacity is high (terabyte level).

Solid state drives

Long-term storage like a spinning hard drive, but with faster read/write speeds; capacity is smaller (gigabyte level).

RAM

Used to store application code and data (such as variables used). Has faster read and write speeds and good performance for random access, but is usually limited by capacity (GB level).

L1 / L2 cache

Extremely fast read and write speed. Data going into the CPU must pass through here. Small capacity (KB level).

Figure 1-2 shows the differences between the types of storage units available on the market today.

One clear trend is that read/write speed is inversely related to capacity: as speed goes up, capacity goes down. Because of this, many systems implement tiered storage: data starts out in full on the hard drive, part of it moves into RAM, and a much smaller subset moves into the L1/L2 cache. This layering lets programs keep data in different places depending on access-speed requirements. When trying to optimize a program’s storage access patterns, we are simply optimizing where data is placed, how it is laid out (to increase the number of sequential reads), and how many times it moves between locations. In addition, techniques such as asynchronous I/O and cache prefetching provide ways to make sure data is in place when it is needed without wasting computation time, since these processes can happen independently of other calculations!


Figure 1-2 Characteristic values of each type of storage unit (data from February 2014)

1.1.3 Communication layer

Finally, let’s look at how these basic units talk to each other. There are many modes of communication, but they are all variations on the same thing: the bus.

For example, the front-side bus is the connection between RAM and the L1/L2 cache. It moves data that is ready to be operated on by the processor into a staging area for computation, and moves results back out. There are other buses too, such as the external bus, the main route from hardware devices (such as hard drives and network cards) to the CPU and system memory. This bus is usually slower than the front-side bus.

In fact, many of the benefits of the L1/L2 cache come down to the faster bus: data that needs to be computed can be gathered in large chunks over the slow bus (connecting RAM to the cache) and then delivered to the CPU very quickly over the back-side bus (connecting the cache to the CPU), so the CPU can do more computation without waiting so long.

Likewise, many of the drawbacks of using a GPU come from the bus it sits on: since a GPU is typically an external device, it communicates over the PCI bus, which is much slower than the front-side bus. As a result, getting data into and out of a GPU is a taxing operation. The rise of heterogeneous architectures, computer architectures that put both CPU and GPU on the front-side bus, aims to reduce data transfer costs and make GPUs usable for computations that need to move a lot of data.

In addition to the communication paths inside a computer, the network can be thought of as yet another communication layer. This layer is more flexible than the ones discussed before; a network can connect directly to a storage device, such as network-attached storage (NAS), or to another computing node in a cluster. But network communication is generally much slower than the other kinds of communication discussed so far: the front-side bus can transfer tens of gigabytes per second, while a network may manage only tens of megabytes.

It should now be clear that the main property of a bus is its speed: how much data it can move in a given amount of time. This property is the product of two factors: how much data can move in one transfer (bus width) and how many transfers the bus can make per second (bus frequency). It is important to note that data moved in one transfer is always sequential: a chunk of data is read from memory and moved somewhere else. This is why a bus’s speed splits into these two factors, because the two independently affect different aspects of computation: a wide bus helps vectorized code (or any code that reads memory sequentially) by moving all the relevant data in one transfer, while a narrow bus with a high frequency helps code that must make frequent random memory accesses. Interestingly, these properties are set by the physical layout of the motherboard: when chips are close to each other, the physical links between them can be shorter, allowing higher transfer speeds, and the number of physical links determines the width of the bus (bandwidth really is a physical quantity!).
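As a back-of-the-envelope sketch (the numbers here are assumed for illustration only):

```python
# throughput = bytes per transfer * transfers per second
bus_width_bits = 64            # bits moved per transfer
bus_frequency = 200e6 * 4      # 200 MHz, quad-pumped: 4 transfers/cycle
throughput = bus_width_bits / 8 * bus_frequency
print(throughput / 1e9, "GB/s")  # -> 6.4 GB/s
```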

Since physical interfaces can be optimized for a particular application, it’s not surprising that there are hundreds of different types of connections. Figure 1-3 shows the bit rates of some common interfaces. Note that there is no mention at all of connection latency, which determines how long it takes a connection to respond to data requests (although latency is specific to a specific computer system, there are some basic limitations from the physical interface).



Figure 1-3 Connection speeds of common interfaces (from Leadbuffalo)

1.2 Putting the basic elements together

Understanding the basic building blocks of a computer is not enough to understand the problems of high-performance programming. The interaction and collaboration of all these components also introduces new complexity. This section looks at sample problems, describes the ideal solutions, and how Python implements them.

Warning: this section may seem disheartening, as most of the problems appear to show that Python is ill-suited to solving performance problems. That isn’t true, for two reasons. First, in all this talk of “high-performance computing” we have so far neglected one crucial element: the developer. What native Python lacks in performance, it makes up for immediately in speed of development. Second, throughout this article we’ll introduce modules and philosophies that help mitigate the problems described here. With both of these combined, we can keep the fast development mindset of Python while removing many of its performance constraints.

Ideal computing model and Python virtual machine

To better understand the elements of high-performance programming, let’s take a look at a simple code example for determining prime numbers:
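Below is a minimal sketch, consistent with the names sqrt_number and number_float used in the analysis that follows:

```python
import math

def check_prime(number):
    sqrt_number = math.sqrt(number)
    number_float = float(number)
    for i in range(2, int(sqrt_number) + 1):
        # number is not prime if any candidate divides it evenly
        if (number_float / i).is_integer():
            return False
    return True

print(check_prime(10000000))  # False
print(check_prime(10000019))  # True
```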



Let’s use an abstract computational model to analyze what actually happens when this code is run in Python. Because of the abstract model, we will ignore many of the details of the ideal computer and the way Python runs code. However, this is a good exercise before tackling a real problem: thinking about common components in an algorithm and how to best use them to solve a problem. By understanding what happens in Python ideally and in practice, we can bring our Python code closer to optimal.

1. Ideal computing model

At the start of the code, the value of number is stored in RAM. To compute sqrt_number and number_float, that value must travel into the CPU. Ideally, it would be sent only once: it would sit in the CPU’s L1/L2 cache, and the CPU would do the two calculations and send the results back to RAM for storage. This is the ideal scenario because we minimize reads of the value from RAM and instead rely on much faster reads from the L1/L2 cache. Furthermore, we minimize traffic through the front-side bus in favor of the faster back-side bus (which connects the CPU to its caches). Keeping data where it is needed and moving it as little as possible is crucial for optimization. The notion of “heavy data” refers to the time and effort required to move data around, which is exactly what we would like to avoid.

For the loop in the code, rather than sending i into the CPU over and over again, we would like to send number_float and several values of i into the CPU at once to check. This is possible because the CPU can vectorize operations at no extra cost, meaning it can perform multiple independent calculations at the same time. So we want to move number_float into the CPU cache along with as many values of i as the cache will hold. For each number_float/i pair, we perform a division and check whether the result is an integer, then send back a single signal indicating whether any of the results was indeed an integer. If one was, the function ends. If not, we move on to the next batch of calculations. In this way, for many values of i we only need to send back one result, rather than relying on the bus to return every value. This takes advantage of the CPU’s vectorization capability, that is, running one instruction on multiple pieces of data in a single clock cycle.

The concept of vector manipulation can be expressed in the following code:
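The sketch below follows that description; note (as discussed later) that it is illustrative pseudocode, since dividing a float by a list is not legal Python:

```python
import math

def check_prime(number):
    sqrt_number = math.sqrt(number)
    number_float = float(number)
    numbers = range(2, int(sqrt_number) + 1)
    for i in range(0, len(numbers), 5):
        # the next line is not valid Python: it sketches dividing one
        # float by five candidate divisors in a single vector operation
        result = (number_float / numbers[i:(i + 5)]).is_integer()
        if any(result):
            return False
    return True
```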



Here we make the program do the division and integer checks on five values of i at once. Properly vectorized, the CPU needs only one instruction to execute that line of code rather than operating on each i separately. Ideally, the any(result) operation would happen entirely inside the CPU without passing the data back to RAM.

2. Python VM

The Python interpreter does a lot of work to isolate the underlying computational elements. This frees the programmer from thinking about how to allocate memory for arrays, how to organize memory, and in what order to pass it to the CPU. This is one of Python’s strengths, allowing you to focus on the implementation of the algorithm. However, it comes at a huge performance cost.

The first thing to realize is that the Python core runs on a highly optimized set of instructions. The trick is getting Python to execute those instructions in the right order to get better performance. For example, it’s easy to see that, although both algorithms below have O(n) runtime, search_fast will be faster than search_slow because it aborts the loop early, skipping unnecessary calculations.
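Minimal sketches consistent with the two names:

```python
def search_fast(haystack, needle):
    for item in haystack:
        if item == needle:
            return True  # aborts as soon as the needle is found
    return False

def search_slow(haystack, needle):
    return_value = False
    for item in haystack:
        if item == needle:
            return_value = True  # keeps looping even after a match
    return return_value
```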



Looking for slow areas of code through performance analysis and looking for more efficient algorithms is essentially looking for these useless operations and removing them. The end result is the same, but the number of computations and transfers of data is significantly reduced.

One effect of the Python virtual machine’s abstraction layer is that vector operations are not directly available. Our original prime function loops over each value of i instead of combining several iterations into one vector operation. The vectorized code we sketched is not legal Python, because we cannot divide a float by a list. External libraries such as numpy solve this by adding vectorized mathematical operations.
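As a hedged sketch of what that can look like with numpy (applied to the same prime-checking task; the function name here is our own):

```python
import math
import numpy as np

def check_prime_vectorized(number):
    # one vectorized expression tests every candidate divisor at once
    candidates = np.arange(2, int(math.sqrt(number)) + 1)
    return not np.any(number % candidates == 0)

print(check_prime_vectorized(10000000))  # False
print(check_prime_vectorized(10000019))  # True
```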

Furthermore, Python’s abstractions get in the way of any optimization that relies on keeping the relevant data filled up in the L1/L2 cache for the next computation. There are many reasons for this, starting with the fact that Python objects are not laid out optimally in memory. Python is a garbage-collected language: memory is automatically allocated and freed as needed. This causes memory fragmentation that hurts transfers to the CPU cache. In addition, we never get the chance to change the layout of our data structures directly in memory, which means that one transfer on the bus may not contain all the information relevant to a computation, even though it might all have fit within the bus width.

The second problem is more fundamental and is rooted in Python’s dynamic types and the fact that Python is not compiled. As many C developers have learned over the years, the compiler is often smarter than you. When compiling static code, the compiler can do many things to change the memory layout of objects and to optimize the instructions the CPU runs. Python, however, is not compiled; worse, it has dynamic types, which means that any opportunity for algorithmic optimization is drastically harder to infer, since the code’s functionality can change at runtime. There are many ways to mitigate this, foremost among them the use of Cython, which compiles Python code and allows the user to give the compiler “hints” about how “dynamic” the code actually is.

Finally, the previously mentioned GIL hurts the performance of parallel code. For example, suppose we change the code to use multiple CPU cores, each given a chunk of the numbers from 2 to sqrtN. Each core can run the computation on its own chunk of data, and the cores can then compare results when they all finish. This seems like a good plan, and although we lose the ability to end the loop early, the number of checks per core drops as we use more cores (with M cores, each core does only sqrtN/M checks). However, because of the GIL, only one core can be used at a time. That means we’re still effectively running the code serially, without the early abort. We can avoid this by using the multiprocessing module instead of multiple threads, or by using Cython or foreign functions.
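A minimal sketch of this idea using the standard library’s multiprocessing module (the function names and chunking scheme here are our own assumptions, for illustration):

```python
import math
from multiprocessing import Pool

def has_factor_in(args):
    # runs in its own process, so the GIL does not serialize the work
    number, start, stop = args
    return any(number % i == 0 for i in range(start, stop))

def check_prime_parallel(number, n_workers=4):
    limit = int(math.sqrt(number)) + 1
    step = max(1, (limit - 2) // n_workers + 1)
    chunks = [(number, lo, min(lo + step, limit))
              for lo in range(2, limit, step)]
    with Pool(n_workers) as pool:
        return not any(pool.map(has_factor_in, chunks))

if __name__ == "__main__":
    print(check_prime_parallel(10000019))  # True
```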

1.3 Why Python

Python is highly expressive and easy to use — new developers will quickly discover that they can do a lot in a short time. Many Python libraries contain tools written in other languages that make it easy for Python to call other systems. For example, the SciKit-Learn machine learning system includes LIBLINEAR and LIBSVM (both written in C), while the Numpy library includes BLAS and other libraries written in C and Fortran. Therefore, Python code that uses these libraries correctly can indeed be as fast as C.

Python is often described as having “batteries included,” because it ships with many important and stable libraries, including:

Unicode and bytes

Built into the core language.

array

Memory-efficient arrays of primitive types.

math

Basic mathematical operations, including some simple statistical mathematics.

sqlite3

Contains the popular file-based SQL engine SQLite3.

collections

Various useful objects, including a double-ended queue, counters, and variants of dictionaries.

In addition to these language core libraries, there are a number of external libraries, including:

numpy

A numerical Python library (the cornerstone library for matrix computation).

scipy

A large collection of trusted scientific libraries, often wrapping widely respected C and Fortran libraries.

pandas

A data analysis library, similar to R’s data frames or an Excel spreadsheet, built on SciPy and numpy.

scikit-learn

Rapidly becoming the default machine learning library, built on SciPy.

biopython

A bioinformatics library, similar to BioPerl.

tornado

A library that provides a concurrency mechanism.

Database wrappers of all kinds

For communicating with virtually all databases, including Redis, MongoDB, HDF5, and SQL.

Various web development frameworks

Various high-performance systems for building websites, such as Django, Pyramid, Flask, and Tornado.

OpenCV

Bindings for computer vision.

Various API packages

For easy access to popular web APIs such as Google’s, Twitter’s, and LinkedIn’s.

To accommodate a variety of deployment environments, there are a number of management environments and shells to choose from, including:

The standard distribution.

Enthought’s EPD and Canopy, very mature and capable environments.

Continuum’s Anaconda, an environment that focuses on scientific computing.

Sage, a Matlab-like environment, includes an integrated development environment (IDE).

Python(x,y).

IPython, a Python interactive shell widely used by scientists and developers.

IPython Notebook, a browser-based IPython front end, is widely used for teaching and demonstrations.

bpython, another Python interactive shell.

One advantage of Python is that it can quickly prototype a new idea. With all the supporting libraries, it’s easy to test whether an idea works, even if the first implementation is a bumbling one.

If you want to make your math functions faster, take a look at Numpy. If you want to experiment with machine learning, try SciKit-Learn. Pandas is a good choice if you are cleaning and manipulating data.

Overall, it’s reasonable to ask: “Could optimizing our system to run faster actually make our team, as a whole, run slower in the long run?” A system can always be made to squeeze out more performance given enough effort, but this can lead to brittle, poorly understood optimizations that ultimately trip the whole team up.

One example is Cython (Section 7.6), which annotates Python code with C-like types so that it can be compiled with a C compiler. The speed gains are impressive (C-like speed for relatively little effort), but the cost of maintaining the resulting code rises. In particular, this new module may be harder to work with, because team members need a certain maturity in their programming to understand the trade-offs of leaving the Python virtual machine in exchange for performance.

This article is excerpted from High Performance Programming in Python




High Performance Programming in Python

By Micha Gorelick and Ian Ozsvald



A deep understanding of Python implementations will make your Python code run faster. It’s not enough for Python code to run correctly; you need to make it run faster. This book helps you gain a deeper understanding of Python implementations by exploring the underlying theory behind design decisions. You’ll learn how to find performance bottlenecks and how to significantly speed up code in applications with large data volumes. How can you take advantage of multi-core architecture or clustering? How do you build a system that scales without losing reliability? Experienced Python programmers will learn concrete solutions to these and other problems, as well as success stories from companies that use high-performance Python programming in social media analytics, productized machine learning, and other scenarios.

