With the rise of machine learning, Python has arguably become the most popular programming language. It is easy to use, logical, and backed by an enormous ecosystem of packages, making it the language of choice not only for machine learning and data science but also for web development, web scraping, and scientific research. Many entry-level machine learning developers pick Python simply because everyone else does; why Python is actually worth choosing is the subject of this article.

The purpose of this tutorial is to convince you of two things: first, that Python is a great programming language, and second, that if you're a scientist, Python is probably worth learning. This tutorial is not meant to suggest that Python is a universal language; on the contrary, the author explicitly discusses several situations in which Python is not a wise choice. The goal is simply to comment on some of Python's core features and to illustrate its advantages as a general-purpose scientific computing language over the most commonly used alternatives, notably R and Matlab.

The rest of this tutorial assumes that you already have some programming experience; it should be very easy to follow if you are proficient in another data-centric language such as R or Matlab. This tutorial is not an introduction to Python, and it focuses on why you should learn Python rather than how to write Python code (though there are plenty of excellent tutorials for that elsewhere).

An overview of Python

Python is a widely used, easy-to-learn, high-level, general-purpose dynamic programming language. That's a mouthful, so let's look at those features one at a time.

Python is (relatively) easy to learn

Programming is hard, so in absolute terms programming languages are hard to learn unless you already have programming experience. However, Python's high-level nature (see the next section), readable syntax, and semantic straightforwardness make it relatively easier to learn than most other languages. For example, here is the definition (intentionally uncommented) of a simple Python function that converts a string of English words to (crummy) Pig Latin:

def pig_latin(text):
    ''' Takes in a sequence of words and converts it to (imperfect) pig latin. '''
    word_list = text.split(' ')
    output_list = []
    for word in word_list:
        word = word.lower()
        if word.isalpha():
            first_char = word[0]
            if first_char in 'aeiou':
                word = word + 'ay'
            else:
                word = word[1:] + first_char + 'yay'
            output_list.append(word)
    pygged = ' '.join(output_list)
    return pygged

The function above doesn't actually produce perfectly valid Pig Latin (assuming such a thing as "valid Pig Latin" exists), but that doesn't matter. It works, after a fashion:

test1 = pig_latin("let us see if this works")

print(test1)

Pig Latin aside, the point here is that the code is easy to read, for several reasons. First, the code is written at a high level of abstraction (more on this below), so each line maps onto a fairly intuitive operation such as "take the first character of this word", rather than onto a much less intuitive low-level operation like "reserve one byte of memory for a character that I will pass in later". Second, the control structures (for-loops, if-then conditionals, and so on) use simple words such as 'in', 'and', and 'not', whose meanings are relatively close to their natural English senses. Third, Python's strict enforcement of indentation imposes a discipline that keeps code readable while preventing certain common errors. Fourth, the Python community places a strong emphasis on following style conventions and writing "Pythonic" code, which means Python programmers tend to use consistent naming conventions, line lengths, idioms, and many other similar features more than programmers in most other languages. Together, these make other people's code far easier to read (though this last one is arguably a feature of the community rather than of the language itself).

Python is a high-level language

Python is a relatively "high-level" language compared to many others: it doesn't require (and in many cases doesn't allow) users to worry about many of the low-level details that other languages force them to deal with. For example, suppose we want to create a variable named "my_box_of_things" as a container for some things we use. We don't know in advance how many objects we'll want to keep in the box, and we want the number of objects to grow or shrink automatically as we add or remove them. So the box needs to occupy a variable amount of space: at one point in time it might hold eight objects (or "elements"), and at another, 257. In a lower-level language such as C, this seemingly simple request already introduces some complexity into our program, because we have to declare in advance how much space the box needs, and then every time we want it to hold more than it currently can, we have to explicitly create a new box with more room and copy everything into it.

In Python, by contrast, we don't need to worry about any of this when writing at a high level, even though more or less the same processes happen (less efficiently) under the hood. From our point of view, we just create a box and add or remove objects as we please:

# Create a box (really, a 'list') with 5 things
my_box_of_things = ['Davenport', 'kettle drum', 'swallow-tail coat', 'table cloth', 'patent leather shoes']

print(my_box_of_things)

['Davenport', 'kettle drum', 'swallow-tail coat', 'table cloth', 'patent leather shoes']

# Add a few more things
my_box_of_things += ['bathing suit', 'bowling ball', 'clarinet', 'ring']

# Maybe add one last thing
my_box_of_things.append('radio that only needs a fuse')

# Let's see what we have...
print(my_box_of_things)

More generally, Python (like all high-level languages, by definition) tends to hide the various rote declarations that have to be spelled out explicitly in lower-level languages. This lets us write very compact, clean code (though usually at some cost in performance, since the internals are no longer accessible and optimization becomes harder).

For example, consider the deceptively simple act of reading plain text from a file. Conceptually, to a developer who has been spared direct contact with the file system, this seems to involve just two simple steps: open a file, then read from it. The actual process involves much more than that, and languages lower-level than Python often force (or at least encourage) us to acknowledge this. For example, here's the canonical (though certainly not the cleanest) way to read from a file in Java:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile {

    public static void main(String[] args) throws IOException {
        String fileContents = readEntireFile("./foo.txt");
    }

    private static String readEntireFile(String filename) throws IOException {
        FileReader in = new FileReader(filename);
        StringBuilder contents = new StringBuilder();
        char[] buffer = new char[4096];
        int read = 0;
        do {
            contents.append(buffer, 0, read);
            read = in.read(buffer);
        } while (read >= 0);
        return contents.toString();
    }
}

Notice all the painful things we have to do: import file readers, create a buffer for the file's contents, read the file chunk by chunk and append each chunk to the buffer, and so on. In Python, by contrast, reading the entire contents of a file requires only this:

# Read the contents of "hello_world.txt"
text = open("hello_world.txt").read()
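Incidentally, in modern Python it's considered idiomatic to open files via a context manager, which closes the file automatically. A minimal sketch (the file is created here just so the example is self-contained; "hello_world.txt" is a placeholder name):

```python
# Write a small file, then read it back, using context managers
with open("hello_world.txt", "w") as f:
    f.write("hello world")

# The file is closed automatically when the with-block exits
with open("hello_world.txt") as f:
    text = f.read()

print(text)
```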

Of course, this brevity is not unique to Python; plenty of other high-level languages similarly hide most of the nasty internals implied by simple requests (Ruby, R, Haskell, etc.). However, relatively few of them can match the Python features discussed next.

Python is a general purpose language

By design, Python is a general-purpose language. That is, it is designed to let programmers write almost any type of application in any domain, rather than focusing on a specific class of problems. In this respect, Python can be contrasted with (relatively) domain-specific languages such as R or PHP, which can in principle be used in many situations but are explicitly optimized for particular use cases (statistics and back-end web development, respectively).

Python is often affectionately described as "the second-best language for everything", which captures the sentiment nicely: even though Python is not the best language for many specific problems, it usually has enough flexibility and good-enough support that people can still solve those problems relatively effectively. The fact that Python can be used effectively for so many different applications also makes learning it a valuable investment: as a software developer, it's great to be able to do everything in a single language rather than having to switch between languages and environments depending on the project at hand.

The standard library

Perhaps the easiest way to appreciate Python's versatility is to browse the long list of modules in the standard library, the toolset that ships with the Python interpreter itself (no third-party packages required). Consider a few examples:

os: system tools
re: regular expressions
collections: useful data structures
multiprocessing: simple parallelization
pickle: simple serialization
json: reading and writing JSON
argparse: command-line argument parsing
functools: functional programming tools
datetime: date and time functions
cProfile: basic tools for profiling code

This list may not look impressive at first glance, but leaning on these modules is a fairly common experience for Python developers: often, when we Google a seemingly important or even esoteric problem, we discover a built-in solution hidden away in a standard-library module.
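To give a quick taste of two of the modules listed above, here is a small sketch (the word-counting string and the fib function are invented for illustration): collections.Counter tallies items in one line, and functools.lru_cache memoizes a recursive function with a single decorator.

```python
from collections import Counter
from functools import lru_cache

# Count word frequencies in one line
counts = Counter("the cat sat on the mat".split())
print(counts.most_common(1))

# Memoize an exponential-time recursion with one decorator
@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))
```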

JSON, the easy way

For example, suppose you want to read in some JSON data from the web, like the following:

data_string = '''
[
  {
    "_id": "59ad8f86450c9ec2a4760fae",
    "name": "Dyer Kirby",
    "registered": "2016-11-28T03:41:29 +08:00",
    "latitude": -67.170365,
    "longitude": 130.932548,
    "favoriteFruit": "durian"
  },
  {
    "_id": "59ad8f8670df8b164021818d",
    "name": "Kelly Dean",
    "registered": "2016-12-01T09:39:35 +08:00",
    "latitude": -82.227537,
    "longitude": -175.053135,
    "favoriteFruit": "durian"
  }
]
'''

We can spend some time writing json parsers ourselves, or try to find a third-party package that reads JSON efficiently. But we’re probably wasting our time, because Python’s built-in JSON module already does what we need:

import json

data = json.loads(data_string)
print(data)

'''
[{'_id': '59ad8f86450c9ec2a4760fae', 'name': 'Dyer Kirby', 'registered': '2016-11-28T03:41:29 +08:00', 'latitude': -67.170365, 'longitude': 130.932548, 'favoriteFruit': 'durian'},
 {'_id': '59ad8f8670df8b164021818d', 'name': 'Kelly Dean', 'registered': '2016-12-01T09:39:35 +08:00', 'latitude': -82.227537, 'longitude': -175.053135, 'favoriteFruit': 'durian'}]
'''

Note that before we can use loads, we have to import the json module. This pattern of explicitly importing almost all functionality into the namespace is important in Python: the list of built-in functions available in the base namespace is deliberately very small. Many developers coming from R or Matlab find this annoying at first, because the global namespaces of those environments contain hundreds or even thousands of built-in functions. But once you get used to typing a few extra characters, it makes code easier to read and manage, and greatly reduces the risk of naming conflicts (which are common in R).
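The json module also handles the reverse direction: json.dumps serializes Python objects back to JSON text, the inverse of loads. A minimal sketch (the record here is a made-up fragment echoing the data above):

```python
import json

# A hypothetical record, echoing the shape of the data above
record = {"name": "Dyer Kirby", "favoriteFruit": "durian"}

# dumps is the inverse of loads: Python object -> JSON string
encoded = json.dumps(record, indent=2, sort_keys=True)
decoded = json.loads(encoded)

print(encoded)
```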

Excellent external support

Of course, just because Python provides a lot of built-in tools doesn't mean you always need to use them. Arguably a bigger selling point than Python's rich standard library is its large developer community. Python has for many years been among the most popular dynamic programming languages in the world, and its community has contributed a huge number of high-quality packages.

The following Python packages provide widely used solutions in different areas (a list that may well be out of date by the time you read this!):

Web and API development: Flask, Django, Falcon, Hug
Scraping data and parsing text/markup: Requests, BeautifulSoup, Scrapy
Natural language processing (NLP): NLTK, Gensim, TextBlob
Numerical computation and data analysis: NumPy, SciPy, pandas, xarray
Machine learning: scikit-learn, Theano, TensorFlow, Keras
Image processing: Pillow, scikit-image, OpenCV
Plotting: Matplotlib, Seaborn, ggplot, Bokeh, and others

One of Python's strengths is its excellent package management ecosystem. Installing packages in Python is admittedly often harder than in R or Matlab, mainly because Python packages tend to be highly modular and/or to depend more heavily on system libraries. But at least in principle, most Python packages can be installed from the command prompt using the pip package manager, and more sophisticated distributions and package managers such as Anaconda further reduce the pain of configuring a new Python environment.
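Because environments differ in which packages are installed, a common defensive pattern is to check whether a package is importable before relying on it. A minimal sketch (the helper name has_package is invented for illustration):

```python
import importlib.util

def has_package(name):
    """Return True if `name` is importable in the current environment."""
    return importlib.util.find_spec(name) is not None

print(has_package("json"))                    # part of the standard library
print(has_package("some_package_i_made_up"))  # presumably not installed
```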

Python is a (relatively) fast language

This may come as a surprise: on the face of it, the claim that Python is a fast language sounds silly, because Python tends to lag well behind compiled languages like C or Java in standard benchmarks. There's no doubt that if speed is of the essence (say, you're writing a 3D graphics engine or running large-scale fluid dynamics simulations), Python may not be your best, or even second-best, choice. In practice, however, the limiting factor in many scientists' workflows is not runtime but development time. A script that takes an hour to run but only five minutes to write is usually preferable to one that runs in five seconds but takes a week to write and debug. Moreover, as we'll see below, even when our code starts out as pure Python, a few optimizations can often make it run almost as fast as a C-based solution. In practice, the cases where a Python-based solution isn't fast enough for most scientists are rare, and they are becoming rarer as the tooling improves.

Don’t do work twice

A general rule of software development is that you should avoid reinventing the wheel whenever possible. Sometimes it's unavoidable, of course, and in many cases it makes sense to write your own solution to a problem or to create an entirely new tool. But in general, the less Python code you write yourself, the better your performance is likely to be. There are several reasons for this:

First, Python is a mature language, so many existing packages have large user bases and have been heavily optimized; this is true of most of Python's core scientific libraries (NumPy, SciPy, pandas, and so on). Second, much of this code is actually written in C rather than Python: across large parts of the standard library, when you call a Python function there's a good chance you're really running C code behind a Python interface. This means that no matter how clever your algorithm is, if you implement it entirely in Python and a built-in alternative is written in C, the built-in solution will probably win. For example, here's the built-in sum function (written in C) at work:

# Create a list of random floats
import random
my_list = [random.random() for i in range(10000)]

# Python's built-in sum() function is pretty fast
%timeit sum(my_list)

47.7 µs ± 4.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Algorithmically, there's not much you can do to speed up the summation of an arbitrary list of values. So you might think, what the heck: maybe you could write your own summation in Python and beat the built-in sum by skipping whatever internal validation it does. Hmm... not so much:

def ill_write_my_own_sum_thank_you_very_much(l):
    s = 0
    for elem in l:
        s += elem
    return s

%timeit ill_write_my_own_sum_thank_you_very_much(my_list)

331 µs ± 50.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

At least in this case, rolling your own simple replacement is clearly not a good solution. But that doesn't mean the built-in sum function is Python's performance ceiling! Because core Python is not optimized for numerical operations on large inputs, the built-in methods are actually suboptimal for summing large lists. What we should do in such cases is ask: "is there some other Python library available for numerical analysis on potentially large inputs?" As you might expect, the answer is yes: the NumPy package is a central component of Python's scientific ecosystem (the vast majority of scientific computing packages in Python are built on top of NumPy in some way), and it contains all sorts of computational functions that can help us.

In this case, the new solution is very simple: if we convert our pure Python list to a NumPy array, we can immediately call NumPy's sum method, which we might reasonably expect to be faster than the core Python implementation. (Technically, we could pass a Python list straight into numpy.sum and it would implicitly convert it to an array, but if we're going to reuse the NumPy array, it's better to convert explicitly.)

import numpy as np

my_arr = np.array(my_list)

%timeit np.sum(my_arr)

7.92 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So simply switching to NumPy speeds up list sums by an order of magnitude without having to implement anything yourself.

Need more speed?

Of course, sometimes even with all the C-based extensions and highly optimized implementations, your existing Python code just won't run fast enough. In that case, your knee-jerk reaction might be to give up and switch to a "real" language, and often that's a perfectly reasonable instinct. But before you start porting code to C or Java, it's worth considering some less drastic options.

Write C code in Python

First, you can try writing Cython code. Cython is a superset of Python that allows you to embed (some) C code directly in Python code. Cython code is not interpreted at runtime; instead, your Python file (or specific parts of it) is compiled to C before it runs. The practical result is that you can keep writing code that looks almost exactly like Python and still get performance boosts from judicious injections of C. In particular, simply adding C type declarations can often improve performance considerably.

Here is the Cython version of our simple addition code:

# Jupyter extension that allows us to run Cython cell magics
%load_ext Cython

%%cython
def ill_write_my_own_cython_sum_thank_you_very_much(list arr):
    cdef int N = len(arr)
    cdef float x = arr[0]
    cdef int i
    for i in range(1, N):
        x += arr[i]
    return x

%timeit ill_write_my_own_cython_sum_thank_you_very_much(my_list)

227 µs ± 48.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

There are a few things to note about the Cython version. First, the first time you execute the cell that defines the function, a small (but noticeable) amount of time is spent compiling. That's because, unlike pure Python, the code isn't interpreted line by line as it executes; instead, the Cython function must be compiled to C before it can be called.

Second, while the Cython summation function is faster than the naive Python summation we wrote above, it's still much slower than the built-in sum and the NumPy implementation. However, this says more about our particular implementation and problem than about Cython's general benefits; in many cases, a good Cython implementation can easily cut running time by an order of magnitude or two.

Use Numba to speed things up

Cython isn't the only way to improve performance from within Python. Another approach, simpler from a development standpoint, is to rely on just-in-time (JIT) compilation, in which a piece of Python code is compiled into optimized machine code the first time it is called. Python JIT compilers have made great progress in recent years. Perhaps the most mature implementation is found in the numba package, which provides a simple jit decorator that can easily be combined with any other approach.

Our earlier example doesn't really show off how much impact a JIT can have, so let's move to a slightly more complex problem. Here we define a new function, multiply_randomly, that takes a one-dimensional array of floats as input, multiplies each element in the array by another randomly selected element, and then returns the sum of all the randomly multiplied elements.

Let's start with a naive implementation that doesn't even use vectorization for the random multiplication. Instead, we simply loop over every element in the array, pick another element at random, multiply the two, and assign the result to a specific index. Benchmarking this function shows that it runs rather slowly:

import numpy as np

def multiply_randomly_naive(l):
    n = l.shape[0]
    result = np.zeros(shape=n)
    for i in range(n):
        ind = np.random.randint(0, n)
        result[i] = l[i] * l[ind]
    return np.sum(result)

%timeit multiply_randomly_naive(my_arr)

25.7 ms ± 4.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Before reaching for a JIT compiler, we should first ask whether the function above could be written in a more NumPy-like way. NumPy is optimized for array-based operations, so explicit loops should generally be avoided: they can be very slow. Fortunately, our function vectorizes easily (and becomes more readable in the bargain):

def multiply_randomly_vectorized(l):
    n = len(l)
    inds = np.random.randint(0, n, size=n)
    result = l * l[inds]
    return np.sum(result)

%timeit multiply_randomly_vectorized(my_arr)

234 µs ± 50.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

On the author's machine, the vectorized version runs about 100 times faster than the looping version. Performance differences of this magnitude between loops and array operations are typical in NumPy, which is why it's important to think about what you're doing algorithmically.

Now suppose that, instead of spending time refactoring our naive, slow implementation, we simply add a decorator telling the numba library that we want the function compiled the first time it's called. Literally the only difference between the function multiply_randomly_naive_jit below and the multiply_randomly_naive function defined above is the @jit decorator. Surely four little characters can't make that much of a difference. Can they?

import numpy as np
from numba import jit

@jit
def multiply_randomly_naive_jit(l):
    n = l.shape[0]
    result = np.zeros(shape=n)
    for i in range(n):
        ind = np.random.randint(0, n)
        result[i] = l[i] * l[ind]
    return np.sum(result)

%timeit multiply_randomly_naive_jit(my_arr)

135 µs ± 22.4 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

Surprisingly, the JIT-compiled version of the naive function actually runs faster than the vectorized version.

Interestingly, applying the @jit decorator to the vectorized version of the function (left as an exercise for the reader) doesn't help much: after the Numba JIT compiler is applied, both versions of the Python implementation run at about the same speed. So, at least in this case, just-in-time compilation not only gives us C-like speed with almost no effort, it also spares us from having to optimize the code in NumPy's idiom at all.

We probably shouldn't draw too strong a conclusion from this example, since (a) Numba's JIT compiler currently covers only a subset of NumPy's features, and (b) there's no guarantee that compiled code will always beat interpreted code (though that's usually a safe assumption). The real point of the example is to remind you that, before you conclude Python is too slow for what you want to do, you have a lot of options available within Python itself. It's also worth noting that none of these performance features, such as C integration and just-in-time compilation, is unique to Python: recent versions of Matlab use JIT compilation automatically, and R supports JIT compilation (through external libraries) as well as C++ integration (via Rcpp).

Python is object-oriented by nature

Even if all you ever do is write short scripts to parse text or crunch some data, many of Python's benefits are easy to appreciate. But one of Python's best features may not become obvious until you start writing relatively large pieces of code: Python has a very elegantly designed object-based data model. In fact, if you look under the hood, you'll find that everything in Python is an object. Even functions are objects: when you call a function, you are really invoking the __call__ method that every callable object in Python implements:

def double(x):
    return x*2

# Lists all object attributes
dir(double)

['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__kwdefaults__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']
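Notice __call__ in the list above. To make the claim concrete, here is a quick check (the double function just mirrors the one defined above):

```python
def double(x):
    return x * 2

# Calling a function is syntactic sugar for invoking its __call__ method
assert double(21) == double.__call__(21) == 42
print(double.__name__)
```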

In fact, because everything in Python is an object, everything follows the same core logic, implements the same basic APIs, and can be extended in similar ways. The object model also happens to be very flexible: you can easily define new objects that do interesting things while still behaving relatively predictably. It's perhaps no surprise, then, that Python is also an excellent choice for writing domain-specific languages (DSLs), since it lets users extensively overload and redefine existing functionality.

Magic methods

A core piece of the Python object model is its use of "magic" methods: special methods implemented on objects that can change the behavior of Python objects, often in important ways. Magic method names typically begin and end with a double underscore, and in general you shouldn't tamper with them unless you know what you're doing. But once you do start changing them, you can do some pretty remarkable things.

As a simple example, let's define a new Brain object. To begin with, the Brain won't do anything; it will just sit there in a polite daze.

class Brain(object):

    def __init__(self, owner, age, status):

        self.owner = owner
        self.age = age
        self.status = status

    def __getattr__(self, attr):
        if attr.startswith('get_'):
            attr_name = attr.split('_')[1]
            if hasattr(self, attr_name):
                return lambda: getattr(self, attr_name)
        raise AttributeError

In Python, the __init__ method is an object's initialization method: it's called whenever we create a new instance of Brain. You usually implement __init__ yourself when writing a new class, so if you've seen Python code before, __init__ probably looks familiar, and we won't dwell on it here.

By contrast, most users rarely implement the __getattr__ method explicitly, yet it controls a very important part of an object's behavior. Specifically, __getattr__ is called whenever the user tries to access, via dot syntax (e.g., brain.owner), a class attribute that doesn't actually exist. Its default action is simply to raise an error:

# Create a new Brain instance
brain = Brain(owner="Sue", age="62", status="hanging out in a jar")

print(brain.owner)

Sue

print(brain.gender)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-136-52813a6b3567> in <module>()
----> 1 print(brain.gender)

<ipython-input-133-afe64c3e086d> in __getattr__(self, attr)
     12             if hasattr(self, attr_name):
     13                 return lambda: getattr(self, attr_name)
---> 14         raise AttributeError

AttributeError:

The important thing is that we don't have to settle for this behavior. Suppose we wanted to create an alternative interface for retrieving data from the Brain class via getter methods prefixed with "get" (a common practice in many other languages). We could, of course, implement the getters explicitly by name (get_owner, get_age, and so on). But suppose we're lazy and don't want to write an explicit getter for every attribute. Moreover, we might want to add new attributes to Brain objects we've already created (e.g., brain.foo = 4) without having to create getters for those unforeseen attributes in advance. (Note that in the real world these would be terrible reasons for doing what we're about to do; we do it here purely for illustration.) What we can do is change the Brain class's behavior by instructing it what to do whenever a user requests an undefined attribute.

In the code snippet above, our __getattr__ implementation first checks the name of the requested attribute. If the name begins with get_, we check whether an attribute with the remainder of the name exists on the object. If it does, we return a function that retrieves it; otherwise, we fall back on the default behavior of raising an AttributeError. This lets us do things that seem slightly crazy, like:

print(brain.get_owner())

Other magic methods let you dynamically control various other aspects of an object's behavior in ways that aren't possible in many other languages. In fact, because everything in Python is an object, even the mathematical operators are secretly method calls on objects. For example, when you write the expression 4 + 5 in Python, you're actually calling __add__ on the integer object 4, with 5 as its argument. If we want to (and we should exercise this right carefully!), we can create new domain-specific "mini-languages" that inject entirely new semantics into ordinary operators.
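A two-line check makes the operator claim concrete, followed by a tiny sketch of the overloading idea (the Interval class and its elementwise-addition semantics are invented purely for illustration):

```python
# The + operator on integers is sugar for int.__add__
assert (4).__add__(5) == 4 + 5 == 9

# The same protocol lets our own classes decide what '+' means.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # Invented semantics: elementwise interval addition
        return Interval(self.lo + other.lo, self.hi + other.hi)

result = Interval(0, 1) + Interval(2, 3)
print((result.lo, result.hi))
```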

As a simple example, let's implement a new class representing a single Nifti image volume. We'll rely on inheritance for most of the work, simply subclassing the Nifti1Image class from the nibabel package. All we have to add are __and__ and __or__ methods, which map onto the & and | operators respectively. See if you can work out what this code does before executing it (you may need to install packages such as nibabel and nilearn):

from nibabel import Nifti1Image
from nilearn.image import new_img_like
from nilearn.plotting import plot_stat_map
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

class LazyMask(Nifti1Image):
    ''' A wrapper for the Nifti1Image class that overloads the & and |
    operators to do logical conjunction and disjunction on the image data. '''

    def __and__(self, other):
        if self.shape != other.shape:
            raise ValueError("Mismatch in image dimensions: %s vs. %s" % (self.shape, other.shape))
        data = np.logical_and(self.get_data(), other.get_data())
        return new_img_like(self, data, self.affine)

    def __or__(self, other):
        if self.shape != other.shape:
            raise ValueError("Mismatch in image dimensions: %s vs. %s" % (self.shape, other.shape))
        data = np.logical_or(self.get_data(), other.get_data())
        return new_img_like(self, data, self.affine)

img1 = LazyMask.load('image1.nii.gz')
img2 = LazyMask.load('image2.nii.gz')
result = img1 & img2

fig, axes = plt.subplots(3, 1, figsize=(15, 6))
p = plot_stat_map(img1, cut_coords=12, display_mode='z', title='Image 1', axes=axes[0], vmax=3)
plot_stat_map(img2, cut_coords=p.cut_coords, display_mode='z', title='Image 2', axes=axes[1], vmax=3)
p = plot_stat_map(result, cut_coords=p.cut_coords, display_mode='z', title='Result', axes=axes[2], vmax=3)

The Python community

The last Python feature I'll mention here is its excellent community. Of course, every major programming language has a large community devoted to developing, applying, and promoting it; what differs is who makes up that community. In general, the community around a programming language reflects its users' interests and expertise. For relatively domain-specific languages like R and Matlab, this means a large proportion of the people contributing new tools are not software developers by training, but statisticians, engineers, scientists, and so on. There's nothing wrong with statisticians and engineers, of course; one advantage of an ecosystem with more statisticians than usual, like R's, is that R offers an unmatched range of statistical packages.

The disadvantage of a community dominated by users with statistical or scientific backgrounds, however, is that those users are often not trained in software development, so the code they write tends to be of lower quality (from a software engineering perspective). Best practices and habits routinely adopted by professional software engineers are less prevalent in such communities. For example, many of the R packages on CRAN lack anything resembling automated testing, something that is nearly unheard of in Python for all but the smallest packages. Stylistically, too, R and Matlab programmers tend to write code that is less consistent from person to person. The result is that, other things being equal, software written in Python tends to be more robust than code written in R. While this advantage admittedly has little to do with the intrinsic properties of the language itself (one can write extremely high-quality code in any language, including R and Matlab), it remains true that a developer community which emphasizes common conventions and best practices tends to produce clearer, more disciplined, higher-quality code.
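As a minimal illustration of that testing culture, a typical Python package ships test functions like the following; in a real project they would live in test_*.py files and be run by a test runner such as pytest (the function under test here is an invented toy, echoing the Pig Latin example from earlier):

```python
# A toy function under test (invented for illustration)
def pig_latin_word(word):
    if word[0] in 'aeiou':
        return word + 'ay'
    return word[1:] + word[0] + 'yay'

def test_vowel_word():
    assert pig_latin_word('apple') == 'appleay'

def test_consonant_word():
    assert pig_latin_word('python') == 'ythonpyay'

# pytest would discover and run these automatically; here we call them directly
test_vowel_word()
test_consonant_word()
print("all tests passed")
```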

conclusion

Python is awesome.

This article is from the cloud community partner “CDA Data Analyst”. For relevant information, you can pay attention to “CDA Data Analyst”.