A rant on sacrificing performance for productivity

Let me take a break from my discussion of asyncio, the Python standard-library module, to talk about something I’ve been thinking about lately: the speed of Python. For those of you who don’t know me, I’m a Python fan, and I actively use Python everywhere I can. One of the biggest complaints people have about Python is that it’s slow, and some people even refuse to try it for that reason. Here’s why I think you should try Python anyway, even though it is, yes, a bit slow.

Speed no longer matters

In the past, programs took a long time to run, CPUs were expensive, and memory was expensive. The running time of a program was an important metric. Computers themselves were very expensive, and so was the electricity to run them. Optimizing for those resources followed a timeless business principle:

Optimize your most expensive resources.

In the past, the most expensive resource was the computer’s running time. That is what led computer science to focus on the efficiency of different algorithms. However, it is no longer true, because silicon is now cheap. Really cheap. Runtime is no longer your most expensive resource. A company’s most expensive resource is now its employees’ time. Or, in other words, you. Getting things done is more important than getting them done fast. In fact, this is so important that I’ll put it right here again as a quotation (for those who are just skimming):

Getting things done is more important than doing things quickly.

You might say, “My company cares about speed. I’m developing a web application, so all response times must be under x milliseconds,” or, “We lost customers because they thought our app was too slow.” I’m not saying speed doesn’t matter at all; I’m just saying it’s no longer the most important thing. It’s no longer your most expensive resource.

Speed is the only thing that matters

When you talk about speed in a programming context, you usually mean performance, i.e. CPU cycles. When your CEO talks about speed, they mean business speed, and the most important metric is time to market. Ultimately, it doesn’t matter how fast your product or web application is. It doesn’t matter what language it’s written in. It doesn’t even matter how much it costs. At the end of the day, the one thing that makes or breaks your business is time to market. I’m not just talking about the startup notion of how long it takes to start making money, but about the “idea to customer” time frame. The only way a business can survive is to innovate faster than its competitors. It won’t matter how many great ideas you come up with if your competitors ship before you do. You have to be first to market, or at least keep pace. The moment you slow down, you lose.

The only way a business can survive is to innovate faster than your competitors.

A case study of microservices

Companies like Amazon, Google, and Netflix understand the importance of moving fast. They have built their businesses around moving and innovating quickly, and microservices are their solution to that problem. This article isn’t about whether you should use microservices, but at least accept that Amazon and Google think they should.

Microservices are inherently slow. Their core concept is to split boundaries apart with network calls. This means you are taking what was a function call (a couple of CPU cycles) and turning it into a network call. Few things are worse for performance: compared with the CPU, the network is really slow. Yet these big companies still choose microservices. I know of no architecture slower than microservices. Their biggest drawback is performance, but their biggest strength is time to market. By building teams around smaller projects and codebases, a company can iterate and innovate much faster. This just goes to show that very large companies, not only startups, care about time to market.

The CPU is not your bottleneck

If you’re writing a network application, such as a web server, chances are CPU time is not your application’s bottleneck. When your web server handles a request, it probably makes a few network calls, say to a database, or to a caching server like Redis. While those services themselves may be fast, the network calls to them are slow. There’s a great blog post on the speed differences of particular operations, in which the author scales CPU cycle times up to more relatable human times. If a single CPU cycle took one second, a network call from California to New York would take four years.

That is how slow network calls are. With some rough estimates, we can assume an ordinary network call within the same data center takes about 3 milliseconds. That’s about three months on our “human scale.” Now suppose your program is highly CPU-intensive and needs 100,000 CPU cycles to respond to a single call. That’s the equivalent of just over one day. Now suppose you use a language that is five times slower; that takes about five days. Well, next to the three-month network call, a four-day difference doesn’t seem very important. If someone has to wait at least three months for a package anyway, I doubt an extra four days really matters to them.
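Here is a quick sketch of that “human scale” arithmetic. The ~3 GHz clock speed and the 3 ms in-datacenter latency are assumptions for illustration, not measurements:

```python
# Human-scale latency: pretend one CPU cycle takes one second,
# then express real-world delays in days at that scale.
CYCLES_PER_SECOND = 3_000_000_000  # assume a ~3 GHz CPU

def human_scale_days(real_seconds):
    """Days elapsed at the 1-cycle-equals-1-second scale."""
    cycles = real_seconds * CYCLES_PER_SECOND
    return cycles / (60 * 60 * 24)

network_call = human_scale_days(0.003)   # ~3 ms call within a data center
cpu_work = 100_000 / (60 * 60 * 24)      # 100,000 cycles of CPU work, in "days"
slow_lang = cpu_work * 5                 # the same work in a 5x slower language

print(f"network call: {network_call:.0f} days")  # about 104 days (~3 months)
print(f"CPU work:     {cpu_work:.1f} days")      # about 1.2 days
print(f"5x slower:    {slow_lang:.1f} days")     # about 5.8 days
```

Next to a three-month network call, the gap between one day and six days of CPU time is lost in the noise.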

The bottom line is: yes, Python is slow, and it doesn’t matter. Language speed (or CPU time) is almost never the issue. Google actually did a study on exactly this concept and published a paper on it. The paper is about designing high-throughput systems, and in the conclusion the authors say:

The use of an interpreted language in a high-throughput environment may seem paradoxical, but we have found that the CPU time is rarely the limiting factor; the expressibility of the language means that most programs are small and spend most of their time in I/O and native run-time code. Moreover, the flexibility of an interpreted language has been helpful, both in ease of experimentation at the linguistic level and in allowing us to explore ways to distribute the calculation across many machines.

Again:

The CPU time is rarely the limiting factor.

What if CPU time is an issue?

You might say, “That’s all well and good, but we’ve had problems where the CPU was the bottleneck and our web application was slow,” or, “Language X needs less hardware to run on the server than language Y.” That may all be true. The wonderful thing about web servers is that you can load-balance them almost indefinitely. In other words: throw more hardware at it. Sure, Python may demand better hardware than another language such as C, but it’s fine to just throw hardware at the CPU problem. Hardware is very cheap compared to your time. If you save two weeks of productive time per year, that more than pays for the added hardware cost.

So, is Python faster to develop in?

Throughout this post I have argued that development time is what matters most. So the question remains: is Python faster than other languages in terms of development time? Anecdotally, I, Google, and several others can tell you how productive Python is. It abstracts a lot away for you, helps you focus on the code you actually need to write, and keeps you out of the weeds of trivia such as whether to use a vector or an array. But you may not want to take other people’s word for it, so let’s look at some more empirical data.

For the most part, the debate over whether Python is a more productive language comes down to scripting (or dynamically typed) languages versus statically typed ones. It is fairly well accepted that statically typed languages are less productive, and there is an excellent paper that explains why. As for Python specifically, here is a study that examined how long it took programmers in different languages to write code for string processing.

In that study, Python was up to twice as productive as Java, and other studies show similar results. Rosetta Code took an in-depth look at the differences between programming languages. In their paper, they compare Python to other scripting/interpreted languages and conclude:

Python is more compact, even compared to functional languages (on average 1.2 to 1.6 times shorter)

The general trend seems to be that Python always needs fewer lines of code. Lines of code may sound like a questionable metric, but multiple studies, including the two already mentioned, show that a line of code takes about the same time to write in every language. So fewer lines of code means higher productivity. Even codinghorror (a C# programmer) wrote an article about how Python can be more productive.

I think it is fair to say that Python is more productive than many other languages, mainly thanks to the huge number of built-in and third-party libraries available. Here is a light-hearted article discussing the differences between Python and other languages. If you are wondering why Python is so compact and productive, I invite you to take this opportunity to learn a little Python and try it yourself. Here is your first program:

    
    import __hello__

But what if speed really matters?

The tone of my argument so far may suggest that optimization and speed don’t matter at all. But the truth is, there are times when runtime performance really does matter. An example: you have a web application, and one particular endpoint takes far too long to respond. You know how fast it needs to be, and by how much it must improve.

In our example, two things happened:

  1. We noticed that a single endpoint was performing slowly.
  2. We recognized it as slow because we have a measure of what counts as fast enough, and the endpoint falls short of that measure.

We don’t have to fine-tune everything in the application; everything just needs to be “fast enough.” Your users might notice an endpoint that takes a couple of seconds to respond, but they won’t notice that you improved a response time from 35 milliseconds to 25 milliseconds. “Good enough” is all you need. Disclaimer: some applications, such as real-time bidding systems, really do need fine-tuning, and every millisecond counts. But that is the exception, not the rule.

To work out how to optimize the endpoint, your first step is to profile the code and try to find the bottleneck. After all:
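As a minimal sketch of that first step, Python’s built-in cProfile module can show where a request handler’s time actually goes. The handler and its helpers below are made-up stand-ins, not real application code:

```python
import cProfile
import io
import pstats

def fetch_from_db():
    # Stand-in for a slow database query.
    return sum(i * i for i in range(200_000))

def render_response(data):
    # Stand-in for cheap template rendering.
    return f"<p>{data}</p>"

def handle_request():
    return render_response(fetch_from_db())

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Report functions with the highest cumulative time first;
# the bottleneck (fetch_from_db here) floats to the top.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Only once a report like this names the bottleneck is it worth optimizing anything.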

Any improvements made anywhere besides the bottleneck are an illusion. — Gene Kim

If your optimizations don’t touch the bottleneck, you are wasting your time and not solving the real problem. Until the bottleneck is optimized, you won’t see any significant improvement. If you try to optimize before knowing where the bottleneck is, you will just end up tinkering with random parts of the code. Optimizing code before you have measured and identified the bottleneck is known as “premature optimization.” Donald Knuth is often credited with the following quote, though he claims he actually heard it from someone else:

Premature optimization is the root of all evil.

The more complete quote from Donald Knuth is:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. — Donald Knuth

In other words, he is saying that most of the time you should forget about optimizing your code. It is almost always good enough. And when it isn’t good enough, you typically only need to touch about 3% of the code path. You won’t win any prizes because your endpoint is a few nanoseconds faster, say because you used an if statement instead of a function.

Premature optimization includes things like calling a particular function, or even using a particular data structure, just because it is generally faster. Computer science considers one method or algorithm equivalent to another if they have the same asymptotic growth (big-O), even if one is twice as slow in practice. Computers are so fast that how an algorithm grows as data or usage increases matters far more than its raw speed. In other words, if you have two O(log n) functions but one is twice as slow, it doesn’t really matter. As the data size increases, they both “slow down” at the same rate. That is why premature optimization is the root of all evil: it wastes our time and almost never actually helps our overall performance.
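A toy illustration of that constant-factor point, counting steps instead of wall time (both functions are contrived for this example, not from any real codebase):

```python
def search_fast(items, target):
    """Binary search; one unit of work per halving: O(log n)."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

def search_slow(items, target):
    """Same algorithm, but twice the work per halving: still O(log n)."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 2  # a constant factor of 2, like a language twice as slow
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

# The ratio stays at exactly 2 no matter how large the input grows.
for n in (1_000, 1_000_000):
    items = list(range(n))
    fast, slow = search_fast(items, -1), search_slow(items, -1)
    print(n, fast, slow, slow / fast)
```

Doubling every step never changes the shape of the curve; swapping the O(log n) search for an O(n) scan would.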

In big-O terms, you could say that every language is O(n) for your program, where n is the number of lines of code or instructions. For the same instructions, they grow at the same rate. In terms of asymptotic growth, then, no language is faster than any other; they are all equal. By this logic, choosing a language for your application just because it is “fast” is the ultimate form of premature optimization: you pick something presumed fast without measuring, and without understanding where the bottleneck will be.

Choosing a language for your application simply because it is “fast” is the ultimate form of premature optimization.

Optimizing Python

One of my favorite things about Python is that it lets you optimize your code a little piece at a time. Say you have a Python method that you have found to be your bottleneck. You have optimized it several times, perhaps following advice from here and there, and now you suspect Python itself is the bottleneck. Python has the ability to call into C code, which means you can rewrite this one method in C to escape the performance problem, and you can rewrite methods this way one at a time, as needed. This process lets you write well-optimized bottleneck methods in any language that compiles down to C-compatible assembly, while staying in Python most of the time and dropping into lower-level languages only where necessary.
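As a minimal sketch of that escape hatch, the standard-library ctypes module can call an already-compiled C function directly, no extension module required. The libm lookup below assumes a typical Unix-like system:

```python
import ctypes
import ctypes.util

# Locate and load the system C math library.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double)
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # runs the compiled C implementation
```

In practice you would compile your own rewritten bottleneck method into a shared library and load it the same way.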

There is a language called Cython, which is a superset of Python: it is nearly a merger of Python and C, and it is a gradually typed language. Any Python code is valid Cython code, and Cython compiles down to C code. With Cython you can take a module or a method and progressively add C types and capabilities, mixing C types with Python’s duck typing. You get the perfect blend: optimize only at the bottleneck, and keep the beauty of Python everywhere else.
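A tiny sketch of what that gradual typing looks like (illustrative .pyx code; the function and file names are my own, and it needs the Cython compiler to build):

```cython
# fib.pyx: plain Python is already valid Cython; adding C types
# turns the hot loop into straight C arithmetic.
def fib(int n):              # C int parameter instead of a Python object
    cdef long a = 0, b = 1   # C-typed locals, no Python object overhead
    cdef int i
    for i in range(n):
        a, b = b, a + b
    return a                 # converted back to a Python int on return
```

Compiled with `cythonize fib.pyx`, the module imports like any other, so callers never know the difference.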

A screenshot from EVE Online, a space MMO written in Python.

When you do eventually run into Python performance problems, you don’t need to rewrite your whole codebase in a different language. You can almost always get the performance you need by rewriting a handful of functions in Cython. That is the strategy EVE Online takes: it is a massively multiplayer computer game that uses Python and Cython throughout its architecture, achieving game-level performance by optimizing the bottlenecks in C/Cython. If that strategy works for them, it can help anyone. There are other ways to optimize Python, too. For example, PyPy is a JIT-compiled implementation of Python; swapping it in for CPython (the default implementation) can bring significant runtime improvements to long-running applications such as web servers.

Let’s review the main points:

  • Optimize your most expensive resources. That’s you, not the computer.
  • Choose a language/framework/architecture to help you develop quickly (e.g. Python). Don’t choose technologies just because they are fast.
  • When you have a performance problem, find the bottleneck.
  • Your bottleneck is probably not the CPU or Python itself.
  • If Python becomes a bottleneck (and you have already optimized your algorithms), move the bottleneck code to Cython or C.
  • Enjoy the pleasure of getting things done quickly.

I hope you enjoyed reading this article as much as I enjoyed writing it. If you’d like to say thanks, give it a thumbs-up. And if you ever want to chat about Python, you can tweet at me on Twitter (@nhumrich) or find me on the Python Slack channel.


About the author:

Nick Humrich — an advocate of continuous delivery who has built many tools for it. He is also a Python hacker and technology enthusiast, currently working as a DevOps engineer.

(Photo: Pixabay, CC0)

Via: medium.com/hacker-dail…

Author: Nick Humrich. Translator: zhousiyu325. Proofreader: jasminepeng.

This article was translated by LCTT and first published at Linux China.