- Which Should You Use: Asynchronous Programming or Multithreading?
- By Patrick Collins
- The Nuggets translation Project
- Permanent link to this article: github.com/xitu/gold-m…
- Translator: chaingangway
- Proofreader: QinRoc, PingHGao
Asynchronous programming or multithreading, which solution should I choose?
In software engineering, the two concepts are often confused. They are both solutions for concurrency, but they are different technologies, and they are used in different ways and scenarios.
The simple explanation for the difference is that multithreading is about workers, while asynchronous programming is about tasks. Let’s take a closer look.
Let’s say I’m making a breakfast of eggs on toast. How do we do it?
Synchronous scheme
The easiest way to do this is in sequence:
1. Take out the eggs, bread and pan, and turn on the stove.
2. Crack the eggs and pour them into the pan.
3. Wait for the eggs to fry.
4. Remove the eggs and add seasoning.
5. Place the bread in the toaster.
6. Wait for the toaster to finish.
7. Take out the toast.
Total time to make breakfast: 15 minutes.
Easy, right? If we think of this cooking as analogous to program execution, it’s a synchronous way of cooking breakfast.
One person does the whole series of tasks (serially). We do each step in order and can’t proceed to the next step until the task in progress is complete. More strictly, each task is blocked by the previous one, and all tasks are completed by a single unit of work. In this example, we are the computer’s CPU: every task is performed by one person (one CPU).
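As a minimal sketch of the synchronous breakfast, the waits can be simulated with `time.sleep`. The function names and the durations (scaled down from minutes to fractions of a second) are assumptions made for illustration, not part of the original recipe.

```python
import time

# Hypothetical durations in seconds, scaled down so the demo runs quickly.
FRY_SECONDS = 0.15
TOAST_SECONDS = 0.15

def fry_eggs():
    time.sleep(FRY_SECONDS)    # blocks: nothing else happens while we wait
    return "eggs"

def toast_bread():
    time.sleep(TOAST_SECONDS)  # blocks in the same way
    return "toast"

def make_breakfast_sync():
    start = time.perf_counter()
    # The toast is started only after the eggs are completely done.
    plate = [fry_eggs(), toast_bread()]
    elapsed = time.perf_counter() - start
    return plate, elapsed

plate, elapsed = make_breakfast_sync()
print(plate, f"{elapsed:.2f}s")
```

The elapsed time is roughly the sum of the two waits, because each step blocks the next.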
Okay, we’ve got the eggs. But what if we want to make our mornings more productive? You might say, “I don’t have to wait for my eggs to finish frying before I start the toast.”
You are now thinking like an engineer. Let’s make breakfast again, but this time we’ll fry the eggs and toast the bread at the same time.
Asynchronous scheme
We can still have one person do everything, but while waiting for one task to complete, we can start another. The steps are as follows:
- Take out the eggs, bread and pan, and turn on the stove.
- Crack the eggs and pour them into the pan.
- Wait for the eggs to fry.
- While waiting, put the bread in the toaster.
- Wait for the toaster to finish.
- Once the eggs are fried, remove them and add the seasoning.
- When the bread is toasted, take out the toast.
Total time to make breakfast: 8 minutes.
Looks like everything’s the same, right? But one step changed in an important way: we toast the bread while we wait for the eggs to cook, not after they are done. We still need only one worker to do all the tasks, but now the tasks are asynchronous. The gain may look modest with just two tasks, but imagine how inefficient it would be to perform one task at a time with thousands of eggs and slices of bread, thousands of frying pans and thousands of toasters!
This is one of the biggest advantages of asynchronous programming. Many times, we have to wait for an action we cannot control to complete. Here, frying the eggs and toasting the bread are those waits. We could be the most efficient, best chef in the world, but most of the time we would still just sit and wait for the eggs to fry. Waiting for these things to cook is similar to waiting for input/output (I/O) operations, and if a task spends much of its time waiting on I/O, asynchronous programming is a good solution.
If we call an API that has to get input from the user, no matter how many processors or fast computers we have, we have to wait. We have to wait for the API call to complete and for the user to enter information. This process is out of our control, and it doesn’t change as processors get faster and faster or dedicated resources are allocated.
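The one-cook asynchronous breakfast can be sketched with `asyncio`, using `asyncio.sleep` as a stand-in for a wait we cannot speed up. The coroutine names and durations are hypothetical, chosen only for illustration.

```python
import asyncio
import time

# Hypothetical durations in seconds, scaled down so the demo runs quickly.
FRY_SECONDS = 0.3
TOAST_SECONDS = 0.3

async def fry_eggs():
    await asyncio.sleep(FRY_SECONDS)    # hands control back while waiting
    return "eggs"

async def toast_bread():
    await asyncio.sleep(TOAST_SECONDS)  # hands control back while waiting
    return "toast"

async def make_breakfast_async():
    start = time.perf_counter()
    # One worker (a single thread) starts both waits and collects both results.
    plate = await asyncio.gather(fry_eggs(), toast_bread())
    elapsed = time.perf_counter() - start
    return plate, elapsed

plate, elapsed = asyncio.run(make_breakfast_async())
print(plate, f"{elapsed:.2f}s")
```

Because the two waits overlap, the elapsed time is close to the longer wait rather than the sum of both, even though only one thread is doing the work.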
So now we have two ways of making eggs. But let’s say our roommate, Kevin, wants to help with the eggs. What plan shall we adopt this time?
Multithreaded scheme
1. Take out the eggs, bread and pan, and turn on the stove.
2. Crack the eggs and pour them into the pan. Kevin puts the bread into the toaster.
3. Wait for the eggs to fry. Kevin waits for the bread to finish toasting.
4. Remove the eggs and add seasoning. Kevin takes out the toast.
Total time to make breakfast: 8 minutes.
We have two people making breakfast, and each person is assigned their own sequence of tasks. This is an example of synchronous multithreading, because neither you nor Kevin is handling more than one task (including waiting) at a time.
In computer science, a process can have one or more threads. In this case, we have two threads (people). You can learn more about what threads are here.
You might say, “Well, I can only cook breakfast with two people if a second person is around.” The same goes for computers: you can use multithreading only if your computer has enough resources. It is usually used for complex tasks that need more resources.
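A rough sketch of the two-person breakfast, assuming a thread pool with two workers stands in for you and Kevin. Each worker runs its own blocking task from start to finish; the function names and durations are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical durations in seconds, scaled down so the demo runs quickly.
FRY_SECONDS = 0.3
TOAST_SECONDS = 0.3

def fry_eggs():
    time.sleep(FRY_SECONDS)    # this worker blocks until the eggs are done
    return "eggs"

def toast_bread():
    time.sleep(TOAST_SECONDS)  # this worker blocks until the toast is done
    return "toast"

def make_breakfast_threaded():
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=2) as pool:
        eggs = pool.submit(fry_eggs)      # "you"
        toast = pool.submit(toast_bread)  # "Kevin"
        plate = [eggs.result(), toast.result()]
    elapsed = time.perf_counter() - start
    return plate, elapsed

plate, elapsed = make_breakfast_threaded()
print(plate, f"{elapsed:.2f}s")
```

Each thread is still synchronous on its own; the time saving comes from having two workers rather than from one worker switching between tasks.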
Which is better, multithreading or asynchronous programming? That depends on a number of factors, and once you understand how each works, you should be able to tell which one to use. If you have a lot of I/O operations, you probably want asynchronous programming. If you have intensive computations, multithreading may be the better fit. It is often easier to write multithreaded programs than asynchronous ones, but keep in mind that the difficulty depends on the programming language. Can we build a multithreaded asynchronous system? Of course we can! The two can be used together.
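One way the two can be combined in Python, sketched under assumptions: an asynchronous coordinator hands blocking calls to a thread pool with `loop.run_in_executor`. The `blocking_lookup` function is a hypothetical stand-in for any blocking library call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_lookup(symbol):
    """Hypothetical stand-in for a blocking call that has no async version."""
    time.sleep(0.1)
    return f"{symbol}: done"

async def lookup_all(symbols):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each blocking call runs in a worker thread; the event loop stays
        # free to await all of them together.
        futures = [loop.run_in_executor(pool, blocking_lookup, s) for s in symbols]
        return await asyncio.gather(*futures)

results = asyncio.run(lookup_all(["AAPL", "GOOG"]))
print(results)
```

The threads do the blocking work while the asynchronous code decides what to wait for, so existing synchronous code can be reused inside an asynchronous program.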
The Python example
Let’s look at how the three examples above (single-threaded synchronous, single-threaded asynchronous, and multithreaded synchronous) are implemented in Python.
We use different methods to get stock data from the Alpha Vantage API. You can install the Python wrapper with pip install alpha_vantage.
Synchronous
We will get the current prices of four stocks with the ticker symbols ‘AAPL’, ‘GOOG’, ‘TSLA’, and ‘MSFT’, and print them once we have all the data. The simplest way is to use a for loop.
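from alpha_vantage.timeseries import TimeSeries

key = 'API_KEY'  # replace with your own Alpha Vantage API key
symbols = ['AAPL', 'GOOG', 'TSLA', 'MSFT']
results = []
for symbol in symbols:
    ts = TimeSeries(key=key)
    # Each call blocks until the API responds, so the requests run one after another
    results.append(ts.get_quote_endpoint(symbol))
print(results)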
This is the brute-force approach. Each time we call the API (in ts.get_quote_endpoint(symbol)), we wait for the value to come back before we start fetching the next stock’s data.
But after learning about asynchrony and multithreading, we know that we can start another API call while waiting for a return value.
Asynchronous
import asyncio
from alpha_vantage.async_support.timeseries import TimeSeries

symbols = ['AAPL', 'GOOG', 'TSLA', 'MSFT']

async def get_data(symbol):
    # No key is passed here; the wrapper is assumed to read it from the
    # ALPHAVANTAGE_API_KEY environment variable
    ts = TimeSeries()
    data, _ = await ts.get_quote_endpoint(symbol)  # other tasks can run during this wait
    await ts.close()
    return data

loop = asyncio.get_event_loop()
tasks = [get_data(symbol) for symbol in symbols]  # coroutine objects, not started yet
group1 = asyncio.gather(*tasks)
results = loop.run_until_complete(group1)
print(results)
In Python, the keywords await and async give us new capabilities for asynchronous programming. They were introduced in Python 3.5, so you will need to upgrade if you are still using Python 2, which is outdated.
This code might be a little confusing to read, so let’s break it down. The loop is the event loop, where the processor switches between waiting for one task and running another. It continuously checks whether tasks (such as our API calls) have completed.
The tasks variable is a list of coroutine objects created by calling get_data. We gather them into a single awaitable called group1 with asyncio.gather, and run it with loop.run_until_complete. This is much faster than our previous synchronous version because we can start multiple API calls without waiting for each one to complete.
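To see the speed-up without an API key, here is a small sketch that replaces the real API call with a simulated wait. The `fake_get_data` coroutine and its payload are hypothetical, and the sketch uses `asyncio.run` (available since Python 3.7) to start the event loop instead of managing the loop object by hand.

```python
import asyncio
import time

symbols = ['AAPL', 'GOOG', 'TSLA', 'MSFT']

async def fake_get_data(symbol, delay=0.1):
    """Hypothetical stand-in for an API call: waits, then returns a made-up payload."""
    await asyncio.sleep(delay)
    return {"symbol": symbol}

async def one_at_a_time():
    # Awaits each call before starting the next, like the synchronous loop.
    return [await fake_get_data(s) for s in symbols]

async def all_at_once():
    # Starts every call first, then waits for all of them together.
    return await asyncio.gather(*(fake_get_data(s) for s in symbols))

def timed(coro_fn):
    start = time.perf_counter()
    results = asyncio.run(coro_fn())
    return results, time.perf_counter() - start

sequential_results, sequential_time = timed(one_at_a_time)
concurrent_results, concurrent_time = timed(all_at_once)
print(f"one at a time: {sequential_time:.2f}s, gathered: {concurrent_time:.2f}s")
```

Both versions return the same data in the same order; only the total waiting time differs.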
Note: Python’s asyncio has a lot of quirks; more details here.
Multithreading
I’ve written a few articles about multithreading, but if you want to learn more and see Python examples, follow this link!
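For a rough idea of what a multithreaded version of the stock example could look like (this is a sketch, not the code from the linked article), the standard library’s `ThreadPoolExecutor` can run one blocking fetch per thread. The `fetch_quote` function is a hypothetical stand-in for the blocking `ts.get_quote_endpoint(symbol)` call, so no API key is needed to run it.

```python
import time
from concurrent.futures import ThreadPoolExecutor

symbols = ['AAPL', 'GOOG', 'TSLA', 'MSFT']

def fetch_quote(symbol):
    """Hypothetical stand-in for a blocking API call; returns a made-up payload."""
    time.sleep(0.1)
    return {"symbol": symbol}

def fetch_all(symbols):
    # One worker thread per symbol; each thread blocks on its own request.
    with ThreadPoolExecutor(max_workers=len(symbols)) as pool:
        # pool.map returns results in the same order as the input symbols.
        return list(pool.map(fetch_quote, symbols))

results = fetch_all(symbols)
print(results)
```

The fetch function stays ordinary synchronous code, which is part of why multithreading can feel easier to adopt than rewriting calls with async and await.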
There is much more to learn, such as what parallelism is, what the underlying principles of communication between each application are, how to synchronize threads, channels, and how to implement them in different programming languages.
If you think I missed something or don’t understand something, feel free to leave questions, comments or insights below!