You are welcome to subscribe to my booklet *Python Data Analysis: Building a Quantitative Trading System for Stocks*. Once you have worked through it, be sure to put the knowledge to use analyzing stocks!
Preface
I’m sure you’ve all heard of quantitative trading.
Quantitative trading is an emerging, systematic approach to financial investment. It draws on knowledge from multiple disciplines, replaces subjective human judgment with mathematical models when formulating trading strategies, and uses the powerful computing capacity of machines to measure, from vast historical data on stocks, bonds, futures, and so on, the probability that a strategy will profit or lose. By managing this probability of profit and loss, it helps investors make well-grounded decisions.
So, what is the right way for an ordinary retail investor to get started with quantitative trading?
This article uses a down-to-earth stock-analysis scenario, a multi-task crawler that fetches daily real-time A-share quotes, to show how ordinary retail investors can put quantitative trading to work!
Multiprocessing and multithreading
We fetch stock quotes with a for loop, but faced with years, even a decade or more, of market data for thousands of stocks, the download inevitably takes far too long.
My book *Quantitative Trading in Python Stocks from Basics to Practice* introduces multi-process and multi-threaded acceleration schemes. When complex calculations and various I/O operations are involved, we can consider a multi-task parallel mode to make full use of multi-core CPU performance and improve the program's execution efficiency.
In Python, because of the GIL (Global Interpreter Lock), multi-processing and multi-threading behave differently in compute-intensive versus I/O-intensive scenarios: multi-threading is better suited to I/O-intensive applications, while multi-processing performs better on CPU-intensive ones.
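As a quick illustration of the GIL's effect (a minimal sketch of my own, not code from the book): for a pure-CPU task, spreading the work across threads yields little to no speedup, because only one thread executes Python bytecode at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_squares(n):
    # pure-CPU work: the thread holds the GIL while computing
    return sum(i * i for i in range(n))

N, CHUNKS = 200_000, 4

start = time.perf_counter()
serial_result = sum(count_squares(N) for _ in range(CHUNKS))
serial_time = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=CHUNKS) as ex:
    threaded_result = sum(ex.map(count_squares, [N] * CHUNKS))
threaded_time = time.perf_counter() - start

# Both runs compute the same total; the threaded version is not
# meaningfully faster because the GIL serializes the CPU work.
```

Swap `ThreadPoolExecutor` for `ProcessPoolExecutor` and the same task scales across cores, which is why multi-processing wins for CPU-bound work.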
In the book, we call an API to fetch stock data, and use that as the example to compare the for loop, multi-threaded, and multi-process approaches.
Traversing one year of data for the top 500 stocks in the stock pool gives the following test results:

- For loop: 55 seconds
- 8 threads: 7.5 seconds
- 8 processes: 7.8 seconds
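The numbers above come from the book's test environment. To reproduce the effect without the real API, here is a self-contained sketch that uses `time.sleep` as a stand-in for the network latency of each request:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_one_year(stock_code):
    # stand-in for an API call: the thread sleeps (waits on I/O),
    # releasing the GIL so other threads can issue their requests
    time.sleep(0.05)
    return stock_code

codes = ["{:06d}".format(i) for i in range(40)]

start = time.perf_counter()
serial = [fetch_one_year(c) for c in codes]
serial_time = time.perf_counter() - start        # roughly 40 * 0.05 s

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    threaded = list(executor.map(fetch_one_year, codes))
threaded_time = time.perf_counter() - start      # roughly (40 / 8) * 0.05 s

assert threaded == serial  # same data, much less wall-clock time
```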
Multi-tasking for crawlers
So for a crawler, which is the better fit: multi-threading or multi-processing?
Our crawler is built on the network request library urllib3, which plays the role of an HTTP client: it sends an HTTP request to the server and then waits for the server's response. This kind of task is I/O-intensive. Unlike compute-intensive tasks, which consume CPU for the whole time slice, I/O-intensive tasks spend most of their time waiting for I/O operations to complete.
Next, we take crawling daily real-time stock quotes from the Eastmoney site as the scenario to walk through the multi-threaded acceleration scheme.
For the implementation of the crawler itself, please refer to the related topics on my Knowledge Planet:
We can see that the site has 206 pages of quotes, so instead of reading every page in a single thread, we can distribute the pages across multiple threads.
Python 3 ships with a thread pool in its concurrent.futures module, ThreadPoolExecutor, which handles the multi-threading for us.
For the crawler task, each page differs only in its URL. Following the module's calling convention, we therefore split the crawler into two parts: the map function crawer_daily() and the iterable argument itr_arg.
The key code is as follows:
```python
with ThreadPoolExecutor(max_workers=8) as executor:
    # crawer_daily: the map function to execute
    # itr_arg: the iterable of arguments (here, the page numbers)
    # results: a generator yielding each call's return value
    results = executor.map(crawer_daily, itr_arg)
```
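To see the snippet end to end, here is a runnable sketch in which `crawer_daily` is a stand-in that fabricates the 20 rows each page contains (the real version, which requests and parses the Eastmoney pages, is on the Knowledge Planet):

```python
from concurrent.futures import ThreadPoolExecutor

def crawer_daily(page):
    # stand-in: fabricate the 20 stock rows one quote page holds
    return [{"page": page, "rank": i} for i in range(20)]

itr_arg = range(1, 207)  # the 206 pages observed on the site

with ThreadPoolExecutor(max_workers=8) as executor:
    results = executor.map(crawer_daily, itr_arg)

# executor.map yields results in input order, so the pages
# come back 1..206 even though the threads ran concurrently
rows = [row for page_rows in results for row in page_rows]
# 206 pages * 20 stocks per page = 4120 rows in total
```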
Each page holds only 20 stocks, so we need to merge the per-page data into a single DataFrame and save it as a local CSV file.
The key code is as follows:
```python
for ret in results:
    df_daily_stock = df_daily_stock.append(ret, ignore_index=True)

df_daily_stock.to_csv(
    "crawer_daily_stock/{}.csv".format(df_daily_stock["Time"].values[0]),
    columns=df_daily_stock.columns, index=True, encoding='GBK')
```
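One caveat if you run this on a current pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on newer versions the merge step should use pd.concat instead. A sketch of the equivalent, with made-up per-page frames standing in for the crawler's results:

```python
import pandas as pd

# stand-ins for the per-page DataFrames yielded by executor.map
results = [
    pd.DataFrame({"Code": ["000001"], "Time": ["2020-08-21 15:00:00"]}),
    pd.DataFrame({"Code": ["000002"], "Time": ["2020-08-21 15:00:00"]}),
]

# pd.concat merges all pages in a single call, which is also
# faster than growing a DataFrame with repeated appends
df_daily_stock = pd.concat(results, ignore_index=True)
```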
Opening the CSV file shows the following:
Note the "Time" column I added. Because I crawled at the close, it reads 2020-08-21 15:00:00; if you fetch real-time data during trading hours, the column will instead reflect the timestamp of the latest data update.
Another important point is the file name. Here I chose "2020-08-21 15/00/00.csv"; for intraday real-time data, the name should likewise reflect the hour/minute/second information.
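Since characters such as ':' (and on many systems '/') are not legal in file names, the timestamp has to be rewritten before it can serve as one. A small sketch of one way to do it; the underscore replacement is my own choice, not the book's:

```python
def timestamp_to_filename(ts):
    # "2020-08-21 15:00:00" -> "2020-08-21 15_00_00.csv"
    # ':' is illegal in Windows file names and '/' separates
    # path components on Unix, so replace both with '_'
    safe = ts.replace(":", "_").replace("/", "_")
    return safe + ".csv"
```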
In my test, with 8 threads the run took just over 6 seconds. In other words, if we only update the daily data incrementally, refreshing the data for every stock in the A-share market takes only about 6 seconds per day.
Since test environments vary, these results are for reference only. You can also compare the efficiency of multi-threading against multi-processing yourself.
Conclusion
Through this simple, practical stock-quantification scenario, I hope to give everyone an intuitive feel for quantitative trading.
Next, we should upgrade the way we trade: abstract the methods we used to trade by hand into strategy models, evaluate them quantitatively across the whole market, and then let the program monitor market movements for us.
This is exactly the right way for ordinary retail investors to get started with quantitative trading!
---

If you want a more comprehensive and systematic introduction to the topics covered here, from 0 to 1, I recommend my book *Quantitative Trading in Python Stocks from Basics to Practice*.
Now on sale at Tmall, JD.com, and Dangdang!
You are also welcome to follow my WeChat official account to learn more about quantitative trading in Python.
The example code has been uploaded to my Knowledge Planet "Mastering Stock Quantitative Trading" (the planet's catalog can be viewed by clicking "Read More").