Choose a logical
The stock market is risky and investment should be cautious.
Investment direction: follow the fund institutions for stock selection, simply put, is the priority screening of those fund companies hold heavy positions of stocks.
Goal setting
Obtain the information of the top ten holding stocks of all funds, such as the number of shares, market value and so on, and analyze the ranking of the fund’s heavy holding stocks.
Crawl all fund code information
Note: Use Python’s built-in SQLite database
Create a table to store fund code information
Create TABLE if not EXISTS TARGETS_ALL (UID string, name string, ID string, category string, Tag String, type string, version datetime not null default (datetime('now', 'localtime')) );Copy the code
Reference: juejin. Cn/post / 699352…
Obtain the corresponding position information according to the fund code
Create a table to store position information
Create table if not exists top10_stockS_in_fund (uid string, fund_id string, rank_num INTEGER, stock_id string, stock_name string, hold_rate float, hold_count float, hold_value float, update_time string )Copy the code
Crawl and store to database
# coding:utf-8 import logging from collections import defaultdict from bs4 import BeautifulSoup as BS import requests import uuid from common.constant import CONN_KEY from util import db_util from common import global_var as gv Def _parse_content(content): def _parse_content(content): result = content.replace("var apidata={", "") result = result[:-2] content = result.split("",arryear:")[0].replace("content:"", "") content = BS(content).find_all("tr") result = defaultdict(list) for tr in content: date = tr.parent.parent.previous_sibling.previous_sibling.find("font").text td = tr.find_all("td") if len(td) == 9: result[date].append({ "rank_num": td[0].text, "stock_id": td[1].text, "stock_name": td[2].text, "hold_rate": float(td[6].text.replace("%", ""))/100, "hold_count": td[7].text, "hold_value": eval(td[8].text.replace(",", "")) }) elif len(td) == 7: result[date].append({ "rank_num": td[0].text, "stock_id": td[1].text, "stock_name": td[2].text, "hold_rate": float(td[4].text.replace("%", ""))/100, "hold_count": td[5].text, "hold_value": Def get_top10_stocks_by_fund(fund_code) eval(td[6].text.replace(",", "))}) return result base_url = "https://fundf10.eastmoney.com/FundArchivesDatas.aspx?type=jjcc&code={fund_code}&topline=10" url = base_url.format(fund_code=fund_code) response = str(requests.get(url).content, "Utf-8 ") return _parse_content(response) # def _is_exist(conn, fund_id, stock_id, update_time) query_sql = f"select 1 from top10_stocks_in_fund " \ f"where fund_id = '{fund_id}'" \ f"and stock_id = '{stock_id}'" \ f"and update_time = '{update_time}'" res = conn.query(query_sql) if res.fetchone(): Def update_top10_stocks_by_fund(): Init_conn_pool () conn = gv.get_value(CONN_KEY) query_SQL = "select * from targets_all where type = Target = conn. Query (query_SQL) # Record the number of successful and failed updates success_count = 0 failure_count = 0 for (uid, fund_name, fund_id, category, tag, type, version) in targets: Fund_id = STR (fund_id).zfill(6) logging.info(f" asking for {fund_name}-{fund_id} holding information..." ) stocks_in_fund = get_top10_stocks_by_fund(fund_id) If _is_exist(conn, fund_id, stock["stock_id"], update_time) if _is_exist(conn, fund_id, stock["stock_id"], update_time) Insert_sql = f"insert into top10_stocks_in_fund" \ f"(uid,fund_id,rank_num,stock_id,stock_name,hold_rate,hold_count,hold_value,update_time)" \ f"values('{uuid.uuid4()}','{fund_id}',{stock['rank_num']}," \ f"'{stock['stock_id']}','{stock['stock_name']}',{stock['hold_rate']}," \ F "{stock['hold_count']},{stock['hold_value']},'{update_time}')" if conn.operate(insert_SQL): logging. ) success_count = success_count+1 else: Warning (f" failed to insert: {insert_SQL}") failure_count = failure_count+1 {success_count} items, failure: {failure_count} items." ) if __name__ == "__main__": update_top10_stocks_by_fund()Copy the code
Analysis of the first/second quarter of the fund ranking of heavy positions
Select stock_id (stock_id), stock_name (stock_id), stock_name (stock_id), Cast (sum(hold_value) as double) from top10_stocks_in_fund WHERE update_time = '2021-06-30' group by 1, 2 order by 3 descCopy the code
Select stock_id (stock_id), stock_name (stock_id), stock_name (stock_id), Cast (sum(hold_count) as double) FROM top10_stocks_in_fund WHERE update_time = '2021-06-30' group by 1, 2 order by 3 descCopy the code
conclusion
Financial market prefers leading stocks in energy consumption, banks and securities brokerages, etc. An in-depth study on the changes in the number of shares held and market value can provide a meaningful reference for increasing positions and exchanging stocks. (Note: Most fund position information is updated at the end of each quarter, the longer the distance, the less reference significance)