Python Quantitative Data Warehouse Building Series 2: Python operational databases
This series of tutorials for quantitative developers, to provide local quantitative financial data warehouse construction tutorials and full set of source code. We use the UDATA financial data community as the data source to dump the financial base data into a local database. Tutorial provides a full set of source code, including historical data download and incremental data update, data update task deployment and daily monitoring operations.Copy the code
In the previous section, we chose MySQL as the database for this series of tutorials, so this article focuses on the steps Python takes to manipulate MySQL and how to wrap it. At the end of the article, a brief introduction to Python operations MongoDB, SQLite, PostgreSQL database;
Pymysql
1. Install the Pymysql module
pip install pymysql
Copy the code
2. Connect to the database
from pymysql import *
Database parameters can be viewed in the MySQL interface or in the database configuration file
conn = pymysql.connect(host = 'Database IP',
port = 'port',
user = 'Username',
password = 'password',
database='Database name')
Create a cursor object cursor using the cursor() method
cursor = conn.cursor()
Close the database connection after the database operation is complete
# conn.close()
Copy the code
3. Common SQL code execution
from pymysql import *
SQL > create table, delete table, insert data
def Execute_Code(sql_str) :
Open a database connection
conn = pymysql.connect(host = '127.0.0.1',port = 3306,user = 'root',
password = 'password',database='udata')
Create a cursor object cursor using the cursor() method
cursor = conn.cursor()
try:
Use the execute() method to execute SQL
cursor.execute(sql)
Commit to database for execution
conn.commit()
except:
Rollback if an error occurs
conn.rollback()
Close the database connection
conn.close()
Copy the code
, built A table
sql_str = '''CREATE TABLE TB_Stock_List_Test ( secu_code CHAR(20), hs_code CHAR(20), secu_abbr CHAR(20), chi_name CHAR(40), secu_market CHAR(20), listed_state CHAR(20), listed_sector CHAR(20), updatetime CHAR(20)); ' ' '
Execute_Code(sql_str)
Copy the code
B. Insert data
sql_str = ''' INSERT INTO TB_Stock_List_Test (`secu_code`,`hs_code`,`secu_abbr`,`chi_name`,`secu_market`,`listed_state` ,' listed_sector ',' updatetime 'VALUES ('000001',' 00000.sz ','Ping An Bank Co., Ltd.', 'main board ','2021-10-25 20:10:55'); ' ' '
Execute_Code(sql_str)
Copy the code
C. Update data
sql_str = "UPDATE tb_stock_list SET updatetime = '2021-10-30 20:10:55' "
Execute_Code(sql_str)
Copy the code
D. Delete data
sql_str = 'DELETE FROM tb_stock_list'
Execute_Code(sql_str)
Copy the code
E. Delete the table
sql_str = 'DROP TABLE IF EXISTS tb_stock_list'
Execute_Code(sql_str)
Copy the code
4. Query operations
def Select_Code(sql_str) :
Open a database connection
conn = pymysql.connect(host = '127.0.0.1',port = 3306,user = 'root',
password = 'password',database='udata')
Create a cursor object cursor using the cursor() method
cursor = conn.cursor()
Use the execute() method to execute SQL
cursor.execute(sql_str)
Get a list of all records
results = cursor.fetchall()
Close the database connection
conn.close()
return results
Copy the code
sql_str = 'select * from tb_stock_list'
results = Select_Code(sql_str)
results
Copy the code
5. Method encapsulation
Encapsulate the above usage as a custom class and save it as MySQLOperation. Py with the following code:
from pymysql import *
MySQL > select * from 'MySQL'
class MySQLOperation:
def __init__(self, host, port, db, user, passwd, charset='utf8') :
Parameter initialization
self.host = host
self.port = port
self.db = db
self.user = user
self.passwd = passwd
self.charset = charset
def open(self) :
Open a database connection
self.conn = connect(host=self.host,port=self.port
,user=self.user,passwd=self.passwd
,db=self.db,charset=self.charset)
Create a cursor object cursor using the cursor() method
self.cursor = self.conn.cursor()
def close(self) :
Disconnect the database
self.cursor.close()
self.conn.close()
def Execute_Code(self, sql) :
SQL > create table, delete table, insert data
try:
self.open(a)Open a database connection
self.cursor.execute(sql) Use the execute() method to execute SQL
self.conn.commit() Commit to database for execution
self.close() Disconnect the database
except Exception as e:
self.conn.rollback() Rollback if an error occurs
self.close() Disconnect the database
print(e)
def Select_Code(self, sql) :
# execute SQL code to query data
try:
self.open(a)Open a database connection
self.cursor.execute(sql) Use the execute() method to execute SQL
result = self.cursor.fetchall() Get a list of all records
self.close() Disconnect the database
return result Return query data
except Exception as e:
self.conn.rollback() Rollback if an error occurs
self.close() Disconnect the database
print(e)
Copy the code
Insert and query are used as follows, and other uses are similar and will not be described here.
import pandas as pd
host='127.0.0.1'
port=3306
user='root'
passwd="Password"
db='udata'
# method instantiation
MySQL = MySQLOperation(host, port, db, user, passwd)
Insert operation code
sql_str = ''' INSERT INTO tb_stock_list (`secu_code`,`hs_code`,`secu_abbr`,`chi_name`,`secu_market`,`listed_state`,`listed_sector`,`updatetime`) VALUES (' 000001 ', '000001. SZ', 'pingan Bank', 'Ping An Bank Co., Ltd.', 'the shenzhen stock exchange', 'public', 'main', '2021-10-25 20:15:55'); ' ' '
MySQL.Execute_Code(sql_str)
# query data
sql_str = 'select * from tb_stock_list'
results = MySQL.Select_Code(sql_str)
results
Copy the code
Sqlalchemy
Since the pymysql usage above already meets most of your usage requirements, the SQLAlchemy implementation functions similarly. Here we’ll focus on the pandas. To_sql and pandas. Read_sql operations based on SQLAlchemy to link to the database.
1. Install the Pymysql module
pip install sqlalchemy
Copy the code
2. Connect to the database
from sqlalchemy import create_engine
host='127.0.0.1'
port = 3306
user='root'
password='password'
database='udata'
engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}? charset=utf8'.format(user
,password
,host
,port
,database))
Copy the code
3, pandas. To_sql
Write data from DataFrame to MySQL database as follows:
import pandas as pd
Define the data to write in DataFrame format
data = pd.DataFrame([['000001'.'000001.SZ'.Ping an Bank.'Ping An Bank Co., Ltd.'
,Shenzhen Stock Exchange.'listed'.'main'.'the 2021-10-25 20:12:55'],
['000002'.'000002.SZ'.A ' 'all families.'China Vanke Co., Ltd.
,Shenzhen Stock Exchange.'listed'.'main'.'the 2021-10-25 20:12:55']])
# Column name assignment
data.columns = ['secu_code'.'hs_code'.'secu_abbr'.'chi_name'
, 'secu_market'.'listed_state'.'listed_sector'.'updatetime']
Write to database
data.to_sql(name='tb_stock_list', con=engine, index=False, if_exists='append')
Copy the code
The if_exists parameter is used to process a target table if the target table already exists. The default value is FAIL. The other two options are replace, which means to replace the original table by deleting it and then creating it, and append, which simply adds data.
4, pandas. Read_sql
From the database, read data as a DataFrame as shown in the following code example:
Select result from SQL query
result = pd.read_sql('''SELECT * FROM tb_stock_list ''', con=engine)
result
Copy the code
Python operates on other common databases
1, the mongo
PIP install Pymongo
(2) Operation brief
import pymongo
# connect mongo
conn = pymongo.MongoClient(host='localhost',port=27017
,username='username', password='password')
Select * from database;
db = conn['udata'] # db = client.udata
# set set
collection = db['tb_stock_list'] # collection = db.tb_stock_list
Insert data insert_one(), insert_many()
data1 = {} # set, key-value pair, 1 piece of data
data2 = {} # set, key-value pair, 1 piece of data
result = collection.insert_many([data1, data2])
# result = collection.insert_one(data1)
Find_one (), find()
result = collection.find_one({'secu_code': '000001'})
Update_one (), update()
result = collection.update_one({'secu_code': '000001'}, {'$set': {'hs_code': '000001'}})
Remove (), delete_one(), delete_many()
result = collection.remove({'secu_code': '000001'})
Copy the code
2, SQLite
PIP install sqlite3
(2) Operation brief
import sqlite3
Connect to database
conn = sqlite3.connect('udata.db')
Create a cursor
cursor = conn.cursor()
# execute SQL
sql = "Add, subtract, delete, etc SQL code"
cursor.execute(sql)
# query data
sql = "Query SQL code"
values = cursor.execute(sql)
# Submit things
conn.commit()
Close the cursor
cursor.close()
# close the connection
conn.close()
Copy the code
3, PostgreSQL
Install psycopg2: PIP install psycopg2
(2) Operation brief
import psycopg2
Connect to database
conn = psycopg2.connect(database="udata", user="postgres"
, password="Password", host="127.0.0.1", port="5432")
Create a cursor
cursor = conn.cursor()
# execute SQL
sql = "Add, subtract, delete, etc SQL code"
cursor.execute(sql)
Select * from * where * from *
sql = "Query SQL code"
cursor.execute(sql)
rows = cursor.fetchall()
# transaction submission
conn.commit()
Close the database connection
conn.close()
Copy the code
So that concludes our brief introduction to Python operating databases; There are many other types of databases, and Python operates on them in much the same way, as I’ll continue to comb through the material.
The next section is “Building a Python Quantitative Investment Data warehouse 3: Data warehousing Code Encapsulation”.