Python Quantitative Data Warehouse Building Series 2: Python operational databases

This series of tutorials for quantitative developers, to provide local quantitative financial data warehouse construction tutorials and full set of source code. We use the UDATA financial data community as the data source to dump the financial base data into a local database. Tutorial provides a full set of source code, including historical data download and incremental data update, data update task deployment and daily monitoring operations.Copy the code

In the previous section, we chose MySQL as the database for this series of tutorials, so this article focuses on the steps Python takes to manipulate MySQL and how to wrap it. At the end of the article, a brief introduction to Python operations MongoDB, SQLite, PostgreSQL database;

Pymysql

1. Install the Pymysql module

pip install pymysql
Copy the code

2. Connect to the database

from pymysql import * 
Database parameters can be viewed in the MySQL interface or in the database configuration file
conn = pymysql.connect(host = 'Database IP',
                       port = 'port',
                       user = 'Username',
                       password = 'password',
                       database='Database name')

Create a cursor object cursor using the cursor() method
cursor = conn.cursor()

Close the database connection after the database operation is complete
# conn.close()
Copy the code

3. Common SQL code execution

from pymysql import * 
SQL > create table, delete table, insert data
def Execute_Code(sql_str) :
    Open a database connection
	conn = pymysql.connect(host = '127.0.0.1',port = 3306,user = 'root',
                           password = 'password',database='udata')
    Create a cursor object cursor using the cursor() method
    cursor = conn.cursor()
    try:
        Use the execute() method to execute SQL
        cursor.execute(sql)
        Commit to database for execution
        conn.commit()
	except:
        Rollback if an error occurs
        conn.rollback()
    Close the database connection
    conn.close() 
Copy the code

, built A table

sql_str = '''CREATE TABLE TB_Stock_List_Test ( secu_code CHAR(20), hs_code CHAR(20), secu_abbr CHAR(20), chi_name CHAR(40), secu_market CHAR(20), listed_state CHAR(20), listed_sector CHAR(20), updatetime CHAR(20)); ' ' '
Execute_Code(sql_str)
Copy the code

B. Insert data

sql_str = ''' INSERT INTO TB_Stock_List_Test (`secu_code`,`hs_code`,`secu_abbr`,`chi_name`,`secu_market`,`listed_state` ,' listed_sector ',' updatetime 'VALUES ('000001',' 00000.sz ','Ping An Bank Co., Ltd.', 'main board ','2021-10-25 20:10:55'); ' ' '
Execute_Code(sql_str)
Copy the code

C. Update data

sql_str = "UPDATE tb_stock_list SET updatetime = '2021-10-30 20:10:55' "
Execute_Code(sql_str)
Copy the code

D. Delete data

sql_str = 'DELETE FROM tb_stock_list'
Execute_Code(sql_str)
Copy the code

E. Delete the table

sql_str = 'DROP TABLE IF EXISTS tb_stock_list'
Execute_Code(sql_str)
Copy the code

4. Query operations

def Select_Code(sql_str) :
    Open a database connection
	conn = pymysql.connect(host = '127.0.0.1',port = 3306,user = 'root',
                           password = 'password',database='udata')
    Create a cursor object cursor using the cursor() method
    cursor = conn.cursor()
    Use the execute() method to execute SQL
    cursor.execute(sql_str)
    Get a list of all records
    results = cursor.fetchall()
    Close the database connection
    conn.close()
    
    return results
Copy the code
sql_str = 'select * from tb_stock_list'
results = Select_Code(sql_str)
results
Copy the code

5. Method encapsulation

Encapsulate the above usage as a custom class and save it as MySQLOperation. Py with the following code:

from pymysql import * 
MySQL > select * from 'MySQL'
class MySQLOperation:
    def __init__(self, host, port, db, user, passwd, charset='utf8') :
        Parameter initialization
        self.host = host
        self.port = port
        self.db = db
        self.user = user
        self.passwd = passwd
        self.charset = charset
    def open(self) :
        Open a database connection
        self.conn = connect(host=self.host,port=self.port
                            ,user=self.user,passwd=self.passwd
                            ,db=self.db,charset=self.charset)
        Create a cursor object cursor using the cursor() method
        self.cursor = self.conn.cursor()
    def close(self) :
        Disconnect the database
        self.cursor.close()
        self.conn.close()
    def Execute_Code(self, sql) :
        SQL > create table, delete table, insert data
        try:
            self.open(a)Open a database connection
            self.cursor.execute(sql)  Use the execute() method to execute SQL
            self.conn.commit()        Commit to database for execution
            self.close()              Disconnect the database
        except Exception as e:
            self.conn.rollback()      Rollback if an error occurs
            self.close()              Disconnect the database
            print(e)
    def Select_Code(self, sql) :
        # execute SQL code to query data
        try:
            self.open(a)Open a database connection
            self.cursor.execute(sql)           Use the execute() method to execute SQL
            result = self.cursor.fetchall()    Get a list of all records
            self.close()                       Disconnect the database
            return result                      Return query data
        except Exception as e:
            self.conn.rollback()               Rollback if an error occurs
            self.close()                       Disconnect the database
            print(e)
Copy the code

Insert and query are used as follows, and other uses are similar and will not be described here.

import pandas as pd
host='127.0.0.1'
port=3306
user='root'
passwd="Password"
db='udata'
# method instantiation
MySQL = MySQLOperation(host, port, db, user, passwd)
Insert operation code
sql_str = ''' INSERT INTO tb_stock_list (`secu_code`,`hs_code`,`secu_abbr`,`chi_name`,`secu_market`,`listed_state`,`listed_sector`,`updatetime`) VALUES (' 000001 ', '000001. SZ', 'pingan Bank', 'Ping An Bank Co., Ltd.', 'the shenzhen stock exchange', 'public', 'main', '2021-10-25 20:15:55'); ' ' '
MySQL.Execute_Code(sql_str)
# query data
sql_str = 'select * from tb_stock_list'
results = MySQL.Select_Code(sql_str)
results
Copy the code

Sqlalchemy

Since the pymysql usage above already meets most of your usage requirements, the SQLAlchemy implementation functions similarly. Here we’ll focus on the pandas. To_sql and pandas. Read_sql operations based on SQLAlchemy to link to the database.

1. Install the Pymysql module

pip install sqlalchemy
Copy the code

2. Connect to the database

from sqlalchemy import create_engine
host='127.0.0.1'
port = 3306
user='root'
password='password'
database='udata'
engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}? charset=utf8'.format(user
                                                                         ,password
                                                                         ,host
                                                                         ,port
                                                                         ,database))
Copy the code

3, pandas. To_sql

Write data from DataFrame to MySQL database as follows:

import pandas as pd
Define the data to write in DataFrame format
data = pd.DataFrame([['000001'.'000001.SZ'.Ping an Bank.'Ping An Bank Co., Ltd.'
                      ,Shenzhen Stock Exchange.'listed'.'main'.'the 2021-10-25 20:12:55'],
                   ['000002'.'000002.SZ'.A ' 'all families.'China Vanke Co., Ltd.
                    ,Shenzhen Stock Exchange.'listed'.'main'.'the 2021-10-25 20:12:55']])
# Column name assignment
data.columns = ['secu_code'.'hs_code'.'secu_abbr'.'chi_name'
                , 'secu_market'.'listed_state'.'listed_sector'.'updatetime']
Write to database
data.to_sql(name='tb_stock_list', con=engine, index=False, if_exists='append')
Copy the code

The if_exists parameter is used to process a target table if the target table already exists. The default value is FAIL. The other two options are replace, which means to replace the original table by deleting it and then creating it, and append, which simply adds data.

4, pandas. Read_sql

From the database, read data as a DataFrame as shown in the following code example:

Select result from SQL query
result = pd.read_sql('''SELECT * FROM tb_stock_list ''', con=engine)
result
Copy the code

Python operates on other common databases

1, the mongo

PIP install Pymongo

(2) Operation brief

import pymongo
# connect mongo
conn = pymongo.MongoClient(host='localhost',port=27017
                           ,username='username', password='password')
Select * from database;
db = conn['udata']  # db = client.udata
# set set
collection = db['tb_stock_list']  # collection = db.tb_stock_list
Insert data insert_one(), insert_many()
data1 = {}  # set, key-value pair, 1 piece of data
data2 = {}  # set, key-value pair, 1 piece of data
result = collection.insert_many([data1, data2])
# result = collection.insert_one(data1)
Find_one (), find()
result = collection.find_one({'secu_code': '000001'})
Update_one (), update()
result = collection.update_one({'secu_code': '000001'}, {'$set': {'hs_code': '000001'}})
Remove (), delete_one(), delete_many()
result = collection.remove({'secu_code': '000001'})
Copy the code

2, SQLite

PIP install sqlite3

(2) Operation brief

import sqlite3
Connect to database
conn = sqlite3.connect('udata.db')
Create a cursor
cursor = conn.cursor()
# execute SQL
sql = "Add, subtract, delete, etc SQL code"
cursor.execute(sql)
# query data
sql = "Query SQL code"
values = cursor.execute(sql)
# Submit things
conn.commit()
Close the cursor
cursor.close()
# close the connection
conn.close()
Copy the code

3, PostgreSQL

Install psycopg2: PIP install psycopg2

(2) Operation brief

import psycopg2
Connect to database
conn = psycopg2.connect(database="udata", user="postgres"
                        , password="Password", host="127.0.0.1", port="5432")
Create a cursor
cursor = conn.cursor()
# execute SQL
sql = "Add, subtract, delete, etc SQL code"
cursor.execute(sql)
Select * from * where * from *
sql = "Query SQL code"
cursor.execute(sql)
rows = cursor.fetchall()
# transaction submission
conn.commit()
Close the database connection
conn.close()
Copy the code

So that concludes our brief introduction to Python operating databases; There are many other types of databases, and Python operates on them in much the same way, as I’ll continue to comb through the material.

The next section is “Building a Python Quantitative Investment Data warehouse 3: Data warehousing Code Encapsulation”.