Recently, I saw her answer some of the most perplexing energy questions on blank Woman, such as why can I do so many things at once and have so much energy? Follow one truth at work:
Simplify complex things, standardize simple things, streamline standard things, streamline things as tools, automate instrumental things, and outsource things that can’t be automated.
In the process of data development, some processes can be instrumentalized to improve work efficiency and free up more time and energy to improve themselves.
tools
In the daily data development process, we often need to write table building sentences according to the data model. It takes several minutes to write table building sentences every time, and it is easy to make some low-level mistakes. Therefore, WE plan to make an Excel template, write table fields, table partitions and table names in it, and generate table building sentences automatically through the program. The effect is shown below:
To make the template
Table name, Chinese description of the table, field name, data type, field description, is not null, unique primary key, table partition, etc. (can be adjusted according to the actual situation)
The template consists of three parts:
- Lines 1-3 are the table name, the Chinese description of the table, and the template column description
- 4-19 Basic information about the fields in the behavior creation table
- 20 Behavior partition field
Realize the principle of
Obtain the data model file in the specified directory. The convention is an Excel file ending with the data model.xls or data model.xlsx file name
Loop through each template file, according to the template composition specification, parse the file, splicing SQL statements, generate table construction sentence file
Generate results
CREATE EXTERNAL TABLE ods_cbonddescription (
object_id string COMMENT 'object ID',
b_info_fullname string COMMENT 'Name of Bond',
s_info_name string COMMENT 'Bond for short',
b_info_issuer string COMMENT 'Issuer',
b_issue_announcement string COMMENT 'Date of Publication',
b_issue_firstissue string COMMENT 'Date of Issue',
b_issue_lastissue string COMMENT 'Issue End date',
b_issue_amountplan bigint COMMENT 'Total Planned Issuance (100 million Yuan)',
b_issue_amountact bigint COMMENT 'Total Actual Issuance (100 million Yuan)',
b_info_issueprice bigint COMMENT 'Issue price',
b_info_par bigint COMMENT 'value',
b_info_term_year int COMMENT 'Bond Term (year)',
b_info_term_day int COMMENT 'Bond Maturity (days)',
b_info_paymentdate int COMMENT 'Date of payment',
b_info_paymenttype int COMMENT 'Method of interest',
s_info_exchmarket string COMMENT 'Exchange'
)
COMMENT 'Basic Bond Information'
PARTITIONED BY(dt string)
ROW FORMAT DELIMITED '\t'
STORED AS ORC
LOCATION 'hdfs://host:8020/dw/ods/ods_cbonddescription';
Copy the code
conclusion
If we use what kind of template in our work, we can communicate with each other how to form a certain tool script to improve our work efficiency.
Finally, follow the public account HelloTech and reply “602” to get the template and code.