1 overview
I don’t get lost, share a little bit every day, progress a little bit every day ~
1.1 Organization of data
Software engineering books used to equate software with code plus data. This shows that code and data are the elements of software. Data is a static snapshot of a recording object at a point in time, such as a form; The code that implements the business is the invisible hand that manipulates and changes the data.
Data can be divided into file types, such as XML, JSON, and Cookie. Database type, such as SQL Server, Oracle, Sqlite, etc. Memory type, such as Session Session.
According to the organization form, it can be divided into: hierarchical model, which adopts tree structure to represent the relationship between data and data. One node represents a record, and other nodes except the root node have only one parent node. The mesh model, which uses the mesh structure to represent the relationship between data and data, allows more than one node to have no parents, and one node can have multiple parents; A relational model that uses table structures to express sets of entities and the relationships between sets of entities. The biggest difference between the relational model and the previous two models is that the main code is used instead of pointer navigation data.
It’s not hard to see that no matter how the data is organized, different data models focus on two things: data and the relationships between data.
1.2 Case Analysis
In order to effectively protect cultivated land, Chengdu Land and Resources Bureau decided to establish the basic database of cultivated land protection in Chengdu, and establish household and cultivated land files to strengthen the management of farmland and subsidy fund.
The data to be managed include agricultural household, household members and cultivated land. Agricultural household information includes household number, household type, household name, household id number and administrative division to which the household belongs (i.e. ownership code). The information of family members includes name, ID number, gender, date of birth, age and whether they are land obligor or not. Cultivated land information includes plot number (ownership code plus 4-digit self-coding), cultivated land type, map number, map spot number, contract area, measured area, directly supplemented area, and year of registration.
Each household contains one or more family members, and one family member belongs to only one family. Each household owns one or more plots of farmland, and a certain plot of farmland belongs to one or more families.
2 Database Design
2.1 Design Content
Database design belongs to the overall design and detailed design of a system.
The core of the overall design stage is to form a conceptual model, which mainly includes: sorting and analysis of existing data, database logical structure, naming rules, e-R graph. In the detailed design stage is the specific refinement of the overall design, mainly including: database physical structure and storage scheme, table description and table attribute list, etc.
Step 2.2
PowerDesigner provides multiple design routes and model conversion (as shown in Figure 2-1) for designers to freely play. There are two common processes:
Figure 2-1 Relationship between CDM, PDM and OOM
L Design CDM, convert to PDM, and finally generate OOM.
L Design CDM, convert to PDM and OOM, and refine them respectively.
This paper adopts the first process to carry out database design practice.
3. Conceptual data model design
3.1 Finding Entities
First, find out the nouns in the business domain described in the user requirements. If the information or state of them needs to be stored in the system, they may be the business entity objects we are looking for.
According to the description in Section 1.2, it can be roughly determined that the system has at least three entities: agricultural household, household members and cultivated land.
Table 3-1 Entities included in the system
The entity name | meaning |
---|---|
Agricultural Family | Store agricultural household information for querying household members and arable land by household |
Family Members | Store family member information |
Arable Land | Store arable land information |
3.2 Clear connections
After finding the entity, according to the requirement description and business understanding, the connection between the entities is represented by a diamond, the connection name is written in the diamond box, and the related entities are connected by an undirected edge. Meanwhile, the type of the connection is marked “one-to-one (1, 1), one-to-many (1, n), many-to-many (m, N)” beside the undirected edge.
Figure 3-1 Interentity relationships
3.3 Forming an E-R diagram
With entities and associations drawn, the skeleton of the E-R diagram is formed. Now we append the attributes of the system that concern entities to each entity. Because the E-R diagram is a product of the conceptual design phase, we don’t need to focus too much on the details of the database and programming implementation, lest we lose focus.
For example, whether entity serial number (Id) is required in attributes, the separation of compound attributes (such as block number), the processing of derived attributes (such as date of birth and gender can be obtained from Id number, age can be obtained from date of birth), and compliance with paradigm can be put in the detailed design stage.
What the conceptual design stage should do is to find the main core entities of the system and their interrelationships, and mark out the attributes that the entities need to use in the system.
Figure 3-2 E-R diagram of cultivated land protection system
3.4 Converting e-R graph to CDM
During the conceptual design phase, the E-R diagram is used more as an important “sketch”. E-r diagram is needed in conceptual design, but it is often replaced by conceptual data model. I think it may be because the latter completely inherits all elements and essence of e-R graph, and can describe attributes more succinctly, and use the results of the previous stage in detailed design and database generation.
To sum up, e-R diagram thinking mode can be adopted to draw conceptual data model directly in conceptual design. The corresponding relationship between e-R diagram and CDM concepts is as follows:
Table 3-2 Corresponding relationship between E-R diagram and CDM related concepts
* * * * E – R diagram | Conceptual data Model (CDM)* * * *) |
---|---|
entity | Entity |
contact | Relationship |
attribute | Attributes |
— | Association |
specialization | (C) C) C) c) |
— | Dependencies (Link/Extended Dependency) |
When designing the CDM, it is important to note that it differs from the E-R diagram in some detail. Attributes have a data type, length, and whether it is Mandatory. Second, it is better that all entities have a Primary Indentifier, otherwise reference relationships (foreign keys) cannot be created automatically when converted to PDM. Third, “one to many” connection may exist in the dependency (calibration connection).
The data type is set for the attribute, and the attribute that uniquely identifies the entity is set as the primary key, resulting in the CDM as follows. Household number and plot number are the primary keys of agricultural household and arable land entity respectively. Although the id card number can uniquely identify family members, it was created as the primary key to consider the possibility of changing it due to a typing error.
Figure 3-3 CDM of cultivated land protection system
Physical data model transformation
4.1 CDM generates PDM
When the database is selected, the existing CDM can be converted to the PDM corresponding to the database, and then the SQL script can be generated by PDM to establish the database storage structure.
In the stage of detailed design, data storage needs to be considered in combination with the way of system function realization, and the focus of database design turns to the physical realization of data model. For example, the establishment of auxiliary supporting tables, the processing of derived attributes, the establishment of indexes and so on.
Open the CDM Model, select ToolsaGenerate Physical Data Model, select the target DBMS and click OK to generate PDM. The corresponding relationship in transformation is as follows:
Table 4-1 Corresponding relationship between CDM and PDM objects
CDM object in the * * * * | Product data management (PDM) object in the * * * * |
---|---|
Entity | Table (Table) |
Relationship | Reference relation, foreign key |
Attributes | Column (Column) |
Primary Identifier | Primary Key |
PDM is generated as shown in the following figure. Since there is a “many-to-many” relationship between agricultural households and cultivated land, the system automatically generates a new table to record the relationship. The relationship between agricultural household and family members is one-to-many, and the system automatically adds foreign keys to the family member table.
Figure 4-1 CULTIVATED land protection system PDM(1)
4.2 Specific problems are analyzed
1) Redundancy field selection
Although the field setting of agricultural household table meets the requirements of presenting data in the system, if the household head changes, we need to change the new household head in the family member table to be the land obligatory person and modify the household head information in the agricultural household table. If the fields of the name and ID number of the head of the household are removed from the agricultural household table, the household number should be associated with the family member table when querying by household, and the name and ID number of the head of the household should be extracted from the records of the land obligor.
Considering that the system is dominated by query operation and the operation frequency of editing household head information is very low, the agricultural household table retains the fields of household head name and ID number.
2) Processing of derived properties (transitive dependencies)
In the family member table, gender and date of birth fields depend on id number fields, and age fields partly depend on date of birth fields. Whether to remove gender, date of birth and age fields, and provide algorithms to extract gender, date of birth and age in real time in the system code implementation?
After a little analysis, it can be seen that gender and date of birth do not change with time, so it can be extracted from the ID card number at the time of entry, which is not only necessary for checking the id card number format, but also improves the efficiency of the system when reading records. Age is the difference between today and the date of birth, which varies with time. Real-time calculation is the most reasonable when displaying, so delete the age field.
3) Processing of compound attributes
In the cultivated land table, the plot number of the primary key is composed of the ownership code and the 4-bit self-coding. The ownership code is the administrative division code, composed of districts, counties, towns, village neighborhood committees, groups of 18 codes. Four sequential codes were coded from 0001 in each group, and the combination of the two codes ensured the unique plot number.
The system needs to realize the classification and summary according to the ownership code, such as by group, by village neighborhood committee and so on. This can be achieved by running the string prefix matching of the block number field (LIKE ‘510108%’), that is, no separate ownership code field is added.
4) Auxiliary support table establishment
It can be seen from the previous paper that farmland and agricultural households are managed hierarchically according to the ownership code, but the storage of ownership code is not considered in the current database design. In the system, it often presents in a tree structure and reads records recursively, so it is necessary to set up fields such as parent node serial number in the ownership code table.
In addition, for the convenience of physical storage, the field values of household type, cultivated land type and gender are all represented by one digit. However, when presented in the system, it needs to be translated into the corresponding Chinese meaning, such as male and female, family, collective and so on. Because the enumeration items of gender and household Type are fixed, it can be considered to be hard-coded into the code as Enum Type. Enumerations of arable land types are subject to change and can be built in a database or managed in AN XML file.
5) other
If the land owned by a household decreases, it is necessary to delete the corresponding record in the relation table between household and cultivated land. To ensure that the operation is reversible, the system often sets up the Status field as a logical deletion marker.
Figure 4-2 CULTIVATED land Protection system PDM(2)
4.3 Generating SQL Library Building Scripts
Open the PDM model and select the DatabaseaGenerate Database command to generate SQL scripts or directly connect to the DBMS Database.
5. Object-oriented model generation
Object-oriented Model (OOM) is a UML description model closely related to programming language. It can generate class diagram by CDM and PDM, and generate target language code by OOM.
This part is beyond the scope of database design discussion and will not be covered here.
6 summary
6.1 Relationship between CDM, PDM, and Class diagram concepts
Table 6-1 Corresponding relationship between CDM, PDM and class diagram concepts
CDM object in the * * * * | Product data management (PDM) object in the * * * * | Objects in a class diagram |
---|---|---|
Entity | Table (Table) | class |
Relationship | Foreign key (Reference) | Class type fields (variables) |
Attributes | Column (Column) | Field (variable) |
Primary Identifier | Primary Key | Field (variable) |
Association | Foreign keys, associated tables | Class type fields (variables) |
(C) C) C) c) | — | inheritance |
Methods, interfaces and polymorphisms describe the behavior of objects and are the rules of system operation. Property describes the state of an object at a point in time. It is a static snapshot. Since the memory is limited and the system cannot stop forever, the state value of the object is stored in the database and then mapped to the object when the system needs to run.
6.2 Generating Database Design Documents
PowerDesigner provides the ability to generate database design documents based on CDM, PDM and other models, reducing the amount of document writing. The generated documents mainly include model diagrams, all list items, and so on. Select the ReportaGenerate Report command to generate documents from the wizard or an existing template.
6.3 Compliance with The Paradigm
Does the database design need to adhere strictly to the paradigm, which paradigm does it need to achieve? This question has been bothering me. In the end, THE answer I got was yes, personally I think the third normal form can be used as a general reference standard for database design.
7 References
Having seen so much, let’s have a rest. Zl: zL: zL: ZL: ZL: ZL: ZL: ZL: ZL
[1] Chen Ping, Chu Hua, Software Designer Tutorial, 2nd edition, Tsinghua University Press
[2] Zhao Shaoping, Xu Maosheng, et al., PowerDesigner System Analysis and Modeling, 2nd edition, Tsinghua University Press
[3] The age attribute in the figure is framed with a dotted line to highlight that it is a non-essential attribute. More on this later.
[4] A special entity that identifies the relationship between entities caused by an event in the system. Such as home visits for teachers and students, rental for customers, shops and films.
[5] Object variables, such as Object and user-defined class. The connection between the two classes is defined in Hibernate by the corresponding HBM.xml.
[6] Contains value types (int, float, char, enum, struct, etc.) and reference types (string).