Hello everyone, I’m Yunqi!
Today we share an article about data governance. What exactly is “data governance”?
In short, data governance is the whole process of building and managing an enterprise's data architecture, data standards, data quality, data security, and related areas. In a narrower sense, it is the planning, monitoring, and control of how data is managed and used.
Data quality is a frequent topic in large companies. But as data workers, we also practice “data governance” in small things, even when it comes to journaling.
This is the obsession with clean data that a data worker in pursuit of perfection should have.
01 Data Governance: The Key to Releasing Data Value
According to a recent survey by Dell EMC, most companies around the world now recognize the value of data, and the average amount of data they manage grew from 1.45 petabytes in 2016 to 9.70 petabytes in 2018. Globally, 92% of respondents see the potential value of data, and 36% are already turning data into economic benefits.
As the value of data becomes increasingly prominent, more and more enterprises are starting a digital strategic transformation. Some move from a data platform to a data middle platform, while others build a data middle platform directly.
As the saying goes, nothing can be accomplished without norms or standards. For historical reasons, enterprises have accumulated numerous systems as they developed, and the data collected into the data platform from each system has its own characteristics. Data that lacks standards, norms, and governance loses its value. To standardize data processing and highlight the business value of data, the data in the data platform must be governed comprehensively. This means building a standardized, process-based, automated, and integrated data governance system that ensures reasonable planning of the data architecture, clear data processing flows, controllable data processing, and transferable data knowledge.
Effective data governance can unlock the value of data assets by ensuring that enterprise data is consistently trusted across the board.
02 Data Governance Problems
Only when data is standardized, normalized, reliable, and available can we go further and help enterprises manage data as an asset, discover internal data problems, and explore the value of data through data operations and data applications, so that enterprise data assets are revitalized and used effectively. Data governance should manage the most valuable data in the simplest way, but in practice we have run into several common situations when implementing it:
1) After-the-fact governance, inconsistent: For historical reasons, many enterprises build first and govern later. They rely on manual reporting, reverse parsing of code and scripts, metadata probes, and similar means to detect lineage and manage data quality, and only then discover problems. As a result, what is produced and what is managed drift apart.
2) Passive governance, inefficient: A quality platform is built only when quality problems are found, and a metadata management platform only when a data dictionary is needed. This splits what should be one complete governance system across multiple systems and platforms, making system integration difficult and governance ineffective.
3) Misdirected governance, difficult to focus: As more and more scripts and tasks are built in the data middle platform, managing data turns into managing programs. The essence of data governance is to manage data, but it becomes the management of programs, scripts, and tasks, so governance loses its focus.
4) Project-based governance, difficult to sustain: The ultimate goal of data governance is to increase the value of data. It is a continuous, long-term operation that has to be improved gradually and iterated step by step, and it is unrealistic to expect to finish it in one go. In practice, however, governance is often run as a one-off project whose target is delivery. The result is governance that is incomplete and has no continuation, so its effect is bound to be inadequate.
5) Part-time governance, difficult to land: Every industry, enterprise, and organization differs in its structure, data applications, and infrastructure, so each needs to use a methodology to find the governance approach that suits it, backed by a dedicated specialist or professional team. In practice, however, enterprises often assign employees to govern data part-time. Responsibilities are unclear, initiative is weak, and governance work is hard to implement.
03 Key Elements of Data Governance
In the era of the traditional data platform, the main goal of data governance was control: to establish a governed working environment, including standards and quality, for the data department. In the data middle platform era, user demand for data keeps growing and the user base extends to the whole enterprise. Data governance can no longer serve only the data department. It has to provide a working environment for the whole enterprise, be centered on the enterprise's users, and take the perspective of providing services to them. While managing data, it gives users self-service access to big data capabilities and helps the enterprise complete its digital transformation.
By analyzing some problems in the actual implementation of data governance, we have summarized several key elements of data governance:
1) Data governance needs to be built as a system: To bring the value of the data middle platform into full play, three elements are needed: a reasonable platform architecture, complete governance services, and systematic means of operation.
Choose the platform architecture that fits the enterprise's scale, industry, data volume, and so on. Governance services need to run through the whole data life cycle to ensure the integrity, accuracy, consistency, and validity of data throughout collection, processing, sharing, storage, and application. The means of operation should include optimization of norms, organization, platform, and processes.
2) Data governance needs a solid foundation: Data governance has to proceed step by step, but at least three aspects deserve attention in the early stage of building the data middle platform: data specifications, data quality, and data security.
Standardized model management is the prerequisite for ensuring that data can be governed, high-quality data is the prerequisite for data availability, and data security control is the prerequisite for data sharing and exchange.
3) Data governance needs IT enablement: Data governance is not a pile of normative documents. The norms, processes, and standards produced during governance have to be implemented on the IT platform. Governance should be carried out proactively, during data production, to avoid the extra operation and maintenance costs caused by after-the-fact auditing.
4) Data governance needs to focus on data: The essence of data governance is to manage data, so metadata management must be strengthened and the attributes and information related to data, such as metadata, quality, security, business logic, and lineage, must be filled in. Data production should be managed in a metadata-driven way.
5) Data governance needs construction and management to be integrated: Keeping the data models, lineage, and task scheduling in the data middle platform consistent is the key to integrating construction and management. It helps resolve inconsistent definitions between data management and data production and avoids the inefficient situation in which the two run as separate, disconnected layers.
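As a rough illustration of metadata-driven production, the sketch below derives a task order directly from lineage (the "blood relationship" between tables), so that scheduling cannot drift away from the recorded metadata. The table names and the `LINEAGE` dictionary are hypothetical and exist only for this example; a real platform would read lineage from its metadata store. It relies on `graphlib` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical lineage metadata: each table maps to the upstream
# tables it is built from. These names are illustrative only.
LINEAGE = {
    "ods_orders": set(),
    "ods_users": set(),
    "dwd_order_detail": {"ods_orders", "ods_users"},
    "ads_daily_sales": {"dwd_order_detail"},
}


def schedule_from_lineage(lineage):
    """Return a build order in which every table follows its upstream tables."""
    return list(TopologicalSorter(lineage).static_order())


def downstream_of(lineage, table):
    """Return every table that depends on `table`, directly or indirectly."""
    affected = set()
    frontier = [table]
    while frontier:
        current = frontier.pop()
        for name, upstreams in lineage.items():
            if current in upstreams and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


order = schedule_from_lineage(LINEAGE)
impacted = downstream_of(LINEAGE, "ods_orders")
print(order)
print(impacted)
```

Because the schedule and the impact analysis are both computed from the same lineage record, changing the lineage changes both at once, which is one possible way to keep management and production from diverging.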
04 How to Implement Data Governance?
At the system level, data governance includes six core modules: data standards, metadata, data quality, life cycle management, data security, and data assets. At the management level, it needs the support of a data governance organization and data governance processes. Data governance is a long-term, complex, systematic undertaking that requires a set of process specifications, rules, IT capabilities, and continuous operation mechanisms to keep the work moving forward. Our proposal for putting data governance into practice is divided into four stages:
1) Organization building: Break down the internal barriers of the enterprise, build a data governance organization in which multiple departments participate, and raise the importance attached to data governance. Set up a dedicated data governance structure with progressive layers, including a data governance committee, a data governance team, and the business departments. Support the continuous operation of data governance in terms of performance assessment, staffing, resources, and so on, so that the data strategy is carried through in the enterprise's data middle platform.
2) Establish specifications: Establish practical, standardized process specifications, keep improving them as the data middle platform operates, and iterate step by step. This includes publishing data governance management norms and process norms, establishing a standardized closed-loop governance process, and clarifying the requirements for online management, so that closed-loop operation, online processes, and centralized services turn data governance into a routine mechanism.
3) Platform selection: Build an effective IT platform that supports the implementation of data governance norms, processes, and standards and ensures a proactive governance mode. Data governance is essentially management work, and it is effective only if the production process is visible and can be intervened in. The platform should therefore integrate data governance with data production. It should provide collaborative multi-user development, data standardization management, metamodel-driven development and management, metadata management and lineage management, lineage-driven task scheduling, classified and tiered security management, and data quality management, so that data governance can truly land.
4) Emphasis on operation: Data governance is a continuous, long-term operation. Standards, organizations, platforms, and processes need to be optimized continuously and iteratively, and data quality and data security need to be controlled continuously.
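To make "continuous data quality control" a little more concrete, here is a minimal sketch of rule-based quality checks that could be run on every batch. The `QualityRule` class, the three rules, and the sample "orders" records are all assumptions made up for illustration. They do not come from any particular governance platform.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QualityRule:
    name: str
    check: Callable[[dict], bool]  # returns True when a record passes


# Hypothetical rules for an illustrative "orders" dataset.
RULES = [
    QualityRule("order_id is present", lambda r: bool(r.get("order_id"))),
    QualityRule(
        "amount is non-negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
    QualityRule(
        "status is a standard code",
        lambda r: r.get("status") in {"NEW", "PAID", "CANCELLED"},
    ),
]


def run_quality_checks(records, rules):
    """Return a mapping of rule name to pass rate over the given records."""
    total = len(records)
    report = {}
    for rule in rules:
        passed = sum(1 for record in records if rule.check(record))
        # An empty batch is treated as passing, since nothing violated the rule.
        report[rule.name] = passed / total if total else 1.0
    return report


sample = [
    {"order_id": "A1", "amount": 10.0, "status": "PAID"},
    {"order_id": "", "amount": 5.0, "status": "NEW"},
    {"order_id": "A3", "amount": -2.0, "status": "UNKNOWN"},
    {"order_id": "A4", "amount": 7.5, "status": "CANCELLED"},
]
report = run_quality_checks(sample, RULES)
for rule_name, pass_rate in report.items():
    print(f"{rule_name}: {pass_rate:.0%}")
```

Tracking the pass rate of each rule over time, rather than checking once at delivery, is what turns a one-off audit into ongoing operation.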
05 Conclusion
Data governance is strategic, long-term, arduous, and systematic work of continuously optimizing and managing an enterprise's internal data. It is therefore bound to be a long, continuous process. There is no single trick that solves everything and no immediate payoff. Only by persisting, staying true to its original purpose, and making unremitting efforts can an enterprise reach its goals.
I am “Qi Yunqi”, a big data developer who loves technology and can write poems. You are welcome to follow me!