About the Author
Liang Mingtu, chief architect of New Actions Network, has more than 10 years of experience in database operation and maintenance, data analysis, database design, and system planning and construction, and has conducted in-depth research in data architecture management and data asset management.
As enterprise IT informatization deepens, enterprises are becoming increasingly dependent on IT systems. Faced with an ever-growing variety of IT systems, IT personnel at all levels have a love-hate relationship with them. What they love is that these systems act as boosters for the business, improving the efficiency of both business and management. What they hate is that, as enterprises rely more and more on IT systems, IT operations is pushed to the forefront: how to keep IT systems efficient, stable, and continuously available, even 7×24, has become an urgent problem for IT personnel at every level.
IT operation and maintenance (O&M) refers to the comprehensive management of the IT hardware and software operating environment, IT business systems, and IT operations personnel using appropriate methods, means, technologies, and systems. With the development of technology, IT operations has undergone earth-shaking changes in recent years. The following summarizes how IT operations has developed in recent years and the overall trend it will follow in the future.
I. IT Technology Architecture: From “IOE Architecture” to “Internet Architecture”
Why start with the technical architecture? Political economy sums it up as "the economic base determines the superstructure," and I think the same applies to the IT industry. The evolution of the underlying technology architecture inevitably drives changes in other areas, including the IT operations we are discussing here.
Once upon a time, commercial minicomputers represented by IBM, commercial databases represented by Oracle, and high-end storage represented by EMC were the standard configuration of enterprise IT systems. More than 10 years ago, I visited the machine room of a provincial carrier: it was filled with black IBM minicomputers, and every database, regardless of size or purpose, was Oracle Enterprise Edition.
In retrospect, why did enterprises at that time prefer this IOE architecture? At the time there was nothing wrong with the choice. Even Alibaba, which later shouted "de-IOE" the loudest, actually started with IOE as its initial technical architecture. Given that distributed technology was not yet mature, IOE, a set of mature foreign commercial hardware and software products, did deliver single-machine stability and high performance unmatched by other products of the same period.
I once saw an old minicomputer about to be decommissioned at a customer site. Checking the uptime before shutting it down, I was surprised to find that its last startup was more than 3,000 days earlier; in other words, this minicomputer had served for nearly ten years without a failure or a shutdown. Many organizations pay a lot of money for exactly this stability and performance, because "stability above all else" is the fundamental need of IT operations.
In addition, from a technical standpoint, when IT operations was still dominated by people, a uniform technology stack was also conducive to forming and cultivating development and operations teams. For example, one or two Oracle gurus plus a few mid-level DBAs could take care of all database-related issues, which was obviously a very cost-effective arrangement.
However, as technology developed, the traditional centralized architecture offered by "IOE", built on high-end, scale-up commercial products, reached a bottleneck. In particular, the continuous, in-depth exploration of technology architecture by Internet companies brought a new model of technological change to the IT industry. The vigorous technological revolution initiated by Internet companies stems from the following factors:
- Cost: cost has to be considered, since you can get a truckload of x86 servers for the price of one minicomputer.
- Flexibility: because business in the Internet industry changes constantly, the technology architecture must adapt quickly and on demand, which is obviously difficult to achieve with a centralized IOE-style architecture.
- Scalability: the vertical scaling that characterizes centralized architecture began to limit the business growth of Internet companies. Their rapid business development requires a more flexible, easily expandable, horizontally scalable cloud technology architecture.
- Technology control: the Internet industry gathers the technical elite, who need a more open technology environment in which to make all kinds of extreme customizations for their different business scenarios. What they clearly need is a small sports car that can be tuned for them rather than a staid minivan, and that is something closed-source commercial hardware and software cannot provide.
With the development of technology, this cloud-based, distributed, open-source technology architecture began to enter the sight of traditional enterprises. In September 2014, the China Banking Regulatory Commission (CBRC) issued Document No. 39, the Guiding Opinions on Strengthening Network Security and Informatization Construction in the Banking Industry by Applying Secure and Controllable Information Technology. In the years that followed, traditional enterprises started to move away from IOE and learn from the Internet architecture.
The Internet architecture is not a mystery. It can be summarized as follows:
- x86 and open source: large numbers of domestic x86 servers replace expensive imported minicomputers and storage, and open-source software replaces closed-source commercial software, saving the cost of bulk procurement, licenses, and vendor maintenance. In a word, "for the price of an elephant, buy a herd of cows."
- Distributed: the architecture supports distributed computing, replacing the capability of the single minicomputer in a centralized architecture with the combined performance of many machines. To continue the metaphor, "instead of one elephant hauling the wood, dozens of cows haul it together."
- System reliability: necessary redundancy is added to the architecture so that, even if a single device is unreliable, overall system reliability replaces the reliability of any individual device. In the same metaphor, "if one of the cows hauling wood falls ill, a new cow takes its place immediately and the hauling is not delayed."
- High scalability: the architecture is designed to support continuously adding resources to achieve greater capacity, higher concurrency, and more users. "When hauling wood turns into hauling stone, all you have to do is add more cows."
Therefore, under the impact of emerging technologies such as the Internet architecture, cloud computing, and big data, enterprise IT architecture is gradually shifting from a single IOE architecture to a mix of technologies such as x86, cloud architecture, and open-source solutions (see Figure 1-1). This innovation in technical architecture inevitably brings innovation to the other key elements of operations and pushes the O&M industry forward.
Figure 1-1 Moving from IOE architecture to Internet Architecture
II. Operation and Maintenance System: From ITIL to DevOps
The continuous innovation of enterprise technical architecture is pushing the IT operations management model from a steady state to an agile state.
As enterprise informatization deepens, there are more and more IT systems, and the number of IT operations personnel grows with them. Many enterprise information departments set up dedicated teams for IT system operations, and the activities of operations staff within the IT team inevitably need to be managed. ITIL provides an objective, rigorous, and quantifiable set of best-practice standards and norms for enterprise IT service management. In my view, it is precisely these standards and norms that pointed the way for the construction of operations systems in many Chinese enterprises for a long time.
ITIL emphasizes process: every ITSM system built around ITIL turns operations into processes. Incident management, problem management, change management, configuration management: everyone follows the process, putting an end to snap decisions and blind operations.
ITIL emphasizes norms: operations personnel perform standardized operations according to organizational processes, and the constraints themselves ensure that no one's behavior deviates from the intended direction and that fewer mistakes are made.
ITIL emphasizes division of labor: operations personnel are divided effectively by skill. Some handle first-line response at the service desk, some handle second-line incidents and problems, some are responsible for configuration management, some for change approval, and so on. Within the operations team, each performs their own duties and cooperates through the division of labor.
This management mechanism was very appropriate in the era of the IOE architecture. A centralized architecture is relatively simple and obviously calls for steadier operations; after all, all the eggs are in those few baskets. Moreover, under a centralized architecture business changes are not that frequent, and although going through a process for every change is a bit troublesome, the low frequency makes it acceptable.
However, as enterprise IT architecture gradually moves to the Internet architecture, the business develops rapidly and demands that IT change on demand and respond to business needs more agilely, and the ITIL system starts to feel somewhat out of step with reality. At this point the term DevOps comes into view (see Figure 1-2).
Figure 1-2 Moving the O&M system from ITIL to DevOps
DevOps (a combination of Development and Operations) is a set of processes, methods, and systems used to facilitate communication, collaboration, and integration between development (application/software engineering), technical operations, and quality assurance (QA). It has emerged as the software industry increasingly recognizes that development and operations must work closely together to deliver software products and services on time.
The thinking behind DevOps naturally differs from ITIL's in several respects.
Process compression, quick response, greatly improved efficiency:
ITIL emphasizes process, but process also reduces efficiency. In the IOE era, enterprise business changes were not that frequent and the loss of efficiency was not obvious, but under the Internet architecture this negative effect is amplified enormously.
For example, when a carrier releases a new system version, it usually goes through source-code submission, compilation, packaging, release to the test environment, UAT testing, bug fixing, retesting, and finally release, which typically takes three to four days. As a result, the carrier can only release monthly, or weekly at the earliest. Compared with the Internet industry, where business cycles are measured in days, such a system responds slowly to business change.
Therefore, the DevOps system places more emphasis on efficiency. Supported by automation tools such as continuous integration, continuous automated testing, continuous deployment platforms, multi-dimensional monitoring, and technical architecture optimization, the release and operations process is greatly compressed and efficiency greatly improved; release cadence can be measured in days or even hours. Selectively abandoning somewhat sluggish process management for the sake of efficiency is the better choice for IT operations management that must adapt to on-demand IT and respond to business needs more agilely. The sketch below illustrates the pipeline idea in miniature.
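To make the contrast concrete, here is a minimal sketch, assuming a project whose build, test, package, and deploy steps are already scripted as `make` targets (an illustrative assumption, not a description of any toolchain mentioned above), of a release expressed as an ordered list of automated stages rather than a multi-day manual process.

```python
# Minimal pipeline sketch: run each stage in order and stop the release on
# the first failure. Stage names and `make` targets are illustrative
# placeholders for a real project's build, test, and deploy commands.
import subprocess
import sys

PIPELINE = [
    ("build",   ["make", "build"]),
    ("test",    ["make", "test"]),
    ("package", ["make", "package"]),
    ("deploy",  ["make", "deploy"]),
]

def run_pipeline() -> None:
    for name, cmd in PIPELINE:
        print(f"--- stage: {name} ---")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Fail fast: a broken build or failed test never reaches production.
            sys.exit(f"stage '{name}' failed, release aborted")
    print("release completed")

if __name__ == "__main__":
    run_pipeline()
```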
Automation instead of standardization under lengthy process control:
On the other hand, ITIL emphasizes standardization, but this standardization based on process still has many defects.
Continuing the carrier example above: no one can guarantee that a version launch will be problem-free, even with a better process to control and standardize it. Before and after every release, members of the operations team are still on edge, as if facing a formidable enemy.
The reason is that once the technical architecture reaches a certain level of complexity, processes often become useless or a mere formality. When maintaining hardware and software at large scale and of many types, an operations system that relies solely on people eventually becomes the bottleneck of the entire IT operation. In this context, many enterprises try to distill standardized operations into various automated scenarios, such as the tools and platforms for continuous integration, continuous automated testing, continuous deployment, automated monitoring, and operations mentioned above. This efficient, standardized automation relieves the pressure on operations personnel, so their energy can go into genuinely meaningful work instead of repeating mechanical, routine tasks.
Google's SRE engineers, for example, are expected to spend only 30% of their time on transactional work such as on-call duty, with 70% going into developing automation tools such as automated release systems, monitoring systems, logging systems, and server resource allocation and orchestration, all of which they develop and maintain themselves. Replacing standardization under lengthy process control with efficient, tool-based automation is another distinctive feature of the DevOps system.
Integration of development and operations:
At the same time, the division of labor under ITIL also brings its share of problems. For example, the operations team suffers from a poor sense of achievement and identity. Senior executives see operations as a cost center with no highlights and no value, and operations teams tend to see themselves as the "bad guys." Years ago, on a project, I heard a core member of a client's operations team complain: "If you don't work hard when you're young, you'll end up doing operations when you're old."
This is probably the sentiment of most operations people. Admittedly, factors such as the difficulty of quantifying operations results and insufficient attention from senior management play a part, but the overly rigid wall between development and operations is also one of the important reasons.
The gap between the development team and the operations team means that during planning, design, and development the developers pay too much attention to implementing functionality and, to some extent, ignore the stability, performance, availability, and other factors the operations team cares about.
At the same time, the operations team has no channel through which to feed back and fix these problems in the early stages of development. As a result, operations is reduced to the "firefighter" and the "scapegoat", creating a vicious cycle of falling morale, talent loss, and declining operations quality.
So the DevOps approach emphasizes the integration of development and operations.
Integrating development and operations makes information transparent between the two sides, so problems encountered in operations can be fed back to the development team more effectively. At the same time, responsibility for operations shifts from the operations team alone to development and operations together. The development team thus also answers for failures in production and must invest part of its energy and resources during development in operations concerns such as stability, performance, and availability.
Of course, this is not to say that ITIL is completely obsolete; rather, the two need to be combined according to the characteristics of each enterprise's development and operations to form a more effective system suited to the enterprise itself. Only what suits you is best.
III. Operation and Maintenance Platform: From ITOM to AIOps
As the saying goes, "to do a good job, one must first sharpen one's tools." Operations tools are effective helpers for carrying out all kinds of operations work; they free operations personnel and enable them to maintain IT systems more and better. Naturally, the development of the operations system is inseparable from the development of operations tools.
More than 20 years ago, when enterprise IT informatization was just starting, IT operations was still in its slash-and-burn era: there were no so-called operations tools and no awareness that they were even necessary. A few junior operators regularly typed commands at a terminal and meticulously recorded readings on paper forms, which was standard operating practice at the time. The reason was that the IT estate to be maintained was so small that people alone could keep an eye on it.
In the era when the IOE architecture dominated, manual maintenance was still the norm for most operations teams. Of course, some began distilling common operations into large numbers of scripts so that they could be "lazy" about mechanical, repetitive tasks, but manual operations still accounted for most of the workload at this stage.
In the late IOE era, as the Internet architecture became popular and enterprise IT informatization deepened, the number of IT devices in enterprises grew explosively, and manpower alone gradually became unable to manage it.
In the case of one carrier client I worked with, the business support department initially maintained its core systems with only about 20 hosts and a handful of databases. In the following years, however, the scale of the systems to be maintained grew more than tenfold, while the operations team grew by less than a factor of two. In fact, the gap between maintenance scale and team capacity only keeps widening, and it has become the core contradiction in operations management.
Later, when enterprises began to adopt the Internet architecture, system complexity increased sharply and the number of maintenance targets grew rapidly; traditional manual or semi-automatic maintenance simply could not keep up. To solve this problem, enterprises tried to introduce various operations tools, using automation to make up for the shortage of operations manpower and capability, and IT operations management came into being.
IT Operations Management (ITOM) monitors and manages the operation of IT infrastructure and software applications in real time and provides feedback to keep the monitored objects running in their optimal state. Tools in the ITOM space fall into three main categories:
- Monitoring: tools that provide automatic monitoring and alerting, such as application performance monitoring, basic software service monitoring, host and storage device monitoring, and network device monitoring. Examples include Tivoli among commercial products and Zabbix among open-source products (a minimal sketch of this category follows the list).
- Management: software that provides IT operations support services and configuration management, such as ITSM systems and CMDB systems, for example HP OpenView.
- Automation: tools and software that provide automated operations capabilities, such as Ansible and Puppet.
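As a rough illustration of the monitoring category above, here is a minimal sketch that samples one host metric and raises an alert when a threshold is crossed. It assumes the third-party psutil package; the threshold and the alert channel are illustrative placeholders, not part of Tivoli, Zabbix, or any other product named here.

```python
# Minimal monitoring sketch: sample CPU usage and alert above a threshold.
import time
import psutil  # third-party package, assumed to be installed

CPU_ALERT_THRESHOLD = 90.0  # percent, illustrative value

def send_alert(message: str) -> None:
    # Placeholder: a real platform would push to SMS, IM, or an ITSM ticket.
    print(f"[ALERT] {message}")

def monitor_loop(interval_seconds: int = 60) -> None:
    while True:
        cpu = psutil.cpu_percent(interval=1)  # sample CPU usage over 1 second
        if cpu > CPU_ALERT_THRESHOLD:
            send_alert(f"CPU usage {cpu:.1f}% exceeds {CPU_ALERT_THRESHOLD}%")
        time.sleep(interval_seconds)

if __name__ == "__main__":
    monitor_loop()
```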
With ITOM, operations is transformed from manual work plus passive response into a more efficient, automated system.
For the carrier customer mentioned above, the growth of operations manpower could not keep pace with the growth of the IT estate, so it became difficult to carry out a routine status inspection of all IT equipment before business started each morning.
To resolve this contradiction, we deployed and implemented our automated monitoring and operations platform, transferring a large number of routine operations to machines. Take the daily inspection: once the relevant inspection template is defined, the machine carries out the inspection operations according to the defined specification every day, year after year.
If any anomaly appears in the inspection results, an alarm message is pushed to the operations engineer's mobile phone, notifying them to handle the fault. The essence of this automated operations tooling is to let machines manage machines: a large amount of repetitive, mechanical operations work is handed to machines, which effectively reduces the human resources required and frees operations personnel to focus on more important areas.
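The inspection template described above might look something like the following minimal sketch, assuming each check can be expressed as a shell command plus a pass condition; the commands and thresholds are illustrative (the disk check relies on GNU df), not the actual templates of the platform discussed here.

```python
# Minimal template-driven inspection sketch: run every check defined in the
# template and report the ones that fail.
import subprocess
from datetime import datetime

# "Inspection template": check name -> (command, pass-condition over its output)
INSPECTION_TEMPLATE = {
    "disk_usage_root": (
        ["df", "--output=pcent", "/"],              # GNU df assumed
        lambda out: int(out.splitlines()[-1].strip().rstrip("%")) < 85,
    ),
    "load_average": (
        ["cat", "/proc/loadavg"],
        lambda out: float(out.split()[0]) < 8.0,
    ),
}

def run_inspection() -> list:
    """Run every check in the template and return the names that failed."""
    failures = []
    for name, (cmd, passes) in INSPECTION_TEMPLATE.items():
        output = subprocess.run(cmd, capture_output=True, text=True).stdout
        if not passes(output):
            failures.append(name)
    return failures

if __name__ == "__main__":
    failed = run_inspection()
    stamp = datetime.now().isoformat(timespec="seconds")
    if failed:
        # A real platform would push this alert to the on-call engineer's phone.
        print(f"{stamp} inspection FAILED: {', '.join(failed)}")
    else:
        print(f"{stamp} inspection passed")
```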
Recently I spoke again with the head of that operations team and learned that roughly 80% of their operations are now performed automatically by machines. He laughed and said, "Apart from handling sudden system failures, the most common task of our operations team is actually creating accounts and assigning application permissions for people across the enterprise, and we are now writing code to automate that as well."
The ITOM system brings automation to operations and makes IT operations more efficient. However, ITOM still fails to break operations' dependence on the experience of individual operators, and it generally lacks analytical capability: although some operations data can be collected, incomplete collection, poor integration, and the lack of means to connect and analyze the data mean that operations personnel cannot see the information hidden in it, let alone distill it into knowledge.
For example, when handling and analyzing faults, operators still rely on their experience and even intuition, and head-scratching operations decisions still emerge endlessly.
Therefore, operations teams began to explore ITOA, analytics based on operations data. The maturing of big data technology makes it possible to analyze massive amounts of operations data. Drawing on examples from the field of business analysis, we began to build a complete operations data analysis system covering data collection, processing, analysis, and visual display. Our IT systems generate massive amounts of data all the time, possibly even more than our application systems do, so operations analysis is naturally a big data application scenario.
Realizing ITOA based on operations data
The first problem to solve is data collection:
The data in an operations environment is diverse: there is structured data collected directly by monitoring systems, as well as unstructured data such as application logs and machine logs.
To facilitate subsequent analysis, we need to convert the unstructured data, which is difficult to analyze directly, into structured data for storage. For example, Figure 1-3 shows one line of an Apache web log, which contains a lot of useful information: the client's IP address, the client software used, the page accessed, the access time, and so on.
Figure 1-3 One row in the Apache Web log
We use suitable tools to split this information into structured fields, which are continuously stored in the operations big data center, as shown in Figure 1-4:
Figure 1-4 Structured information
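As an illustration of this kind of splitting, here is a minimal sketch that parses one Apache combined-format access-log line into structured fields with a regular expression; the sample line and field names are illustrative, not the actual log shown in Figure 1-3.

```python
# Minimal log-structuring sketch: turn one Apache "combined" log line into a
# dictionary of named fields ready to be stored and queried.
import re

LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

sample = ('203.0.113.10 - - [10/Oct/2019:13:55:36 +0800] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"http://example.com/start" "Mozilla/5.0"')

match = LOG_PATTERN.match(sample)
if match:
    record = match.groupdict()   # structured record: ip, time, path, status, ...
    print(record["client_ip"], record["path"], record["status"])
```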
The development of big data technology also gives us a foundation for storing massive operations data:
We can build our operations big data center on a big data platform, where the operations data collected from the entire IT environment is stored and integrated. This remedies the ITOM-era defect of scattered data that is hard to correlate, because data needs connections and associations to give full play to the value behind it.
For example, an isolated event in an ITSM system may reveal little on its own, but from the perspective of operations data analysis it can be compared with a series of similar historical events to find how various indicators changed at nearby points in time. Through layers of filtering and analysis, operations personnel can finally uncover the patterns behind the operations data and distill them into a knowledge base and related optimization actions. This is a good example of letting the data speak and replacing decision making based on experience with decision making based on data analysis.
Data retrieval and data visualization capabilities must be guaranteed:
Of course, beyond simply providing a vehicle for big data storage and analysis, operations data analysis also requires certain capabilities that let operations personnel make better use of the data:
The platform needs strong data retrieval capability. An operations data analysis platform stores massive amounts of data, and when operations personnel try to build and verify an exploratory scenario they repeatedly retrieve and query specific data. If queries on the platform are very slow or the query dimensions are limited, building the scenario takes a long time or becomes impossible. The platform should therefore offer keyword search, statistical functions, single- and multi-condition queries, fuzzy multidimensional search, and second-level queries over massive data, helping operations personnel analyze the data more effectively and conveniently. A query sketch follows.
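For example, assuming the collected logs have been indexed into an Elasticsearch-compatible store (the endpoint, index name, and field names below are illustrative assumptions, not a description of any particular platform), a multi-condition keyword query over the last hour of data might look like this minimal sketch.

```python
# Minimal retrieval sketch: combine a keyword condition, an exact condition,
# and a time window in one query against an Elasticsearch-style search API.
import requests  # third-party package, assumed to be installed

query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"message": "timeout"}},            # keyword condition
                {"term": {"level": "ERROR"}},                  # exact condition
                {"range": {"@timestamp": {"gte": "now-1h"}}},  # time window
            ]
        }
    },
    "size": 20,
}

resp = requests.post(
    "http://localhost:9200/ops-logs-*/_search",  # illustrative endpoint and index
    json=query,
    timeout=10,
)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"].get("@timestamp"), hit["_source"].get("message"))
```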
The platform needs strong data visualization capability. As the saying goes, "a picture is worth a thousand words." Operations personnel often use operations data to perform statistical analysis of each system and produce various real-time reports, carrying out multidimensional, multi-angle analysis and visual display of operations data such as application logs, transaction logs, and system logs, and then presenting their analysis results and experience to others. The ability to pivot data and produce regular reports on the platform is therefore important.
Applicable to various business scenarios:
In addition, operations data analysis is not used only in operations; in our experience it also applies to business scenarios such as risk analysis, auditing, and sentiment analysis. By collecting the operations data in the current environment, integrating existing ITOM tools, and applying big data and data analysis technology, IT teams can quickly locate, troubleshoot, and predict problems in every link of the IT system, analyze the data from the distributed systems along the business chain as a whole, optimize IT services, surface key business KPIs, and feed the results back to the business side to help it make wise decisions.
iResearch predicts that the ITOM/ITOA market will reach 11.45 billion yuan by 2020 (see Figure 1-5), but that growth is gradually slowing, and AIOps is the continuation of ITOM and ITOA.
Figure 1-5 iResearch predicts that China's ITOM/ITOA market will reach 11.45 billion yuan in 2020
AIOps analyzes logs and operations data with big data and artificial intelligence technology to uncover potential system security and operations problems that operations personnel are not yet aware of.
Gartner first proposed the concept of Algorithmic IT Operations in a 2016 report. With the rapid rise of artificial intelligence, Gartner expanded the concept of AIOps from data analysis to artificial intelligence, expecting it to provide proactive, human-centered, and dynamic visualization capabilities through big data, modern machine learning, and more advanced analytics, and to directly or indirectly enhance traditional IT operations capabilities (monitoring, automation, help desk).
AIOps has been applied in practice for only a short time. In current applications, it mainly applies machine learning algorithms to analyze and mine centrally collected operations data. The main application scenarios include:
- Anomaly alerting: based on historical monitoring indicator data, time-series algorithms are used to analyze the indicators and issue accurate alerts for anomalous ones (a minimal sketch follows this list).
- Alert convergence: based on historical events and alerts, the relationships between them are discovered; frequently co-occurring events and alerts are consolidated and recognized as belonging to the same fault, so that multiple alerts and indicators are merged before being pushed to operations personnel. This yields fine-grained alerting and avoids the alert storms, the noise produced by a single failure, that are typical of traditional monitoring tools.
- Fault analysis: based on operations data, events, and alerts, combined with the knowledge base and models built from past problem diagnosis, a fault tree is constructed; combined with decision trees and related algorithms, path derivation lets operations personnel locate problems faster and more intuitively, making them easier to solve.
- Trend prediction: fitting of historical data and similar algorithms are used to predict resource trends and capacity. For example, insufficient host CPU, swap, memory, or storage will gradually lead to system or application failure; the system builds an association model to warn of such failures and notifies operations personnel to fix them before services are affected (a minimal sketch follows this list).
- Fault portrait: multidimensional operations data is collected to build a multi-structured underlying operations data model that works with various operations scenarios; faults are depicted within those scenarios, and standardized fault portraits assist enterprises in IT operations decisions and handling processes.
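As a rough illustration of the anomaly-alerting and trend-prediction scenarios above, here is a minimal sketch under very simple statistical assumptions: a rolling 3-sigma rule to flag anomalous metric points and a straight-line fit to estimate when capacity will be exhausted. Real AIOps platforms use far richer models; the data here is synthetic and the thresholds are illustrative.

```python
# Minimal AIOps-style sketch: 3-sigma anomaly detection plus linear capacity
# forecasting on synthetic metric data.
import numpy as np

def detect_anomalies(values, window=30, k=3.0):
    """Flag points deviating more than k standard deviations from the rolling
    mean of the previous `window` samples."""
    anomalies = []
    for i in range(window, len(values)):
        hist = values[i - window:i]
        mean, std = hist.mean(), hist.std()
        if std > 0 and abs(values[i] - mean) > k * std:
            anomalies.append(i)
    return anomalies

def days_until_full(daily_usage_pct, limit=90.0):
    """Fit a straight line to historical usage and estimate how many days
    remain before the limit is crossed (None if usage is not growing)."""
    x = np.arange(len(daily_usage_pct))
    slope, intercept = np.polyfit(x, daily_usage_pct, 1)
    if slope <= 0:
        return None
    current = slope * (len(x) - 1) + intercept
    return (limit - current) / slope

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cpu = rng.normal(40, 3, 200)
    cpu[150] = 95                       # injected spike to be detected
    print("anomalous indices:", detect_anomalies(cpu))

    disk = np.linspace(50, 70, 180) + rng.normal(0, 1, 180)
    print("days until 90% disk usage:", round(days_until_full(disk), 1))
```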
Of course, AIOps can be applied to far more scenarios than these, and because the concept is still young there is plenty of room left to explore. Overall, the path from manual operations through ITOM and ITOA to AIOps reflects the main trend of operations development: from automation and data toward intelligence.
IV. Operation and Maintenance Core: From Platform to Data Assets
The transformation of enterprise technical architecture has triggered the transformation of the operations management model, and operations tools keep advancing with the times.
In general, IT system operations is moving toward automation and intelligence. As for operations work itself, I believe both its difficulty and its workload are decreasing; after all, most of the workload is being handed over to machines. So what is the future direction, or the way out, for us as IT operations people?
In classic enterprise architecture, different enterprise architecture frameworks take different perspectives, but their views of what enterprise architecture contains are largely the same; they basically describe it in terms of (or at least including) the following layers, from top to bottom: business architecture, application architecture, data architecture, and underlying technology architecture. Traditionally, the main objects of IT system operations are the various hardware and software platforms in the enterprise IT environment, such as hosts, storage, databases, and middleware; enterprise IT operations teams typically focus on the technology architecture layer plus a small part of the application architecture layer (see Figure 1-6).
Figure 1-6 Enterprise IT architecture model of TOGAF (The Open Group Architecture Framework)
However, as technology continues to develop, enterprise technical architecture built on innovation, cloud, open source, and high elasticity, that is, the Internet architecture, is gradually becoming mainstream. The emergence of a large number of new technologies and applications has broken up the centralized system architecture, and system architecture is increasingly cloud-based and distributed.
First, distributed and cloud architectures eliminate single points in the system. As overall system stability improves, the demand placed on the stability of any single device decreases. Against this backdrop, data architecture becomes more important, and more data architects and operations personnel are needed to participate in early work such as business architecture analysis, data architecture planning, data architecture design, and data model design.
Second, as mentioned above, operations tools and products keep improving. The emergence of centralized, automated, and intelligent operations products and tools makes intelligent, automated operations of IT systems possible, releases operations personnel from repetitive, mechanical work, reduces their workload, and allows them to take on more important work.
In addition, hardware and software products themselves keep improving, and making them "simple and foolproof" to use and maintain is the trend:
- In the Oracle 9i era, for example, after installing the database, maintainers still had to configure a long list of system parameters for memory allocation;
- Years later, Oracle 11g introduced memory_target, a single parameter that tells the database how much memory you plan to give it, and operations became much easier;
- Now, at the launch of Oracle Database 18c, Oracle has put forward the concept of the autonomous database, claiming it is the world's first, and that the corresponding cloud platform and services achieve higher performance, security, and reliability at the lowest cost while reducing operational complexity and the probability of human error. It can complete most work on its own and cut the manual workload, prompting exclamations in the industry that operations jobs are at risk. However the actual results turn out, it is an indisputable fact that hardware and software products are becoming more and more intelligent and that the difficulty of basic operations is falling.
Finally, with the wide application of information technology, especially the Internet, new models and new business forms such as online shopping, mobile payment, the sharing economy, and the smart home are flourishing. Data is growing and aggregating explosively worldwide, and every year produces more massive, richer-dimensioned data than ever before. Managing data better and using it better, in other words data asset management, is at the core of building a digital economy in which data is the key production factor.
Under the trend toward data assets, the focus of enterprise IT operations is bound to shift from simply guaranteeing stability to higher-level requirements such as managing, operating, monetizing, and adding value to data assets.
Applying data assets on the business side still faces problems
However, there are still many problems restricting the application of enterprise data assets.
- Many traditional enterprises, owing to their own characteristics, lack a highly specialized system for IT construction and management. Decentralized IT construction produces information silos and smokestack systems. The absence of standardized data standards and rules in internal IT construction results in low data quality; even basic data integration and sharing are difficult, which makes further data analysis, mining, and monetization harder still. Scattered, fragmented data creates a large number of isolated data islands, making it difficult to exploit the synergy of data, expand its scale, and increase the value of data analysis and exchange.
- In today's traditional enterprises, constrained by capital, technical capability, and staffing, IT construction is mostly outsourced. Across the whole life cycle of an IT system, from planning, design, and development to go-live and daily operation, the outsourcer dominates the technical architecture, data architecture, application architecture, and even the business architecture. This loss of control over the core of their own IT systems also causes many traditional enterprises to gradually lose dominance over and control of their data assets.
- Enterprises are weak at data monetization and lack professional capability in data application and operation, so it is difficult for them to realize the envisioned data application scenarios.
Future trends of operations personnel
Standing at the interface between IT technology and the business, operations personnel need to step up to the level of data asset management.
Data asset management is the set of business functions that plan, control, and provide data as an enterprise asset. It includes developing, implementing, and supervising the plans, policies, programs, projects, processes, and procedures related to data, in order to control, protect, deliver, and enhance the value of data assets. Without high-quality data, it is difficult for enterprises to make informed and effective decisions.
In the era of big data, data asset management matters even more than it did before. It provides enterprises with a transparent, reliable, high-quality data environment and will become a core competitive advantage, helping enterprises deliver more precise products and services, reduce costs, and control risks. We summarize enterprise data asset management as a five-star model consisting of five interrelated levels: data architecture, data governance, data operation, data sharing, and data monetization (see Figure 1-7).
Figure 1-7 New Actions Network's five-star model of data asset management
As times change, the priorities of operations personnel must change with them; that is an unchanging law. Taking data assets as the core, governance and operation as the means, and sharing and monetization as the goal is the general direction in which enterprise operations personnel will move from infrastructure operations toward data assets.
V. Summary
With the developments of recent years, the construction and operation of enterprise IT application systems has gradually shifted from business-oriented to customer-oriented. Traditional IT architecture, operations models, operations systems, and even the objects of operations have all been impacted and changed to varying degrees.
In this transformation, enterprise IT operations faces overlapping business demands, shrinking delivery cycles for application requirements, rising expectations for user experience, and the growing value of data assets. Changing on demand has become the theme of enterprise application systems today. This requires the enterprise to have a more flexible and highly scalable technical architecture, a more agile and efficient operations system, and more intelligent operations tools, so that it can respond faster to business needs from the customer side and make "meeting user demand" the shared vision at the core of the whole enterprise.
At the same time, intelligent operations tools built on digital operations, using big data, machine learning, and more advanced analytics such as artificial intelligence, provide proactive, human-centered, and dynamic visualization capabilities that directly or indirectly enhance IT operations. Operations staff are freed from more of the automatable operational work and can put more effort into other work such as data analysis, promoting the development of the enterprise's core business.
Finally, the focus of enterprise IT operations returns from the technical architecture to the information itself. Enterprises need high-quality, reliable data for decision support, operations management, risk control, product offerings, marketing activities, and other purposes. Standing at the junction of technology and business, operations personnel are the ideal managers and promoters of enterprise data assets, and their focus will largely shift from technical architecture to data architecture in the future.
This transformation of operations will proceed step by step along four dimensions: technical architecture, operations system, operations platform, and operations core. On-demand, intelligent, and data-driven will be the general trend of IT operations in the future.