Preface
As a veteran of more than two decades in operations who now works in data center management, I find many of the scenarios described in these articles vivid. For a long time, operation and maintenance was a synonym for helpless suffering and taking the blame. Operations staff worked overtime whenever needed, responded to emergencies, and set up environments and deployed releases around the clock. In the dead of night, when everyone else is asleep, operations people get to work. And when there is a problem in the production environment, it is the operation and maintenance personnel who are questioned first. On the one hand, they often work overtime and live under invisible pressure for long periods. On the other hand, other departments do not see the value of the operation and maintenance department: “If there is a problem with the product, ask the company; if there is a problem with the application, ask the development center; if there is an accounting problem, ask the business department. So what does the operation and maintenance department do?”
With the rise of cloud computing, hardware maintenance, system management and application development are becoming centralized. DevOps blurs the boundary between development and operations, and intelligent operations (AIOps) will fundamentally change how operation and maintenance is done. Is the operations profession really going away? Where does the value of operations people lie? How should they upgrade and transform?
As information technology is applied ever more widely in all walks of life, society's demand for IT operation and maintenance keeps growing. Meanwhile, as cloud computing and related technologies become popular and mature, features such as elastic supply, pay-on-demand and rapid response make enterprises more willing to move to the cloud, so the demand for traditional operation and maintenance personnel is falling rather than rising. The impact of technological progress on traditional industries can be fatal: e-commerce has driven the growth of retail as a whole while traditional retailers gradually wither. Infrastructure as a Service (IaaS) reduces the need for infrastructure maintenance personnel, Platform as a Service (PaaS) reduces the need for middleware and database maintenance personnel, Software as a Service (SaaS) has a great impact on application developers, and intelligent operations (AIOps) is claimed to eliminate low-end operations work.
Operation and maintenance personnel already feel great pressure from the development of cloud computing, big data, artificial intelligence and other technologies. In a highly standardized cloud computing environment, nothing prevents O&M work from being done through automation and intelligent operations, and developers doing O&M work themselves have achieved impressive results in individual vertical fields. In companies like Alibaba and Tencent, the responsibilities and composition of operation and maintenance teams have changed dramatically. Plain operation and maintenance engineers have been replaced by operation and maintenance development engineers. They distill operation and maintenance experience into knowledge graphs and use intelligent operation and maintenance platforms to maintain, monitor, analyze and handle emergencies in complex application environments.
Society needs IT operation and maintenance services, but it does not necessarily need full-time, low-end operation and maintenance personnel. Only in complex and changeable environments can operation and maintenance personnel truly show their advantages and play a role; this is an irresistible trend. The only way for them to avoid being left behind is to strengthen their core skills, upgrade and transform, and demonstrate their own value.
So, what is the value of operations people? According to software life cycle theory, operation and maintenance takes up more than 70% of a product's whole life cycle, and it is also the stage in which the product's value is most fully realized. Operation and maintenance personnel should take responsibility for how the product ultimately performs in operation, and that is how they show their own value.
First, turn the cost center into a profit center
Operating a product costs money. The operation and maintenance department is a cost center in the traditional sense: a department that spends. Generally speaking, the operation and maintenance department of a large company has hundreds of employees and also holds many outsourcing contracts. We can add all of this up to get the total annual expenditure. If the company outsourced operation and maintenance to Tencent or Alibaba, or even handed it to its own R&D staff, how much would they charge us for the same SLA? If running operation and maintenance ourselves costs less, we save the company money; otherwise, we waste the company's resources.
Of course, this comparison is too idealistic. Beyond overall cost, we also need to consider security risks, policy constraints, career stability and other factors, so at the whole-department level the comparison is hard to make and has mainly theoretical significance. We can instead look at companies of the same scale in the industry, whose headcount, service level and other conditions are broadly similar to ours, and use their costs as a reference. At the individual level the comparison is easier: if a person is paid 20,000 yuan a month here and could command 30,000 yuan a month on the open market, then we come out ahead; in the reverse case, we lose. Of course, stability, brand, reputation, risk appetite and other factors should also be considered.
We should strive to improve operation and maintenance management, raise the technical ability of operation and maintenance personnel, and maximize staff efficiency and the input-output ratio. At the same time, we can build on our technical strengths to sell operation and maintenance expertise externally, turning the cost center into a profit center. Public cloud services today are exactly this kind of exported operation and maintenance expertise: the provider takes over the infrastructure, platform and even application operation and maintenance of the customer company, commits to a service level (SLA) and charges accordingly. Operation and maintenance management consulting, tools, project management, system tuning and technical support can all be offered in the same way, under pricing models such as pay-per-use and pay-for-performance.
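The comparisons above come down to simple arithmetic. Here is a minimal sketch in Python. The function name and the department-level figures are hypothetical, chosen only for illustration. Only the 20,000 and 30,000 yuan monthly salaries come from the text. The sketch deliberately ignores the non-financial factors just mentioned, such as risk, policy and stability.

```python
def annual_saving(in_house_cost: float, outsourced_quote: float) -> float:
    """Return how much cheaper in-house operation is than an outsourced
    quote for the same SLA. A positive result means in-house saves money."""
    return outsourced_quote - in_house_cost


# Hypothetical department-level figures, in yuan per year (not from the text).
in_house_cost = 80_000_000
outsourced_quote = 95_000_000
department_saving = annual_saving(in_house_cost, outsourced_quote)

# Individual-level figures from the text, in yuan per month.
current_salary = 20_000
market_salary = 30_000
monthly_gain = market_salary - current_salary

print(department_saving)  # 15000000
print(monthly_gain)       # 10000
```

A negative result from `annual_saving` would mean the outsourced quote is cheaper, which is the case the text describes as wasting the company's resources.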
Second, improve operational efficiency
O&M personnel see a large amount of system information every day. They can generalize from it to work out how the system behaves and draw up sensible maintenance plans that keep it running stably. They can also spot the weakest links in system operation and optimize the system to improve its availability and efficiency. Finally, they can distill non-functional operation and maintenance specifications from their practical work and guide the development department to take operational needs fully into account during development, reducing or eliminating hidden problems at the root.
Just because a product has been developed and brought to market doesn't mean everything is fine. For a variety of reasons, such as incomplete design or time pressure, a product's overall operational efficiency after it goes into production is not necessarily good. Operation and maintenance personnel can analyze the data generated in use, the incidents that occur and the feedback from customers, and propose optimizations and improvements that raise overall operational efficiency.
For example, when issuing CDS, the business department of a bank still uses the OA system to obtain head-office authorization and coordinates with related departments by telephone, which is very inefficient. The application system already has a function for launching products automatically through linked workflows, which can shorten the product launch period from one month to two days. O&M personnel should spot such problems promptly in their daily work, take the lead in coordinating the business departments involved, clarify the relationships among all parties, and improve operational efficiency through cooperation.
Third, empower all kinds of customers
O&M personnel can serve four types of customers: end users, business departments, development departments, and managers. They give end users a good experience by keeping the system stable and responding quickly to emergencies. They empower business departments in marketing and decision-making by providing data services and visualization. They empower developers by creating a convenient development environment through non-functional specifications and discussion of new technical architectures. They empower managers with system capacity management, system performance optimization and management process improvement, all based on operation and maintenance big data analysis. Of course, they can also empower themselves by building an intelligent operation and maintenance platform that makes their own work more efficient and keeps the system running stably.
By realizing their own value, operation and maintenance personnel can work with more dignity, a dignity that comes from supporting business development and from solid technical strength. The operation and maintenance department needs to innovate and transform in five areas: mindset, institutions, departmental responsibilities, tool platforms and skills. These five areas share in the company-wide push for change but also reflect the particular character of the department. Operation and maintenance personnel can start with their own immediate environment, drop the mindset of waiting for and relying on others, and dare to go first and take the lead.
First, operations people should change their thinking
On the one hand, regulators and companies constantly raise the requirements for safe and stable operation. On the other hand, rapid business development puts great pressure on operation and maintenance, with frequent application releases and all kinds of defects, while the operation and maintenance department has no power to restrain the development department. Operation and maintenance personnel are often caught between safe, stable operation and rapid business development.
Seen from the company's point of view, these two seemingly contradictory requirements both become reasonable. Business development depends on stable back-end support to give customers a good experience. At the same time, the company must keep innovating in the market, competing with rivals on speed and on innovation, in order to survive and keep its chance of further growth. As members of the company, operation and maintenance personnel should serve the company's overall goals and complete the related work to a high standard of both quality and quantity. Of course, an effective restraint mechanism is also necessary, since blind expansion brings hidden dangers to the company's development as well.
In fact, the thinking of the whole company needs to change, from a culture of keeping paper trails to avoid blame to a culture of innovation and service. Operation and maintenance departments should strengthen their sense of service and serve their customers better: a stable business environment for business departments, smooth environment support for development departments, and a harmonious working environment for operation and maintenance personnel themselves. Only when the question of mindset is settled can the problems that follow be solved easily.
Second, operation and maintenance personnel should pay attention to institutional innovation
In institutional innovation, the most crucial element is assessment. Where you sit determines how you think, and assessment is the conductor's baton. At present, the operation and maintenance department is assessed on safe and stable operation. That is necessary, but stable operation alone is far from enough.
People naturally seek benefit and avoid harm. Anyone who has worked in operation and maintenance knows that the great threat to safe and stable operation is change: as long as change is kept under control, system operation will see no big problems. Yet without change there would be no gain in competitiveness and no rapid business development. Getting the operation and maintenance department to embrace change actively is a question that takes real ingenuity.
In fact, some companies have made useful attempts. In their departmental assessment, safe and stable operation accounts for 60 of the score and innovation for 20. Doing well on stable operation alone earns only a pass, but any problem with stable operation means a fail. Operations therefore has to make trade-offs between stable operation and applying change. In these companies, the operations department has taken a big step toward embracing change.
Of course, “the power without responsibility is the devil, and the responsibility without authority is the hell”. The operation and maintenance department should in turn be able to assess the development department, and even the management departments, effectively, so that responsibility and power are not mismatched.
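The assessment rule described above can be written down as a small scoring function. This is a minimal sketch in Python under stated assumptions: the text gives only the 60 and 20 weights and the pass/fail rule, so the 100-point scale, the pass mark of 60, the unspecified remaining 20 points (`other_points`) and the function name are all hypothetical.

```python
PASS_MARK = 60  # assumption: a pass corresponds to the full stability weight


def assess_department(stable: bool, innovation_points: int,
                      other_points: int = 0) -> tuple[int, bool]:
    """Score a department under the 60/20 weighting from the text.

    stable: True if there were no stable-operation problems.
    innovation_points: 0..20, the innovation share named in the text.
    other_points: 0..20, a hypothetical remainder the text does not specify.
    Returns (score, passed).
    """
    if not 0 <= innovation_points <= 20:
        raise ValueError("innovation_points must be between 0 and 20")
    if not 0 <= other_points <= 20:
        raise ValueError("other_points must be between 0 and 20")
    if not stable:
        # Any stable-operation problem is an automatic fail,
        # whatever was earned elsewhere.
        return innovation_points + other_points, False
    score = 60 + innovation_points + other_points
    return score, score >= PASS_MARK


print(assess_department(True, 0))    # (60, True): stability alone is only a pass
print(assess_department(True, 20))   # (80, True)
print(assess_department(False, 20))  # (20, False): instability always fails
```

The sketch shows the incentive the text describes: once stability is secured, the only way to score above a bare pass is to take on change.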
Then there are departmental responsibilities
The operation and maintenance department runs what the R&D center produces. Because of tight schedules, heavy workloads, market pressure and other reasons, the company often develops applications and puts them into production while the business requirements are still being refined and testing is not fully complete, which creates hidden dangers for safe operation.
While keeping systems safe and stable, operations staff should actively take part in the design and testing of application systems, become familiar with their overall architecture and implementation, and draw on their experience to suggest risk fixes and system optimizations. They should then summarize and refine these suggestions into non-functional specifications that developers can plan for in advance. Operations personnel should also actively track industry trends and use new technical architectures to address operational and development challenges.
As a result, the project will be like a four-engine aircraft with autopilot, able to avoid and handle any malfunction automatically. In addition, operation and maintenance departments should carry out operation and maintenance big data analysis to give business departments a detailed and reliable basis for marketing and decision-making.
Next up is the tool platform
“If a worker wants to do a good job, he must sharpen his tools first.” Operation and maintenance personnel should abandon the slash-and-burn approach and focus on building an intelligent operation and maintenance platform that manages their systems effectively and achieves twice the result with half the effort. The quality of the intelligent operation and maintenance platform has become an important indicator of the level of operation and maintenance management. Only through continuous exploration and accumulation on such a platform can its functions and usefulness be strengthened, so as to reach the goal of digital, automated and intelligent management.
And last but not least, people
Everyone has limitations and unlimited potential. A person's assigned responsibilities determine the content of their work and the structure of their knowledge. In the past, operation and maintenance engineers were responsible for equipment and system installation, application environment setup, operational support and similar work, and rarely touched development. The world is evolving and changing. Operation and maintenance engineers who cannot do development are gradually disappearing. In their place, operation and maintenance development engineers write scripts and build operation and maintenance platforms to get the work done. Meanwhile, familiarity with application architecture and business knowledge greatly benefits the operation and maintenance of application systems, and it is a direction operation and maintenance personnel should actively work toward. Because their knowledge is relatively narrow, operation and maintenance personnel often lack confidence in making the transition. In fact, as long as you bravely step out and take action, any difficulty is a paper tiger.
At present, new technical theories emerge in an endless stream and new technical architectures change by the day. As long as operation and maintenance personnel embrace these changes with an open mind, apply the new technologies and architectures to practical work, and use them to solve the problems and difficulties they meet, they can keep a firm hold on the direction of their future development. We should be sound and agile in our tactics, and pay even more attention to a forward-looking strategy. Only when operations people are strong enough will they win the respect of developers and even business people, without fear of external challenges. We operation and maintenance people want to say: the sky ahead is wider still!