This question has been on my mind for a long time. How do you refactor well?
Programmers love refactoring. Refactoring can showcase technical strength, improve software performance, make software more aesthetically pleasing, and make programmers happier. The thought of turning bad code into beautiful code has captivated countless programmers.
Yet while the industry is full of “XXX Best Practices” articles, books, and online courses, there is not much systematic discussion of how to refactor. The only one I know of is Refactoring, by Martin Fowler of ThoughtWorks, and it is almost 20 years old. While much of the book has stood the test of time, it is based on Java, monolithic software, and pre-Internet business assumptions. Many of its specific techniques cannot be applied as a ready-made formula.
Reading “XXX Best Practices” is like watching short videos of beautiful men and women: it does nothing except give us unrealistic fantasies about the real people in our lives. What we, as software professionals, need is a makeup tutorial. At the very least, we need a treatment plan. The real world does not start out beautiful; it becomes beautiful through a lot of necessary work.
We need to know how to turn bad code into good code. It’s not enough to know what good code is.
I have been through this, and you may or may not have. When I first started working and saw some bad code, I would complain and make fun of it, even curse the intelligence of my predecessors. Of course, I am a complainer who also acts, so I started my big makeover. Although the changes caused some temporary bugs, after nearly a year of unremitting refactoring the software finally became more pleasing to me. However, my workload did not decrease, and the overall quality of the software did not improve fundamentally.
At this point, most people, having realized the high cost and low return of refactoring, give up and start coasting. They keep their heads down and never touch code that has been taboo since ancient times.
Such awareness may make you a reliable employee (if you are excellent in other respects), but it will never give you a glimpse of the Way. To borrow the language of martial-arts novels: to become truly strong, you must be able to build your inner world well and rebuild it when necessary. Change the tendons, cleanse the marrow, and emerge as if reborn.
Mankind has entered an age of complexity. Every field is now so complex that no one person can exhaust its knowledge or experience all of it. In such an age, subtraction is far more important than addition, and far more important than in any previous age.
Even if you spend all your time watching anime and movies, you cannot watch all of them. Even if you spend all your time traveling, you cannot visit every place.
We cannot even exhaust something as easy as entertainment on our own. How can we be so ignorant and arrogant as to believe “I can refactor this software into perfection”?
Before we know when and how to refactor, we need to know when and how not to refactor.
No matter how much love and pride we as practitioners and scientists attach to software development, we must recognize the essential difference between software engineering and science. Science is about facts, which do not bend to human will; at most, what changes is people’s general understanding of them at a given time.
Software engineering is essentially engineering: a human activity, and most of the time a group activity. Therefore, all the characteristics of group human activity are reflected in a company’s code.
Conway summed it up in the last century: software is a reflection of how the organization that builds it communicates.
And the second law of thermodynamics tells us that an isolated system, left to itself, drifts toward disorder.
Someone with a smattering of knowledge can write an article on “XXX Best Practices” or publish an online course that sells 10,000 copies. But only true masters know what to do and what not to do, and how to transcend the innate weaknesses of human group behavior in order to optimize a system. Only real warriors dare to go against the current, break the ugly reality, and carve out their own beauty.
So much for the abstract; now for the concrete.
When considering refactoring, we need to think about it at three levels.
- organization
- business
- code
code
Let’s start with code, the most basic level. The “XXX Best Practices” material you can easily find online basically stops here.
Although architecture is also made up of a lot of code, I deliberately distinguish the two. “Code” here means local code that affects only a few business features or a few infrastructure functions. It is hard to draw the line with a specific line count: sometimes 20 lines of code change the underlying architecture, and sometimes 1000 lines of code are just one page.
Refactoring at the code level must satisfy one of these conditions:
- The same complexity, but with increased functionality.
- Same functionality, but less complexity.
- Without compromising functionality or adding complexity, performance improves. (Time or space usage has to go down.)
Any refactoring that satisfies none of these conditions is a waste of time: simply collapsing multiple lines into one, changing for to while, a method to a function, or longer variable names to shorter ones, and so on, without changing any of the code’s internal logic.
There is a new guy at my company who likes to leave comments like this on PRs. Talking to him about code is a terrible experience. But I also understand where those ideas come from. One reason is the low quality of the “XXX Best Practices” material online: it tells you what to do without telling you why.
The number of lines of code does not directly reflect its logical complexity. Something 400 lines long is not necessarily more complex than something 200 lines long; it may even be simpler and more readable. Line count says something about readability and complexity only when the difference is more than an order of magnitude: a thousand lines is a lot worse than a hundred lines that do the same thing.
From this point of view, it is fine to duplicate some code, and when in doubt duplication is even the best choice. Repetition is the most complete decoupling.
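One way to picture the trade-off, as a rough sketch: the functions below are hypothetical and invented purely for illustration. Two callers start with identical formatting code; kept as duplicates, one can change without touching the other, whereas a shared helper ties them together and grows a new branch for every difference.

```python
# Hypothetical example: all names here are invented for illustration.

# Duplicated: each caller owns its own copy, so they are fully decoupled.
def invoice_name(first: str, last: str) -> str:
    """Used only by invoices."""
    return f"{last}, {first}"


def profile_name(first: str, last: str) -> str:
    """Used only by profile pages; changed later without touching invoices."""
    return f"{first} {last}"


# "Deduplicated": one shared function couples both callers, and every
# difference between them shows up as another flag value and another branch.
def shared_name(first: str, last: str, style: str = "invoice") -> str:
    if style == "invoice":
        return f"{last}, {first}"
    if style == "profile":
        return f"{first} {last}"
    raise ValueError(f"unknown style: {style}")
```

Whether the shared version is worth it depends on how likely the two callers are to keep changing together; the sketch only shows why duplication is the safer default when that is unclear.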
To reduce logical complexity, we can basically follow these principles:
- Branching logic becomes single-line logic
- Crossover logic becomes unidirectional logic
- Scattered logic becomes gathered logic
Branching logic becomes single-line logic
The most common branching logic is control flow: if, else, for, while, and so on. If you can reduce the number of forks while keeping the same functionality, or even eliminate branches entirely, complexity drops greatly.
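As a minimal sketch of this idea, assume a hypothetical shipping-fee rule (the function names, regions, and fee values are all invented for illustration). A chain of branches can often be replaced by a data table plus one straight-line lookup:

```python
# Hypothetical example: regions and fee values are made up for illustration.

def shipping_fee_branchy(region: str) -> int:
    """Branching version: every new region adds another fork to trace."""
    if region == "local":
        return 0
    elif region == "domestic":
        return 5
    elif region == "overseas":
        return 20
    else:
        raise ValueError(f"unknown region: {region}")


# Same functionality, but the rules live in data instead of control flow.
FEES = {"local": 0, "domestic": 5, "overseas": 20}


def shipping_fee_flat(region: str) -> int:
    """Single-line version: one lookup on the normal path."""
    try:
        return FEES[region]
    except KeyError:
        raise ValueError(f"unknown region: {region}") from None
```

Adding a region to the second version means adding one entry to `FEES`; the control flow itself does not change.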
Asynchronous programming is also branching by nature: a control flow forks off from the main control flow. But it is more complicated than the usual if/else, because time and ordering are added to the picture.
Crossover logic becomes unidirectional logic
If two functions call each other, this creates crossover logic: A calls B, and B calls A. Worse, multiple functions may call one another, forming a tangled call graph. If you can make the calls one-way, A->B->C->…, complexity drops greatly.
In asynchronous programming, the counterpart of crossover logic is microservices that call each other, or an object that is both a producer and a consumer.
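The function-level case can be sketched as follows. This is an illustrative example only: `Order` and every function name are hypothetical. In the crossed version the two functions call each other; in the one-way version the caller passes down exactly what the callee needs, so the callee never has to call back.

```python
# Hypothetical example: Order and all function names are invented.
from dataclasses import dataclass


@dataclass
class Order:
    amount_cents: int
    currency: str


# Crossover version: summary_crossed calls price_crossed, and price_crossed
# calls back into summary_crossed for the labelled form.
def summary_crossed(order: Order) -> str:
    return "Total: " + price_crossed(order, with_label=False)


def price_crossed(order: Order, with_label: bool) -> str:
    if with_label:
        return summary_crossed(order)  # B calls A
    return f"{order.amount_cents / 100:.2f} {order.currency}"


# One-way version: order_summary -> format_price, and nothing points back.
def format_price(amount_cents: int, currency: str) -> str:
    return f"{amount_cents / 100:.2f} {currency}"


def order_summary(order: Order) -> str:
    return "Total: " + format_price(order.amount_cents, order.currency)
```

In the one-way version `format_price` can be read, tested, and reused without knowing that `Order` or `order_summary` exist.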
Sometimes branching and crossover logic is unavoidable. But if you can avoid it, avoid it.
In this sense, code-level refactoring is not about what else you can do, but about cutting out what you shouldn’t be doing.
Successful code refactoring usually removes more than it adds, and makes things easier later.
Scattered logic becomes gathered logic
Here’s a recent example. I’m working on a transpiler project. Its job is to automatically translate GraphQL into database queries. The transpiler has three modules, each transforming the data structure produced by the previous one into the one needed by the next. An earlier feature had been implemented with the construction of its data structure split across all three modules. As a result, you had to read code in several places to understand that one feature’s logic. The new feature I needed to build depended on this feature. Without refactoring, I would have had to spread my own code across three modules as well, making development difficult.
Therefore, I chose to first consolidate the earlier feature into the second module, so that my feature could be implemented easily in the second module too. The refactoring changed nearly 600 lines of code (about 300 added and 300 removed), but the new feature itself took only about 50 lines. This is the power of a successful refactoring.
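To make the shape of that refactoring concrete, here is a toy three-stage pipeline. It is not the actual transpiler described above: the stage names, the dict-based input (standing in for a parsed GraphQL query), and the `first` argument are all assumptions made for illustration. The point is only that the feature’s decisions live in the middle stage, while the first and last stages stay generic.

```python
# Toy pipeline for illustration only; NOT the project described in the text.
# All names (Parsed, Plan, parse, plan, emit, "first") are hypothetical.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parsed:  # output of stage 1
    table: str
    args: dict


@dataclass
class Plan:  # output of stage 2
    table: str
    limit: Optional[int]


def parse(query: dict) -> Parsed:
    """Stage 1: reshapes the input; knows nothing about the limit feature."""
    return Parsed(table=query["field"], args=dict(query.get("args", {})))


def plan(parsed: Parsed) -> Plan:
    """Stage 2: the single place where the limit feature is interpreted."""
    raw = parsed.args.get("first")
    limit = None if raw is None else max(0, int(raw))
    return Plan(table=parsed.table, limit=limit)


def emit(p: Plan) -> str:
    """Stage 3: prints whatever the plan says; makes no feature decisions."""
    sql = f"SELECT * FROM {p.table}"
    if p.limit is not None:
        sql += f" LIMIT {p.limit}"
    return sql


def transpile(query: dict) -> str:
    return emit(plan(parse(query)))
```

With this layout, a new feature that depends on the limit (say, an offset) would only need to read and extend `plan`, rather than follow one feature through all three stages.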
business
As mentioned above, branching and crossover logic is sometimes unavoidable, because code cannot be simpler than the thing it is trying to do. In the course of delivering business requirements, we always end up adding some unnecessary code. Given unlimited time and unchanging business requirements, we could in theory get rid of all of it. That is why people in the industry tend to think that those who work on underlying technologies like operating systems and databases have better skills and write better code. This is not because application-layer developers are inherently more stupid or ignorant, but because requirements at the lower layers are more stable and change more slowly, so the developers involved have more time to fulfill them. The database boom of recent years, for example, is simply a better way to address a need that has existed for years.
Therefore, the technical ability of underlying-system developers rests mainly on their deep understanding of and experience in their field. The technical ability of application developers, by contrast, shows mainly in their ability to build code architectures that cope well with change. In this sense, a good application architect is harder to come by than a domain technologist, for they seek constancy in the midst of change, and respond to all change with constancy.
So, business changes directly affect code complexity. If you can trade a small change in the business or in product functionality for a big simplification of your code, it is a bargain. However, if you are a rank-and-file programmer, the business side and the product side may not be willing to make this deal with you. More often than not, rank-and-file programmers are not even allowed to take part in the discussion.
Of course, this is a matter of corporate culture and management style. It is not a good sign in itself when rank-and-file programmers have no voice, or speak and are not heard.
Therefore, the head of an engineering team, whether a CTO, a vice president of engineering, or the engineering director of a small team, needs the ability to “bargain” with the business side and the product side. “Bargaining” sounds a little hostile; a better way to put it is working hand in hand.
Let’s start with smaller product features. In a healthy company, this is the kind of small feature discussion in which rank-and-file programmers or first-line engineering managers can have a say. Let’s say the product manager wants the technical team to build three more pages. An unprofessional technical team just builds them, perhaps joking about the product manager’s stupidity along the way. A good technical person, however, might find that building these three pages directly would make the implementation too complex, and that there is no time for the refactoring needed to reduce that complexity. Technical people who are good at communication will talk to the product manager and find out what the essential need is. Maybe it turns out you do not need three new pages; you need just one, with a different design. If a technical problem can be turned into a product design problem, do not solve it as a technical problem.
A lot of the time, what people tell you they need is just one way they have thought of to meet their need, not the only way. And in a rapidly iterating product, the first approach the product manager comes up with is often not the best one. Sometimes thinking a little more lets you build a lot less. Often it is even possible to strike a good balance between technical implementation and product requirements.
But if you’re unlucky enough to be in an organization where technical people have no say, don’t expect business change to drive refactoring.
organization
A company’s organizational structure and code structure are two sides of the same coin. Major refactorings often involve changes at the organizational level. Refactoring at this level is not something the average rank-and-file programmer normally gets to touch.
Readers in China will be familiar with examples such as the technology platform trend that was popular a few years ago, which was not only a restructuring of technology but also a major adjustment that directly affected the behavior of the entire organization. The most cited example is probably Alibaba’s. Technology can make previously impossible forms of organization possible; conversely, rigid forms of organization can limit the development of technology.
If you start working at this level, you cannot talk only about technology and business. In other words, the person doing work at this level must be able to look at technology, organization, business, and other issues comprehensively and dialectically.
However, in some special cases, rank-and-file technical staff may also run into technical barriers caused by the organizational structure. When that happens, ask whether your own thinking about the refactoring is hitting the same barriers.
It was the summer of 2018, and I was working on an API middleware team at an e-commerce company. The team maintained a GraphQL development framework. Readers who are not familiar with GraphQL can think of it as an RPC technology.
In general, back-end developers provide a GraphQL API for the front end to consume, because GraphQL is essentially a back-end technology: the code runs on the server. Many people in the industry mistake GraphQL for a front-end technology because there are so many JS client tools.
But at that e-commerce company, it was the front-end developers who wrote the GraphQL layer. What’s more, our own team wrote GraphQL APIs to connect to every piece of business logic. This is a really bad way to develop: the engineering team, as a system, had settled into a local maximum that was not globally optimal, and could not adjust by itself. I did a lot of work trying to get the back end to write the GraphQL layer, or at least to get the front end to write more of it themselves, so that we would not be both the maintainers of the framework and the developers of the business logic. But it didn’t work out, because I had not recognized the organizational barriers.
The e-commerce company started as a traditional PHP template site. Back-end programs in languages such as Java came later. But separating HTML pages from back-end business logic only started in 2015, so JS engineer was a job category that did not exist there until 2015. Because it is an e-commerce site, search engine optimization is very important, so SSR (server-side rendering) is necessary. As a result, although the front end is a single-page application written with React, part of its logic actually runs on the server. The architecture was deliberately vague from the start about which code runs during server rendering and which runs only in the browser. The argument at the time, at least, was that this would keep SSR off the minds of business logic developers.
In 2018, the most mature GraphQL implementation was in JS. At that time, the company did not have good server operations capabilities, so to ship GraphQL our team had to embed our library into the front-end code base and run it in the SSR stage, that is, on the front-end rendering server. From a back-end business developer’s perspective, the conclusion was natural: since all your code is in the front-end repository, the front-end developers should write it.
This was the biggest obstacle to the GraphQL refactoring at the time. Many companies’ engineering teams are organized around code bases: this code is my turf, and that code is your turf. It is a natural human way of thinking. To ignore this reality and push something as if its acceptance could be taken for granted is, of course, asking for trouble.
Some naive programmers would say to me, “If I ever run a company, I will avoid such territorial people. As long as everyone just does this and that, we won’t have this problem.”
This idea is no more instructive than “if the world were full of clean officials, there would be no corruption”.
The question is, how can refactoring work if 90% of programmers are so territorial?
All refactoring should first recognize the reality of the organization and then work with human nature; only then is it possible to mobilize the enthusiasm of others to accomplish your goals.
Of course, in the specific case of that e-commerce company, only someone at CTO level would have had enough influence to carry out the refactoring. And what they would have needed to refactor was much more than just our team.
So if you find yourself in a similar situation as a rank-and-file programmer, don’t refactor. Not refactoring may make your work life extremely painful and distasteful, but it is still the best thing to do. Effective refactoring only happens within the range you can influence.
conclusion
This is a huge topic. At the micro level it involves how the code is actually written; at the macro level, the structure of the organization. Subjectively, it depends on people’s aesthetics and understanding; objectively, it depends on actual needs and resources.
Hopefully this article has given you some insight into what can be refactored, what should be refactored, what should not be refactored, and what doesn’t work when refactored.