Preface
Writing clean code is what every programmer strives for. Clean Code argues that to write good code, you first need to be able to tell dirty code from clean code; then, with a lot of deliberate practice, you can actually write clean code.
WTFs per minute is the only valid measure of code quality. In the book, Uncle Bob describes working with bad code as wading, which only highlights that we are victims of bad code. A more fitting domestic term is “shit mountain”: less elegant but more objective, because programmers are both its victims and its perpetrators.
The book collects the views of several masters on what constitutes clean code:
- Bjarne Stroustrup: elegant and efficient; straightforward; minimal dependencies; does one thing well
- Grady Booch: simple and direct
- Dave Thomas: readable, maintainable, unit tested
- Ron Jeffries: no duplication, single responsibility, expressiveness
Expressiveness is my favorite of these descriptions. The word seems to capture the essence of good code: it states what the code does simply and directly, no more, no less.
The Art of Naming
Frankly, naming is hard. Coming up with the right name takes real effort, especially when English is not your native language. But it is worth it: good names make code more intuitive and expressive.
Good naming should have the following characteristics:
Use intention-revealing names
A good variable name tells you what it is, why it exists, and how it is used.
If a variable needs a comment to explain it, its name does not yet reveal its intent.
Here is an example from the book that shows how naming can improve code quality:
    # bad code
    def getItem(theList):
        ret = []
        for x in theList:
            if x[0] == 4:
                ret.append(x)
        return ret

    # good code
    def getFlaggedCell(gameBoard):
        '''Minesweeper: collect the cells that are flagged.'''
        flaggedCells = []
        for cell in gameBoard:
            if cell.IsFlagged():
                flaggedCells.append(cell)
        return flaggedCells
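Avoid misleading names
- Do not let a name say one thing while the code does another
- Do not reuse well-known abbreviations for a different meaning
I have to complain about some code I saw two days ago that used l as a variable name. Worse, a variable named user was actually a list.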
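The second complaint can be illustrated with a small sketch. The data and the helper `first_user_name` are made up for illustration; the point is only that a plural name tells the reader to expect a collection.

```python
# Misleading: the singular name suggests one user, but the value is a list.
user = ["alice", "bob"]

# Clearer: the plural name announces a collection.
users = ["alice", "bob"]


def first_user_name(users):
    """Return the first name in a non-empty list of user names."""
    return users[0]
```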
Make meaningful distinctions
Code is written for machines to execute but also for people to read, so different concepts need names that actually tell them apart.
    # bad
    def copy(a_list, b_list):
        pass

    # good
    def copy(source, destination):
        pass
Use pronounceable names
If you can’t pronounce a name, you’ll sound like an idiot whenever you discuss it.
Use searchable names
The length of a name should correspond to the size of its scope.
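A minimal sketch of this rule, loosely modeled on the kind of example the book uses. All names and the numbers 4 and 5 are illustrative assumptions: constants that live at module scope get long, greppable names, while a variable that lives for a single loop can stay short.

```python
# Hard to search for: what do 4 and 5 mean, and where else are they used?
def total_bad(t):
    s = 0
    for j in range(len(t)):
        s += t[j] * 4 / 5
    return s


# Searchable: wide scope, long descriptive names.
REAL_DAYS_PER_IDEAL_DAY = 4
WORK_DAYS_PER_WEEK = 5


def total_real_weeks(task_estimates_in_ideal_days):
    total = 0
    for estimate in task_estimates_in_ideal_days:  # narrow scope, short name
        real_days = estimate * REAL_DAYS_PER_IDEAL_DAY
        total += real_days / WORK_DAYS_PER_WEEK
    return total
```

Both functions compute the same value; only the second one can be understood, and found with a text search, without guessing.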
Avoid mental mapping
If you name a variable temp, for example, the reader has to translate it into its real meaning every time it appears.
Comments
Expressive code does not need comments.
❝
The proper use of comments is to compensate for our failure to express ourself in code.
❞
That may sound discouraging, but it is true. The truth lies in the code; comments are only secondary information, and their biggest problem is that they drift out of sync with the code or no longer say what the code does.
The book gives a vivid example of this: explain yourself in code, not in comments.
    // bad
    // check to see if the employee is eligible for full benefit
    if ((employee.flags & HOURLY_FLAG) && (employee.age > 65))

    // good
    if (employee.isEligibleForFullBenefits())
So when you feel the urge to add a comment, first consider whether a better name, or a different level of abstraction for the function, could express the intent in the code itself.
Of course, don’t throw the baby out with the bath water. Some kinds of comments are good:
- Legal information
- Explanation of intent: why the code does what it does
- Warnings of consequences
- TODO comments
- Amplification: stressing the importance of something that may otherwise seem inconsequential
Of these, I agree most with the second and fifth. What the code does is easy to express through naming, but why it does it is often not obvious, especially where domain expertise or algorithms are involved. In addition, code that looks “less elegant” at first may be written that way for a specific reason, for example sacrificing some readability to speed up a critical path, and that reason deserves a comment.
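A small sketch of a comment that explains why rather than what. The function `mean` is a hypothetical example, not from the book; `math.fsum` is the standard-library function for accurate floating-point summation.

```python
import math


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of floats."""
    # Why fsum and not sum(): plain float addition accumulates rounding
    # error (sum([0.1] * 10) is not exactly 1.0). fsum tracks the lost
    # low-order bits, so a slightly slower call is accepted for accuracy.
    return math.fsum(values) / len(values)
```

The name already says what the function does. The comment records a decision that a later maintainer might otherwise "simplify" away.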
The worst kind of comment is an out-of-date or incorrect comment, which is a huge disservice to the maintainer of your code (maybe yourself months later), but there is no easy way to keep your code in sync with comments other than code review.
Functions
The single responsibility of a function
A function should do only one thing, and its name should state clearly what that thing is. How do you tell whether a function does more than one thing? The test is simple: see whether another function can be extracted from it.
A function should either do something (do_sth) or query something (query_sth), not both. The worst case is a function whose name says it only queries (query_sth) but which actually changes something (do_sth), producing a side effect. Take this example from the book:
    public class UserValidator {
        private Cryptographer cryptographer;

        public boolean checkPassword(String userName, String password) {
            User user = UserGateway.findByName(userName);
            if (user != User.NULL) {
                String codedPhrase = user.getPhraseEncodedByPassword();
                String phrase = cryptographer.decrypt(codedPhrase, password);
                if ("Valid Password".equals(phrase)) {
                    Session.initialize();  // side effect that the name checkPassword does not reveal
                    return true;
                }
            }
            return false;
        }
    }
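One possible way to remove the hidden side effect is to separate the query from the command. The following is a Python sketch under stated assumptions, not the book's solution: `Session`, `is_password_valid`, and `login` are hypothetical names, and the plain-text password dictionary exists only to keep the example short.

```python
class Session:
    """Hypothetical stand-in for the session object in the Java example."""

    def __init__(self):
        self.initialized = False

    def initialize(self):
        self.initialized = True


class UserValidator:
    def __init__(self, passwords):
        # passwords: dict mapping user name to password (illustration only)
        self._passwords = passwords

    def is_password_valid(self, user_name, password):
        """Query: answers a question and changes nothing."""
        return user_name in self._passwords and self._passwords[user_name] == password


def login(validator, session, user_name, password):
    """Command: the name makes it clear that state may change."""
    if not validator.is_password_valid(user_name, password):
        return False
    session.initialize()
    return True
```

With this split, calling `is_password_valid` can never reset a session by accident; only `login` does that, and its name says so.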
The abstraction level of a function
Each function should sit at one level of abstraction: the statements inside it must all be at the same level, and different levels should not be mixed. For example, putting an elephant into a refrigerator would look something like this:
    def pushElephantIntoRefrige():
        openRefrige()
        pushElephant()
        closeRefrige()
The three lines of code in the function describe three sequentially related steps at the same level to complete the task of putting the elephant in the refrigerator. Obviously, the pushElephant step may contain many substeps, but at the pushElephantIntoRefrige level, you don’t need to know too much detail.
When we want to learn about a new project by reading the code, we usually take a breadth-first approach, reading the code from top to bottom, first understanding the overall structure, and then diving into the details of interest. Without a good abstraction of the implementation details (and the ability to condense them into a function worthy of the name), the reader can easily get lost in the sea of details.
In a way, this is similar to the pyramid principle:
Each level supports the point made by the level above it and needs the support of the level below it; multiple arguments at the same level need to be ordered by some logical relation. pushElephantIntoRefrige is the central argument, supported by several sub-steps that follow a logical sequence.
Function parameters
The more arguments a function takes, the more input cases it can combine, the more test cases it needs, and the more likely it is to go wrong.
I strongly agree with the book that output parameters are harder to understand than return values; they are simply not intuitive. From the caller’s point of view, a return value is obvious at a glance, whereas an output parameter is hard to spot. Output parameters usually force the caller to check the function signature, which is unfriendly.
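A minimal sketch of the difference, with hypothetical function names. In the first style the argument is silently mutated; in the second the result is visible at the call site.

```python
def append_footer_in_place(report_lines):
    """Output-parameter style: mutates the caller's list and returns None."""
    report_lines.append("-- end of report --")


def with_footer(report_lines):
    """Return-value style: builds and returns a new list."""
    return report_lines + ["-- end of report --"]


lines = ["line 1"]
append_footer_in_place(lines)     # is `lines` an input or an output? Check the signature.
report = with_footer(["line 1"])  # the assignment makes the data flow obvious
```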
Passing a Boolean to a function (the book calls it a flag argument) is usually a bad idea, especially when True and False select not two sides of the same behavior but two different behaviors. That clearly violates the single-responsibility rule for functions, and the fix is simple: use two functions.
Don’t repeat yourself
Reuse is easiest and most intuitive at the function level, and many IDEs can help extract a function from a piece of code.
In practice, however, a piece of code is often used in several methods in slightly different forms. Abstracting it into one generic function then requires extra parameters and if/else branches. The result is awkward: the code looks refactorable, but the refactoring is not clean.
Part of the problem is that such a piece of code also does more than one thing, violating the single responsibility principle, which makes it hard to reuse. The solution is to split it into smaller methods that are easier to reuse. The Template Method pattern is another way to handle the differences.
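A minimal sketch of the Template Method idea, assuming a made-up report-export scenario: the shared steps live once in the base class, and each subclass supplies only the step that differs, so no flag parameters or if/else branches are needed.

```python
from abc import ABC, abstractmethod


class ReportExporter(ABC):
    """export() is the template: fixed shared steps around one varying step."""

    def export(self, rows):
        cleaned = [row.strip() for row in rows if row.strip()]  # shared step
        body = self.format_rows(cleaned)                         # varying step
        return f"{body}\n({len(cleaned)} rows)"                  # shared step

    @abstractmethod
    def format_rows(self, rows):
        """Turn the cleaned rows into one string."""


class CsvExporter(ReportExporter):
    def format_rows(self, rows):
        return ",".join(rows)


class BulletExporter(ReportExporter):
    def format_rows(self, rows):
        return "\n".join(f"* {row}" for row in rows)
```

For instance, `CsvExporter().export([" a ", "", "b"])` cleans the rows once in the base class and then joins them with commas.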
Tests
Regrettably, in the projects I have worked on, testing (especially unit testing) never received enough attention, and TDD was never tried. That absence makes good tests seem all the more valuable.
We often say that good code needs to be readable, maintainable, and extensible, and that good code and architecture come from constant refactoring and iteration. Automated testing is the foundation for all of this. Without high-coverage automated unit tests and regression tests, nobody dares to modify the code, and it is left to rot.
Even when unit tests are written for core modules, they are often treated casually: it is test code, not production code, so anything that runs is considered good enough. As a result, the test code is hard to read and maintain, which makes it difficult to update and evolve alongside the production code, and eventually the tests are abandoned. Dirty tests are therefore equivalent to no tests.
So the three essentials of test code are readability, readability, and readability.
The principles of testing are the three laws of TDD:
- You are not allowed to write any production code unless it is to make a failing unit test pass. (Write no functional code until a test exists for it.)
- You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures. (Write only enough of a test to demonstrate one failure.)
- You are not allowed to write any more production code than is sufficient to pass the one failing unit test. (Write only enough functional code to make that test pass.)
The FIRST criteria for tests:
- Fast: tests should be fast, so that they can be automated as much as possible.
- Independent: tests should be independent and must not rely on each other.
- Repeatable: tests should be repeatable in any environment.
- Self-validating: tests should have a boolean outcome. Do not rely on reading logs, which is an inefficient way to determine whether a test passed.
- Timely: tests should be written in a timely fashion, just before the production code they cover.
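The criteria above can be sketched with Python's standard `unittest` module. The function under test, `word_count`, is a hypothetical example; timeliness cannot be shown in code, since it concerns when the test is written.

```python
import unittest


def word_count(text):
    """Hypothetical function under test: count whitespace-separated words."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    # Fast and repeatable: no network, disk, clock, or randomness involved.
    # Independent: each test builds its own input and shares no state.
    # Self-validating: assertions yield pass or fail; nobody reads a log.

    def test_counts_words_separated_by_spaces(self):
        self.assertEqual(word_count("clean code matters"), 3)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)
```

Saved in a module, these tests can be run with `python -m unittest`, which reports a plain pass or fail for each test.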