The main cause of this article was a discussion with my friends on Weibo about the code refactoring of nested if-else statements with several layers. There were all kinds of questions and thoughts on Weibo. According to the truth, these are the basic skills of programming, it seems not worth writing an article, but I think a lot of things can start from a simple thing, to the essence, so I think it is necessary to write an article here. Not all of them. I just hope to get more discussion, because progress can only be made with more in-depth discussion.
It’s a bit long, so I’ll give you some thoughts and a summary at the end, and you can skip to the end.
Arrow code is basically what the following image shows.
So what’s wrong with this “arrow” code? It looks good, too, with symmetry. But…
There are a few questions about arrow code:
1) My monitor is not wide enough, and the indented arrow code is too hard, which requires me to pull the horizontal scroll bar back and forth, which makes me feel uncomfortable when reading the code.
2) There is length in addition to width. There is so much if-else code in if-else code that you don’t know what layers of code went through in the middle to get there.
To sum up, the arrow type code “if too many nested, code is too long, will be quite easy to maintain code (including yourself) lost in the code, because I saw the innermost code, you no longer know in front of the layers and layers of condition is what kind of judgment, how code is running here, so, Arrow code is very difficult to maintain and Debug.
Cases on Weibo and Guard against Clauses
OK, let’s take a look at the example on Weibo. If you have more code and more nesting, you can easily get lost in the condition (the example below is just a small arrow under the “big arrow”).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
FOREACH(Ptr<WfExpression>, argument, node->arguments) {
int index = manager->expressionResolvings.Keys().IndexOf(argument.Obj());
if (index != -1) {
auto type = manager->expressionResolvings.Values()[index].type;
if (! types.Contains(type.Obj())) {
types.Add(type.Obj());
if ( auto group = type->GetTypeDescriptor()->GetMethodGroupByName(L "CastResult" . true )) {
int count = group->GetMethodCount();
for ( int i = 0; i < count; i++) { auto method = group->GetMethod(i);
if (method->IsStatic()) {
if (method->GetParameterCount() == 1 &&
method->GetParameter(0)->GetType()->GetTypeDescriptor() == description::GetTypeDescriptor<DescriptableObject>() &&
method->GetReturn()->GetTypeDescriptor() ! = description::GetTypeDescriptor< void >() ) {
symbol->typeInfo = CopyTypeInfo(method->GetReturn());
break ;
}
}
}
}
}
}
}
|
The above code can be written in reverse, and then the arrow type code can be unwound, the refactoring code is as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
|
FOREACH(Ptr<WfExpression>, argument, node->arguments) {
int index = manager->expressionResolvings.Keys().IndexOf(argument.Obj());
if (index == -1) continue ;
auto type = manager->expressionResolvings.Values()[index].type;
if ( types.Contains(type.Obj())) continue ;
types.Add(type.Obj());
auto group = type->GetTypeDescriptor()->GetMethodGroupByName(L "CastResult" . true );
if (! group ) continue ;
int count = group->GetMethodCount();
for ( int i = 0; i < count; i++) { auto method = group->GetMethod(i);
if (! method->IsStatic()) continue ;
if ( method->GetParameterCount() == 1 &&
method->GetParameter(0)->GetType()->GetTypeDescriptor() == description::GetTypeDescriptor<DescriptableObject>() &&
method->GetReturn()->GetTypeDescriptor() ! = description::GetTypeDescriptor< void >() ) {
symbol->typeInfo = CopyTypeInfo(method->GetReturn());
break ;
}
}
}
|
This method of refactoring code is called Guard Clauses
- Martin Fowler’s Refactoring website describes this as Replace Nested Conditional with Guard Clauses.
- There was also a Flattening Arrow Code article on Coding Horror that talked about this kind of refactoring
- StackOverflow also has related questions that say this way.
The idea here is to let the bad code come back first, make all the bad judgments out of the way, and then leave the normal code behind.
Call waiting welfare
1. Recently sorted out 20G resources, including product/operation/test/programmer/market, etc., and Internet practitioners [necessary skills for work, professional books on the industry, precious books on interview questions, etc.]. Access:
-
Scan the code of wechat to follow the public account “Atypical Internet”, forward the article to the moments of friends, and send the screenshots to the background of the public account to obtain dry goods resources links;
2. Internet Communication Group:
-
Pay attention to the public account “atypical Internet”, in the background of the public account reply “into the group”, network sharing, communication;
Extract as a function
Weibo, some say, the continue statement destroyed the reading of the code, I don’t think they must have read the code here, in fact, we can see that all the if statement is, in the case of judgment whether the error, so, in the maintenance of the code, you can completely ignore the if statement, because is error handling, The rest of the code is normal functional code, which is easier to read. Of course, there must be cases where this is not the case in the code above, so can we refactor without continuing?
Of course you can, take the function:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
|
bool CopyMethodTypeInfo( auto &method, auto &group, auto &symbol)
{
if (! method->IsStatic()) {
return true ;
}
if ( method->GetParameterCount() == 1 &&
method->GetParameter(0)->GetType()->GetTypeDescriptor() == description::GetTypeDescriptor<DescriptableObject>() &&
method->GetReturn()->GetTypeDescriptor() ! = description::GetTypeDescriptor< void >() ) {
symbol->typeInfo = CopyTypeInfo(method->GetReturn());
return false ;
}
return true ;
}
void ExpressionResolvings( auto &manager, auto &argument, auto &symbol)
{
int index = manager->expressionResolvings.Keys().IndexOf(argument.Obj());
if (index == -1) return ;
auto type = manager->expressionResolvings.Values()[index].type;
if ( types.Contains(type.Obj())) return ;
types.Add(type.Obj());
auto group = type->GetTypeDescriptor()->GetMethodGroupByName(L "CastResult" . true );
if (! group ) return ;
int count = group->GetMethodCount();
for ( int i = 0; i < count; i++) { auto method = group->GetMethod(i);
if (! CopyMethodTypeInfo(method, group, symbol) ) break ;
}
}
.
.
FOREACH(Ptr<WfExpression>, argument, node->arguments) {
ExpressionResolvings(manager, arguments, symbol)
}
.
.
|
You find that the code is easier to read and easier to maintain than it was before. Isn’t it?
Someone said, “If code isn’t shared, don’t extract it as a function!” Those who hold this view are too obsessed with reading. Functions are encapsulation or abstraction of code and are not necessarily used for code sharing. Functions mask details and allow other code to be coupled to interfaces rather than detailed implementations, which makes our code simpler. Simple things are easy to read and maintain. That’s what functions do.
Nested code outside of if
Another weibo user asked how the original code should be refactored if there is more code to execute after each if statement. Such as the following code.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
for (....) {
do_before_cond1()
if (cond1) {
do_before_cond2();
if (cond2) {
do_before_cond3();
if (cond3) {
do_something();
}
do_after_cond3();
}
do_after_cond2();
}
do_after_cond1();
}
|
The do_after_condX() in the above code is executed whether the condition succeeds or not. So, our flattened code looks like this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
|
for (....) {
do_before_cond1();
if ( !cond1 ) {
do_after_cond1();
continue
}
do_after_cond1();
do_before_cond2();
if ( !cond2 ) {
do_after_cond2();
continue ;
}
do_after_cond2();
do_before_cond3();
if ( !cond3 ) {
do_after_cond3();
continue ;
}
do_after_cond3();
do_something();
}
|
You’ll notice that there are two copies of do_after_condX above. If the code in the if statement block changes the state of some do_after_condX dependencies, this is the final version.
However, if they have no previous dependencies, we can keep only one, according to the DRY principle, and then drop directly to the if condition, as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
for (....) {
do_before_cond1();
do_after_cond1();
if ( !cond1 ) continue ;
do_before_cond2();
do_after_cond2();
if ( !cond2 ) continue ;
do_before_cond3();
do_after_cond3();
if ( !cond3 ) continue ;
do_something();
}
|
At this point, you might say, wow, I changed the order of execution and put conditions after do_after_condX(). Is that gonna be a problem?
In fact, if you look at the previous code again, you will see that cond1 originally determines whether do_before_cond1() is wrong and only executes if it succeeds. Do_after_cond1 () is executed anyway. Logically, do_after_cond1() has nothing to do with do_before_cond1(), whereas cond1 has everything to do with do_before_cond2(). If I make the line break look like this, the code logic will be clearer.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
for (....) {
do_before_cond1();
do_after_cond1();
if ( !cond1 ) continue ; // < cond1 becomes the condition for the second block
do_before_cond2();
do_after_cond2();
if ( !cond2 ) continue ; // < cond2 becomes the condition for the third block
do_before_cond3();
do_after_cond3();
if ( !cond3 ) continue ; //< cond3 becomes the condition for the fourth block
do_something();
}
|
As a result, when maintaining code in the future, the maintainer will know at a glance when the code will be executed and where it will be. At this point, you will find that the code is much cleaner if you break these blocks into functions:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
|
bool do_func3() {
do_before_cond2();
do_after_cond2();
return cond3;
}
bool do_func2() {
do_before_cond2();
do_after_cond2();
return cond2;
}
bool do_func1() {
do_before_cond1();
do_after_cond1();
return cond1;
}
// for-loop you can recreate this
for (...). {
bool cond = do_func1();
if (cond) cond = do_func2();
if (cond) cond = do_func3();
if (cond) do_something();
}
// for-loop can also be reconstituted like this
for (...). {
if (! do_func1() ) continue ;
if (! do_func2() ) continue ;
if (! do_func3() ) continue ;
do_something();
}
|
Above, I gave you two versions of for-loop. Which one do you prefer? I like the second one. At this point, because the code inside the for-loop is so simple, even if you don’t like continue, it’s already cheap to read.
Status check nesting
Next, let’s look at another example. The following code forges a scenario where two people are in a one-on-one chat room. The code might be written as “arrows” to check their status.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
|
int ConnectPeer2Peer(Conn *pA, Conn* pB, Manager *manager)
{
if ( pA->isConnected() ) {
manager->Prepare(pA);
if ( pB->isConnected() ) {
manager->Prepare(pB);
if ( manager->ConnectTogther(pA, pB) ) {
pA->Write( "connected" );
pB->Write( "connected" );
return S_OK;
} else {
return S_ERROR;
}
} else {
pA->Write( "Peer is not Ready, waiting..." );
return S_RETRY;
}
} else {
if ( pB->isConnected() ) {
manager->Prepare();
pB->Write( "Peer is not Ready, waiting..." );
return S_RETRY;
} else {
pA->Close();
pB->Close();
return S_ERROR;
}
}
//Shouldn't be here!
return S_ERROR;
}
|
Refactoring the above code, we can first analyze the above code, it shows that the above code is “connected” and “unconnected” to make a combination of “state” (note: the actual state should be more complicated than this, there may be “disconnected”, “error”…… So, we can write the code as follows, merging the above nested conditions and making a judgment for each combination. In this way, the logic is very clean and clear.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
|
int ConnectPeer2Peer(Conn *pA, Conn* pB, Manager *manager)
{
if ( pA->isConnected() ) {
manager->Prepare(pA);
}
if ( pB->isConnected() ) {
manager->Prepare(pB);
}
// pA = YES && pB = NO
if (pA->isConnected() && ! pB->isConnected() ) {
pA->Write( "Peer is not Ready, waiting" );
return S_RETRY;
// pA = NO && pB = YES
} else if ( !pA->isConnected() && pB->isConnected() ) {
pB->Write( "Peer is not Ready, waiting" );
return S_RETRY;
// pA = YES && pB = YES
} else if (pA->isConnected() && pB->isConnected() ) {
if (! manager->ConnectTogther(pA, pB) ) {
return S_ERROR;
}
pA->Write( "connected" );
pB->Write( "connected" );
return S_OK;
}
// pA = NO, pB = NO
pA->Close();
pB->Close();
return S_ERROR;
}
|
Extension of thinking
In general, if-else statements check for two things: errors and status.
Check for errors
Using Guard Clauses is a standard solution for checking for errors, but there are a few things to be aware of:
1) Of course, when something goes wrong, there will be situations where resources need to be released. You can use Goto Fail; This approach, but the most elegant approach is the C++ object-oriented RAII approach.
2) Returning with error codes is a relatively simple way, which has some problems. For example, if there are too many error codes, it will be very complicated to identify the wrong codes. In addition, the normal codes and the wrong codes will be mixed together, affecting the readability. So, in higher-group languages, try-catch catches make code more readable.
Check the status
There must be more complicated cases for checking state, such as the following:
1) State changes of both ends in TCP protocol.
2) Various combinations of command options like shell commands.
3) Like state changes in the game (a very complex state tree).
4) State changes like grammatical analysis.
For these complex state changes, you need to define a state machine, or a query table for a combination of sub-states, or a state query analysis tree.
Control state or business state is one of the most important things that can mess up your code flow when writing code, and one of the most important things to do in refactoring “arrow” code is to reorganize and describe the relationship between these states.
conclusion
So, to summarize, there are several ways to refactor “arrow” code:
1) Use Guard Clauses. As much as possible, let the error return first, so that you get clean code later.
2) Extract the statement blocks from the conditions into functions. Someone said, “If code isn’t shared, don’t extract it as a function!” Those who hold this view are too obsessed with reading. Function code encapsulation or abstract, is not necessarily used as code sharing, function is used to shield the details, let the other code coupling in interfaces rather than implementation details, which makes our code more simple, simple things can make people easy to read and easy to maintain, write person of sell one’s own things is easy to read easy maintenance code is the original intention of refactor the code!
3) For error handling, use try-catch exception handling and RAII mechanism. Error handling of return codes has many problems, such as: A) return codes can be ignored, B) error handling code is mixed with normal code, and C) it causes function interface contamination, such as bad functions like atoi() that share error codes and return values.
4) For the judgment and combination of multiple states, if complex, can use “combined state table”, or state machine plus Observer state subscription design pattern. Such code is decoupled, clean and simple, and also highly extensible.
5) Refactoring “arrow” code helps you reorganize all your code and logic, and is well worth the effort. The process of rethinking the code in any way possible can be a growth process.