Preface

A few days ago I published a blog post about memory address alignment in C, which sparked a heated discussion. Admittedly, the examples in that post were not well chosen and contained a number of loose points, and I received a lot of good advice from readers in the comments. Many readers also pointed out that the technique is too tricky for ordinary project development. Indeed, without a solid foundation, and in particular a deep understanding of the GCC compiler, it is easy to get wrong.

However, the purpose of that post was to introduce the use of memory alignment to store extra information, so that if you come across this usage in someone else's code in the future you will not be confused. At the very least, you will remember that someone on OSC introduced it some time ago. Also note that the Thinking series is based on GNU GCC; other compilers may behave differently.

Today we are going to talk about the C equality operator (==), which may strike many of you as an odd choice. What is there to say about it? If the operands are equal it yields 1, otherwise it yields 0: 1 == 1 yields 1, and 1 == 2 yields 0.

Is that really the case? Let’s start with a simple example.

Example

Consider the following code:

    int a = 10;
    long b = 12;

    (void)(a == b);
    (void)(&a == &b);

We define two variables, a and b, of type int and long respectively. Line 04 casts the value of the expression (a == b) to void. The statement has no effect; the cast simply discards the result of the comparison. Line 05 does the same with (&a == &b). The only difference is that the operand types of the equality operator change from (int, long) to (int *, long *).

Let's enable the compiler's warnings by compiling the program with gcc -Wall and see what happens.

The result is a little surprising. Line 05 triggers a warning, which reads as follows:

warning: comparison of distinct pointer types lacks a cast

The warning says that distinct pointer types are being compared. Yet the only difference between lines 04 and 05 is the type of the operands: line 04 compares (int, long), while line 05 compares (int *, long *). By that reasoning both lines should produce a warning, because line 04 also compares different data types. So why does line 05 produce a warning while line 04 does not?

Analysis

When we run into a problem like this, how should we go about investigating it? Here is my humble opinion.

When faced with such problems, the source code is without doubt the best place to look: it is the most complete source of information and the least likely to mislead. GCC is a fairly large project, and reading it requires a solid grounding in compiler principles. This is where the difference between having and not having a formal computer science education shows, and it is also a gentle warning to students still at university: do not neglect your computer science fundamentals for the sake of a little project experience. It is really not worth it.

Next, I will briefly describe how I approach the analysis. Going from the abstract to the concrete is my consistent approach: start by understanding the GCC architecture as a whole, then focus on the specific issue at hand. The hardest part is locating the relevant source code, and there are two cases:

1) If you are very familiar with the overall architecture and the compilation process, you can follow the process step by step and track down the exact source code responsible;

2) If you only know the overall architecture, you cannot locate the source of the problem immediately, but you can locate a superset of it. For the example in this article, I did not know at first where the relevant source was, but I could at least be sure that the problem arises in the AST construction phase. Knowing this, we can use a few tricks to analyze that phase, which reduces the size of the problem.

Here is a little trick for locating the relevant source quickly when a problem like this comes up: search the source tree for the text of the message, for example grep -r 'comparison of distinct pointer types' *. The string may appear in several places. We can then use the earlier analysis of the phase in which the problem occurs to rule out candidates step by step, narrow the scope, and finally pinpoint the source code.

For this article, the relevant source is as follows (only the code relevant to this analysis is listed; the rest is omitted):

    /***
     * Notes:
     * file location: gcc/c/c-typeck.c
     * function: tree build_binary_op (location_t location, enum tree_code code,
     *                                 tree orig_op0, tree orig_op1, int convert_p)
     */
    case EQ_EXPR:
    case NE_EXPR:
      ...
      if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
           || code0 == FIXED_POINT_TYPE || code0 == COMPLEX_TYPE)
          && (code1 == INTEGER_TYPE || code1 == REAL_TYPE
              || code1 == FIXED_POINT_TYPE || code1 == COMPLEX_TYPE))
        short_compare = 1;
      ...
      else if (code0 == POINTER_TYPE && code1 == POINTER_TYPE)
        {
          tree tt0 = TREE_TYPE (type0);
          tree tt1 = TREE_TYPE (type1);
          addr_space_t as0 = TYPE_ADDR_SPACE (tt0);
          addr_space_t as1 = TYPE_ADDR_SPACE (tt1);
          addr_space_t as_common = ADDR_SPACE_GENERIC;

          /* Anything compares with void *.  void * compares with anything.
             Otherwise, the targets must be compatible
             and both must be object or both incomplete.  */
          if (comp_target_types (location, type0, type1))
            result_type = common_pointer_type (type0, type1);
          else if (! addr_space_superset (as0, as1, &as_common))
            {
              error_at (location, "comparison of pointers to "
                        "disjoint address spaces");
              return error_mark_node;
            }
          else if (VOID_TYPE_P (tt0))
            {
              if (pedantic && TREE_CODE (tt1) == FUNCTION_TYPE)
                pedwarn (location, OPT_Wpedantic, "ISO C forbids "
                         "comparison of %<void *%> with function pointer");
            }
          else if (VOID_TYPE_P (tt1))
            {
              if (pedantic && TREE_CODE (tt0) == FUNCTION_TYPE)
                pedwarn (location, OPT_Wpedantic, "ISO C forbids "
                         "comparison of %<void *%> with function pointer");
            }
          else
            /* Avoid warning about the volatile ObjC EH puts on decls.  */
            if (! objc_ok)
              pedwarn (location, 0,
                       "comparison of distinct pointer types lacks a cast");

          if (result_type == NULL_TREE)
            {
              int qual = ENCODE_QUAL_ADDR_SPACE (as_common);
              result_type = build_pointer_type
                              (build_qualified_type (void_type_node, qual));
            }
        }

 

When both operands are INTEGER_TYPE (long is shorthand for long int, so int and long both qualify), a single line handles the case:

    short_compare = 1;

When both operands are POINTER_TYPE, several checks are made in turn:

    comp_target_types (location, type0, type1)
    addr_space_superset (as0, as1, &as_common)
    VOID_TYPE_P (tt0)
    VOID_TYPE_P (tt1)

As you can see, the first function, comp_target_types, checks whether the two pointer operands point to compatible data types. If none of the conditions above is met, the compiler emits the warning:

    pedwarn (location, 0,
             "comparison of distinct pointer types lacks a cast");
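The decision order in the pointer branch can be summarised with a toy model. This is only a sketch under simplifying assumptions: the names toy_kind, toy_result and toy_compare_pointers are hypothetical and are not GCC types or functions, target compatibility is reduced to simple equality, and address spaces and the pedantic function-pointer case are left out.

```c
/* Toy stand-ins for the types that pointers point to (NOT GCC's tree codes). */
enum toy_kind { TOY_INT, TOY_LONG, TOY_VOID };

/* Outcome of the modelled check. */
enum toy_result { TOY_OK, TOY_WARN_DISTINCT_POINTERS };

/* Models only the order of checks in the POINTER_TYPE branch:
   1. compatible targets are accepted (stand-in for comp_target_types);
   2. void * compares with anything;
   3. everything else gets the "distinct pointer types" diagnostic. */
static enum toy_result toy_compare_pointers(enum toy_kind target0,
                                            enum toy_kind target1)
{
    if (target0 == target1)
        return TOY_OK;
    if (target0 == TOY_VOID || target1 == TOY_VOID)
        return TOY_OK;
    return TOY_WARN_DISTINCT_POINTERS;
}
```

In this model, comparing an int * with a long * falls through to the last line, which corresponds to the warning seen on line 05 of the example.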

 

From the above we can see that GCC handles the equality operator EQ_EXPR differently depending on the operand types. This source code explains the behaviour observed in our example.

Note 1: int and long may look like two different data types to us, but to GCC both are INTEGER_TYPE;

Note 2: When time permits, I will explore the GCC architecture with you in more detail.
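The two cases analysed in this section can be reproduced with a pair of small functions. This is a minimal sketch; same_value and same_address are hypothetical names chosen for illustration. The first relies on both operands being integer types, so the int operand is converted to long before the comparison. The second shows the usual way to compare addresses of differently typed objects without the diagnostic: cast both sides to a void pointer, which "compares with anything" according to the comment in the GCC source quoted above.

```c
#include <stdbool.h>

/* Both operands are integer types: a is converted to long and the values
   are compared. No diagnostic is expected for this comparison. */
static bool same_value(int a, long b)
{
    return a == b;
}

/* Writing p == q directly would compare int * with long * and draw the
   "comparison of distinct pointer types lacks a cast" warning.
   Casting both sides to const void * makes the operand types identical. */
static bool same_address(const int *p, const long *q)
{
    return (const void *)p == (const void *)q;
}
```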

Application

When both operands are of pointer type, the compiler checks the data types that the pointers point to before comparing the values of the operands.

Given this feature of the GCC compiler, it’s natural to wonder, what is the use of this feature? Let’s look at a simple application of this feature.

    #define max(x,y) ({ \
        typeof(x) _x = (x); \
        typeof(y) _y = (y); \
        (void) (&_x == &_y); \
        _x > _y ? _x : _y; })

 

This macro computes the maximum of two values. It first copies x and y into the temporaries _x and _y, and then makes use of the property of the equality operator described above. We know that the biggest difference between macros and functions is that macros are plain textual substitutions with no static type checking of their arguments, so errors are hard to spot when they occur.

Line 04 of the macro compares the addresses of _x and _y, which makes the compiler check whether x and y have the same data type. When the types differ, the compiler emits a warning. This guards against computing the maximum of values of different data types, so we can use a macro and still keep our code robust.
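A short usage sketch of this technique follows. It assumes GCC or a compatible compiler, since it relies on the GNU statement-expression extension. The macro is renamed max_checked and spelled with __typeof__ (the alternate keyword GCC also accepts in strict ISO modes); both choices, and the wrapper functions larger_int and larger_long, are illustrative assumptions and not part of the original macro.

```c
/* Type-checked maximum, following the macro discussed above.
   GNU extensions used: statement expressions ({ ... }) and __typeof__. */
#define max_checked(x, y) ({        \
    __typeof__(x) _x = (x);         \
    __typeof__(y) _y = (y);         \
    (void) (&_x == &_y);            \
    _x > _y ? _x : _y; })

/* Same types on both sides: the pointer comparison is between two int *
   (or two long *), so the compiler has nothing to complain about. */
static int larger_int(int a, int b)
{
    return max_checked(a, b);
}

static long larger_long(long a, long b)
{
    return max_checked(a, b);
}

/* Mixing types, e.g. max_checked(some_int, some_long), would compare
   int * with long * and is expected to draw the
   "comparison of distinct pointer types lacks a cast" diagnostic. */
```

Because each argument is copied into a temporary exactly once, an argument with side effects such as i++ is also evaluated only once, which a naive ((x) > (y) ? (x) : (y)) macro does not guarantee.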

This application shows that when we need to check whether two variables have the same data type, we can use the == operator to compare pointers to them and let the compiler check the types that the pointers point to.
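The same check can be attached to other macros. The following sketch applies it to a swap; swap_checked and smaller_of are hypothetical names used only for this illustration, and __typeof__ is again the GNU extension. Both arguments must be lvalues, because the macro takes their addresses.

```c
/* Swap two lvalues, asking the compiler to check that their types match.
   The discarded comparison &(a) == &(b) exists only for its type check. */
#define swap_checked(a, b) do {     \
    __typeof__(a) _tmp = (a);       \
    (void) (&(a) == &(b));          \
    (a) = (b);                      \
    (b) = _tmp;                     \
} while (0)

/* Usage example: order two ints and return the smaller one. */
static int smaller_of(int a, int b)
{
    if (a > b)
        swap_checked(a, b);
    return a;
}
```

Calling swap_checked with an int and a long would compare int * with long *, so the mismatch is reported at compile time instead of silently truncating a value at run time.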

Conclusion

This article started with a simple example showing that GCC handles equality expressions differently when both operands are of pointer type, then explained from the GCC source code why this happens, and finally introduced a simple application of the behaviour. The analysis shows that the better we understand our compilers, the more efficient and robust the code we can write.

One more aside: rather than spending time on things of limited significance, such as the more elaborate features and syntax of C++, we should spend more time understanding the systems we use (here the system includes the compiler, the OS, the processor, and so on) in order to write efficient code. There is a great deal about these systems that we do not yet know, and much to think about and explore.
