When defining a variable in everyday programming, there are only two things you can control: its type and its name. To some extent, these two test the developer’s math skills and language skills respectively. Today, even with very sophisticated type systems, badly chosen variable names are still a frequent source of frustration for developers. So, what theory can we use to guide variable naming?

In the early days of computer science, there was no clear line between variable names and variable types. For example, the famous “Hungarian notation” advocates packing attributes + type + description into the variable name: const int Max, for example, becomes c_i_max. Now that programming languages have evolved, this practice naturally seems awkward. For today’s mainstream languages, it’s safe to assume that a variable name should be a pure “description of the variable.”

So, how do you describe a variable? At this point our focus shifts from type theory, which belongs to the natural sciences, to literary composition, which belongs to the humanities :) When naming a variable, all we actually need to do is describe an abstract thing concisely. That is not easy, and “naming” falls squarely within the domain of natural language. You may be able to write a bug-free red-black tree on a whiteboard, but have you ever written a high school essay in your native language that earned full marks?

So natural language actually plays a very important role in programming; code is written for people, after all. I believe most readers subconsciously follow the logic of code in the natural language they grew up with, and only then work through the details using the grammar rules of a programming language they learned in class yesterday. Programming languages that claim to be concise but are full of symbols are a counterexample here: the code they produce may be very friendly to a geek who loves deriving formulas, but to the average person it may look like gibberish at first glance. Of course, there are also examples of overcorrection: for languages that encourage extremely long variable names, maintaining the code means… holding your nose a little.

At this point, we have summarized some of the subtleties of variable naming:

  • Variable naming is separate from the type system.
  • Variables are named in natural language.
  • When naming is too succinct, the readability is poor.
  • When naming is too long, the readability is poor.

Does that sound subjective and hard to quantify? That brings us to our topic: naming is hard to quantify, but we can work out some general rules based on the grammar of natural language 🙂

Grammar was one of the things I hated in high school. You could get by almost everywhere without learning it, and if you did learn it, you might just confuse yourself and end up misapplying it… However, grammar is much like the type system of a natural language, and it is subtly related to the corresponding concepts in a programming language. Keywords in JavaScript, for example, can be roughly classified as follows:

  • Nouns: function, var, class
  • Verbs: import, export, extends, return, break, continue, delete, switch, new, try, catch, throw, yield
  • Prepositions: for, in, else
  • Conjunctions: if, while
  • Pronouns: this

Notice that there are significantly more verbs than nouns? Furthermore, programming languages have many function words, such as prepositions and conjunctions, to express control flow. HTML, by contrast, is all about nouns: head, body, form, button! HTML is nothing more than a stack of nouns, so it has fewer naming-related maintenance problems than programming languages with complex logic. So which natural language rules help improve readability and maintainability when maintaining code in a programming language? Here the author summarizes the following points:

  • Note the semantic matching of names and types
  • Maintain conceptual consistency
  • Avoid creating structures that are counter-intuitive to language
  • Use broad abstractions with caution

Let’s look at them one by one :D

Note the semantic matching of names and types

As we already know, there is a subtle relationship between the part of speech of a word and the type of a variable. Keeping this relationship matched when naming improves readability. Note that this does not mean writing the type into the variable name as Hungarian notation does, but rather choosing the name that best fits the type:

  • Variables of Boolean type are usually named in the form isSomething / hasSomething. This is very similar to declarative statements in natural language: a statement is either affirmative or negative, which maps onto the Boolean values true and false.
  • Variables of array type usually have plural names. For example, apples in natural language means more than one apple, which fits the mental model of an array very well.
  • Variables of object type are often named with noun phrases, for example a class such as UserModel. Remember, both nouns and verbs are content words, but one is more “object-oriented” and the other more “functional.”
  • Variables of function type are usually named with a verb, or rather with something like an imperative sentence. A function should explicitly “do something”, and “doing what it says” is much simpler and more honest. For example, try prefixing all of your class methods with Please: getUserData() then reads seamlessly as the imperative sentence “Please get user data”. Of course, in functional programming, functions are combined in all sorts of elaborate ways, and all sorts of function words can be used to dress them up. A name such as withState implies the ability to “decorate” its parameter. In the many callback scenarios, prepositions like on / before / after are also commonly used to indicate when the corresponding action is triggered.
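The four cases above can be sketched in a few lines of JavaScript. Every identifier here is hypothetical and chosen only to illustrate how the part of speech hints at the type; it is a sketch, not a style guide.

```javascript
// All names below are hypothetical, chosen only to illustrate the pattern.

// Boolean: reads like a statement that is either true or false.
const isLoggedIn = true;
const hasUnreadMessages = false;

// Array: a plural noun, because it holds more than one item.
const apples = ["fuji", "gala"];

// Object: a noun phrase.
const userProfile = { name: "Ada", age: 36 };

// Function: a verb phrase that reads like an imperative sentence.
function getUserName(user) {
  return user.name;
}

// Higher-order function: the function word "with" signals decoration.
function withGreeting(fn) {
  return (user) => `Hello, ${fn(user)}`;
}

// Callback: the preposition "on" signals when it is triggered.
function onLogin(user) {
  return `${getUserName(user)} logged in`;
}

const greetUser = withGreeting(getUserName);
console.log(greetUser(userProfile)); // Hello, Ada
console.log(onLogin(userProfile)); // Ada logged in
```

Reading the last two lines aloud ("greet user", "on login") is a quick check that the names still sound like natural language.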

Different programming languages and paradigms can vary dramatically in their choice of vocabulary. Java, for example, is a kingdom of nouns, whereas in languages where functions are first-class citizens, verbs hold a much higher position. The style advice here is to speak the local dialect, so that you can have a good time wherever you go XD

Maintain conceptual consistency

With linters and formatting tools, line breaks and indentation are generally uniform across a project, which really does eliminate the ragged look of inconsistently formatted code. However, the following is still common in many projects:

  • Three similar operations are wrapped in three different functions, loadUser / fetchUser / getUser. They all seem to do the same thing, but would you dare swap one for another?
  • One place has elements.forEach(element), and another has elements.forEach(elem). An index can go by four different abbreviations: i / idx / inx / ind 🙂
  • Private variables in existing modules are written as $xxx, while newly added code declares them in a new format, __xxx.
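One possible way to resolve the first bullet is to give each verb exactly one meaning and to write that meaning down. The convention below is an assumption made up for this sketch, not an established standard, and fetchUser only simulates a remote call.

```javascript
// Hypothetical team convention, written down once and applied everywhere:
//   getX   -> synchronous, reads from memory only
//   fetchX -> asynchronous, always asks the (here: simulated) remote source
//   loadX  -> asynchronous, tries the cache first and falls back to fetchX
const userCache = new Map();

function getUser(id) {
  return userCache.get(id);
}

async function fetchUser(id) {
  // Stand-in for a real network request.
  return { id, name: `user-${id}` };
}

async function loadUser(id) {
  const cachedUser = getUser(id);
  if (cachedUser !== undefined) {
    return cachedUser;
  }
  const user = await fetchUser(id);
  userCache.set(id, user);
  return user;
}

loadUser(7).then((user) => console.log(user.name, getUser(7) === user));
```

With the three meanings fixed, a reader can tell from the name alone whether a call may hit the network, which is exactly the information the inconsistent version hides.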

Taken individually, these problems can hardly be called serious, and the corresponding solutions are platitudes. But if such inconsistencies pervade a project’s code base, no amount of neat indentation can hide the clutter. A non-technical tip here is to set up a code review process: these issues are difficult to catch with lint tools alone, while raising them in review comments and fixing them is not hard. The key is having the process:

  • It is hard for newcomers to pick up all the implicit conventions at first. Practices that long-time members know by heart are unfamiliar pitfalls for newcomers.
  • Once the code is merged into the trunk, the cost of modifying it rises dramatically. Who takes the blame if a refactoring breaks old code that was running fine?

Of course, there are many other benefits of Code Review that won’t be covered here.

Avoid creating structures that are counter-intuitive to language

We’ve already mentioned that intuition plays a subtle role in coding. So what is counter-intuitive? Small things like these:

  • A double negative is a positive, so should we be comfortable with 2N negatives? Of course the machine correctly understands a condition like if (!dataNotLoaded !== false), but I’m afraid it’s easy for you and others to get tangled up in it, especially when it is combined with AND / OR conditions :-p
  • It is common practice to “encapsulate” data, but when data.data shows up, how do you tell which one a data variable elsewhere in the code refers to?
  • Forms of control flow such as deep nesting and fancy jumps are also counter-intuitive, but this goes beyond variable naming XD
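To make the first bullet concrete, here is a minimal sketch showing that the stacked-negation condition and a positively named flag have the same truth table. The function and flag names are hypothetical.

```javascript
// Hypothetical flag and function names, for illustration only.

// Hard to read: a negation applied to a negatively named flag,
// then compared against false.
function shouldRenderConfusing(dataNotLoaded) {
  return !dataNotLoaded !== false;
}

// Same logic with a positively named flag.
function shouldRender(isDataLoaded) {
  return isDataLoaded;
}

// Both versions agree for every possible input.
for (const isDataLoaded of [true, false]) {
  const dataNotLoaded = !isDataLoaded;
  console.log(
    shouldRenderConfusing(dataNotLoaded) === shouldRender(isDataLoaded)
  ); // true
}
```

The machine does not care which version it runs; the second one simply spares the reader from counting negations.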

Use broad abstractions with caution

Thanks to the rich abstractions of natural language, some variable names are always available for the lazy. For example, if you can’t think of a name for a variable that holds data, call it data. If you don’t know what to call a base class, call it Base.

In fact, if you use such a broad concept because you “didn’t figure out what to call it,” you’re implicitly taking on technical debt for “not having a clear and correct design.” For example, when extending or refactoring, you might encounter problems like this:

  • data is everywhere, so you cannot simply find and replace to refactor, and even renaming a variable with the IDE makes you nervous because you are unsure of the scope.
  • With all kinds of names shaped like core / base / common, it’s hard to tell where the boundaries are: what does core do, what side effects do I get from inheriting from it, and should I put my new functions in it?

For variables that hold data, names like data / item say about as much as no name at all. Of course, this is fine for reusable utility functions. The actions of functions are relatively easy to describe, but if you want to name every callback callback, well, that’s fine too…
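As a rough illustration, here is the same filter written twice. The invoice domain and all identifiers are invented for this sketch; the point is only how much the specific names tell a reader compared with data / item / callback.

```javascript
// Vague version: every name could mean anything.
function handle(data, callback) {
  return data.filter((item) => callback(item));
}

// Specific version: hypothetical domain names that say what is going on.
function selectOverdueInvoices(invoices, isOverdue) {
  return invoices.filter((invoice) => isOverdue(invoice));
}

const invoices = [
  { id: 1, daysLate: 0 },
  { id: 2, daysLate: 12 },
];

const overdueInvoices = selectOverdueInvoices(
  invoices,
  (invoice) => invoice.daysLate > 0
);
console.log(overdueInvoices.map((invoice) => invoice.id)); // [ 2 ]
```

Both functions behave identically. The vague one would be a reasonable name for a generic utility, but in business code a search for data matches everything, while a search for overdueInvoices matches only what you are looking for.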

In fact, many of the above problems can be caught through introspection, as long as you double-check your work before submitting. This is also a very important skill: figuring out what you’re not doing well is progress in itself.

Conclusion

The process of abstracting concepts is often one of the most fun parts of programming, and good variable naming helps to express the thought process more clearly. In this view, the difficulty of naming variables has no direct relationship with whether the type system is strong or weak, static or dynamic: someone who is a master at math is not necessarily good at language, and vice versa. We have also suggested some coding practices based on a few very simple rules of natural language. And this article can provide you with a strong counterargument the next time a review comment disses your variable names:

It’s hard to fully understand the type systems of programming languages, let alone natural languages.