Summary: This article breaks away from the conventional view of the command line, takes the concept apart and reassembles it, and arrives at a command line parsing method that is powerful yet extremely simple to use. The method supports any number of subcommands and both optional and mandatory parameters, and lets optional parameters carry default values. Configuration files, environment variables, and command line parameters can be used together, with priority increasing in that order.

Author | g | Source | Ali Technology official account

Overview

Command-line parsing is something almost every back-end programmer uses, but compared with business logic it is a trivial detail. If the requirements are simple, command-line processing is easy and any back-end programmer can do it without effort. The Go standard library provides the flag package for this purpose.

However, when we want to enrich our command line slightly, things start to get more complicated. For example, we need to think about how to deal with optional and required options, how to set default values for optional options, how to deal with subcommands and subcommands of subcommands, how to deal with arguments to subcommands, and so on.

At present, cobra is the most widely used and most powerful command line parsing library in Go. However, its rich feature set also makes cobra considerably more complex to use than flag. To reduce that complexity, cobra even provides a code generator that creates the skeleton of a command line program automatically. Code generation saves development time, but it makes the code less intuitive.

This article breaks away from the conventional view of the command line, takes the concept apart and reassembles it, and arrives at a command line parsing method that is powerful yet extremely simple to use. The method supports any number of subcommands and both optional and mandatory parameters, and lets optional parameters carry default values. Configuration files, environment variables, and command line parameters can be used together, with priority increasing in that order.

Existing command line parsing methods

The Go standard library package flag provides a very simple way to parse the command line. After defining the command line parameters, you only need to call flag.Parse.

// demo.go
var limit int
flag.IntVar(&limit, "limit", 10, "the max number of results")
flag.Parse()
fmt.Println("the limit is", limit)

$ go run demo.go
the limit is 10
$ go run demo.go -limit 100
the limit is 100

As you can see, the flag package is very simple to use. Once the command line parameters are defined, a single call to flag.Parse parses them. When defining a parameter, you can specify its default value and a usage description.

The flag package has no built-in notion of subcommands. You can parse the subcommands yourself, but most developers simply use cobra instead.

Here is how cobra is used, taken from its official example:

package main

import (
  "fmt"
  "strings"

  "github.com/spf13/cobra"
)

func main() {
  var echoTimes int

  var cmdPrint = &cobra.Command{
    Use:   "print [string to print]",
    Short: "Print anything to the screen",
    Long: `print is for printing anything back to the screen.
For many years people have printed back to the screen.`,
    Args: cobra.MinimumNArgs(1),
    Run: func(cmd *cobra.Command, args []string) {
      fmt.Println("Print: " + strings.Join(args, " "))
    },
  }

  var cmdEcho = &cobra.Command{
    Use:   "echo [string to echo]",
    Short: "Echo anything to the screen",
    Long: `echo is for echoing anything back.
Echo works a lot like print, except it has a child command.`,
    Args: cobra.MinimumNArgs(1),
    Run: func(cmd *cobra.Command, args []string) {
      fmt.Println("Echo: " + strings.Join(args, " "))
    },
  }

  var cmdTimes = &cobra.Command{
    Use:   "times [string to echo]",
    Short: "Echo anything to the screen more times",
    Long: `echo things multiple times back to the user by providing
a count and a string.`,
    Args: cobra.MinimumNArgs(1),
    Run: func(cmd *cobra.Command, args []string) {
      for i := 0; i < echoTimes; i++ {
        fmt.Println("Echo: " + strings.Join(args, " "))
      }
    },
  }

  cmdTimes.Flags().IntVarP(&echoTimes, "times", "t", 1, "times to echo the input")

  var rootCmd = &cobra.Command{Use: "app"}
  rootCmd.AddCommand(cmdPrint, cmdEcho)
  cmdEcho.AddCommand(cmdTimes)
  rootCmd.Execute()
}

You can see that the addition of subcommands makes the code slightly more complex, but the logic is still clear, and the subcommands follow the same definition template as the commands, and they can also define their own subcommands.

$ go run cobra.go echo times hello --times 3
Echo: hello
Echo: hello
Echo: hello

Cobra is widely recognized for its power and clarity. However, there are two issues that I am not satisfied with. They are small, but they stick with me and make me sad.

1 Parameter definition is separated from command logic

As the definition of the --times flag above shows, the flag definition is separated from the command logic (the Run function). When we have a large number of subcommands, we tend to put the command definitions into different files and directories. The command definitions then end up scattered, while the flag definitions of all commands stay together in one place.

Of course, cobra can solve this problem too: move the flag definitions from the main function into init functions, and split the init functions up together with the subcommand definitions. For example, the times subcommand is defined in times.go, and that file also has an init function that defines the flags of times. However, with many flags this requires a large number of global variables, which is a pain for anyone who wants clean, concise code with no side effects.

Why not put the flag definitions inside the command function, as the flag package does? That would make the code more compact and the logic more intuitive.

// Why can't I write it like this?
func times() {
  cobra.IntVarP(&echoTimes, "times", "t", 1, "times to echo the input")
  cobra.Parse()
}

I believe anyone who thinks about it will see the problem: the times function can only be called after the command line has been parsed, which requires the flags to be defined beforehand. If the flags are defined inside times, they can only be parsed once times has been called. That sounds as unreasonable as asking a phone case to change color with the phone's theme. But is it really so?

2 The order of child commands and parent commands is not flexible

When designing subcommands there are two common orders: cmd {resource} {action} and cmd {action} {resource}. When the actions are largely the same across resources, the action is often chosen as the subcommand. Kubernetes, for example, does this: kubectl get pods …, kubectl get deploy …. When the actions differ greatly between resources, the resource is often chosen as the subcommand, as in the aliyun command line tool: aliyun ecs …, aliyun ram …

In real development, it may be difficult at first to decide which action or resource is better as a subcommand, and this choice can be more difficult when there are multiple subcommands.

When not using any library, developers may choose to initialize the related resources in the parent command and execute the business logic in the child command, which makes it very difficult to swap the parent and child commands. This is in fact flawed logic: invoking a subcommand does not necessarily mean invoking the parent command. For a command line tool the process exits as soon as the command has run, so resources initialized by the parent command are not going to be reused across subcommands anyway.

Cobra's design avoids this flawed logic: each subcommand provides its own Run function, which should cover the whole life cycle of a resource, from initializing it, to executing the business logic, to destroying it. However, cobra still requires the parent command to be defined, that is, the echo command must be defined before the echo times child command can be. In many scenarios the parent command has no logic of its own, especially when a resource is used as the parent command; its only job is to print the usage of the command.

Cobra makes the definition of child and parent commands very simple, but swapping the parent and child commands still requires modifying the links between them. Is there a way to make this process easier?

Understand the CLI again

There are many terms related to the command line, such as arguments, flags, and options. Cobra's design is based on the following definitions: commands represent actions, args are things, and flags are modifiers for those actions.

In addition, more concepts are extended based on these definitions, such as persistent flags for all subcommands, local flags for only the current subcommand, required flags for mandatory flags, and so on.

These definitions are the core of cobra's design, and we need to revisit them to address the two issues I mentioned above. To do this, let’s start from the beginning with a step-by-step analysis of what a command line is.

1 A command line is just a string that can be parsed and executed by the shell

$ cmd arg1 arg2 arg3

The command line and its arguments are essentially a string. Strings are interpreted by the shell. For the shell, a command line consists of commands and arguments separated by whitespace characters.

Anything else? No. There are no parent commands, no child commands, no persistent flags, and no local flags. It does not matter whether an argument starts with a double dash (--), a single dash (-), or any other character: they are all just strings that the shell passes to the program being executed, which finds them in os.Args (in Go).

2 Arguments, flags, and options

As the description above shows, an argument is simply the name we give to each whitespace-delimited string that follows the command. An argument can, however, carry different meanings on the command line.

Arguments that start with a dash or double dash look a bit special. Combined with code, this kind of argument has a unique purpose: associating a value with a variable in the code. This kind of argument is called a flag. Recall that os.Args holds many arguments, none of which is directly related to a variable in the program. What a flag provides is essentially a key-value pair: by associating the key with a variable in our code, we can assign the value to that variable.

flag.IntVar(&limit, "limit", 10, "the max number of results")
// Variable binding: the value given for -limit on the command line is assigned to the variable limit

Flags let us assign values to variables in code directly from the command line. This raises a new question: if no value is given for a variable, can the program still run? If it cannot, the argument (a flag is a special kind of argument) is mandatory; otherwise it is optional. Another possibility is that the command line defines several variables and the program can run as long as any one of them has a value, that is, as long as one of several flags is specified. In this sense, these flags or arguments can also be called options.

From this analysis we find that the concepts of argument, flag, and option are intertwined, with meanings that both differ and overlap. A flag is an argument that starts with a dash, and the argument after the flag name (if any) is the flag's value. These arguments may be mandatory, optional, or one of several alternatives, so they are also called options.

3 Subcommands

From the analysis above it is easy to conclude that a subcommand is also just a special argument. It looks like any other argument (unlike a flag, it does not start with a dash), but it triggers a particular action or function (any action can be wrapped in a function).

Comparing flags and subcommands reveals an unexpected parallel: a flag is associated with a variable, and a subcommand is associated with a function. They serve the same purpose. Just as the argument after a flag is the value of a variable, all the arguments after a subcommand are the parameters of a function (not function parameters in the language-level sense).

A more interesting question is why a flag needs to start with a dash. Without the dash, can it still be associated with a variable? Obviously it can, because subcommands have no dash, and associating a variable is no different from associating a function. In essence, the association is made through the name of the flag or subcommand. So what does the dash do?

Whether a name is associated with a variable or with a function is determined by the name itself, which is predefined in the code. Even without the dash, we can still distinguish flags from subcommands and complete the association with a variable or a function.

For example:

for _, arg := range os.Args {
  switch arg {
  case "limit": // set the limit variable
  case "scan": // call the scan function
  }
}

So the dash plays no special role in the core functionality; it mainly improves readability. Note, however, that although we do not strictly need the dash, once we have it we can use it for additional purposes. For example, in netstat -lnt, -lnt is syntactic sugar for -l -n -t.

4 Command line structure

After the above analysis, we can assign different concepts to the command line parameters

  • Flag: an argument starting with a dash or double dash. A flag consists of a flag name and a flag argument: --flagname flagarg

  • Non-flag argument

  • Subcommand: a subcommand can in turn have subcommands, flags, and non-flag arguments

    $ command --flag flagarg subcommand subcmdarg --subcmdflag subcmdflagarg

Heuristic command line parsing

Let’s revisit the first requirement, which is that we expect the implementation of any subcommand to be as simple as using the standard library flag. This means that the command-line arguments of this function are parsed only when it is executed. If we can distinguish the subcommand from other parameters, we can execute the function corresponding to the subcommand first, and then parse the parameters of the subcommand.

The flag package can call Parse in main because the shell has already determined that the first item of the string is the command itself and that everything after it is an argument. Similarly, if we can identify subcommands, we can also make the following code possible:

func command() {
  // define flags
  // call the Parse function
}

The key problem is how to distinguish subcommands from the other arguments. Flag names begin with a dash or double dash, so they are easy to pick out. What remains is to tell subcommands, subcommand arguments, and flag arguments apart. On reflection, although the arguments cannot be known in advance, the subcommands are predefined, so we can identify a subcommand by comparing each non-flag argument with the predefined subcommands.

To demonstrate how to identify subcommands, take the cobra code above as an example. Suppose cobra.go is compiled into the program app; it can then be invoked as:

$ app echo times hello --times 3

In cobra's terms, times is a subcommand of echo, which in turn is a subcommand of app. We instead treat echo times as a whole as a single subcommand of app.

1 Simple analysis process

  1. Define the echo subcommand, associated with the function echo, and the echo times subcommand, associated with the function echoTimes.
  2. Parse the string echo times hello --times 3.
  3. Parse the first argument: echo matches the predefined echo subcommand, but it is also a prefix of the echo times subcommand, so we cannot yet tell which of the two the user means until we see the next argument.
  4. Parse the second argument: with times we match the echo times subcommand, which is not a prefix of any other subcommand. The subcommand is therefore echo times, and all remaining arguments belong to it.
  5. If the second argument were hello instead, only the echo subcommand would match, and the echo function would be called instead of echoTimes.

2 Heuristic detection process

The parsing above is simple, but in practice we often want flags to be allowed anywhere on the command line. For example, suppose we add an option that controls the print color, --color red. Logically, the color option describes echo rather than times, so we expect the following command line to be supported:

$ app echo --color red times hello --times 3

Here we expect the echo times subcommand to be called, but the arguments in the middle complicate things: red may be the value of the --color flag, part of a subcommand, or an argument of the subcommand. Even worse, the user might mistakenly write --color times.

Heuristic detection means that when we reach the argument red, we do not know whether it is a subcommand (or part of a subcommand prefix) or some other argument, so we first try to match it as part of a subcommand, and if that fails we treat it as an ordinary argument.

  1. When red is parsed, search the predefined subcommands for echo red. None is found, so red is treated as an ordinary argument.
  2. When times is parsed, search the predefined subcommands for echo times. The echo times subcommand is found.

If red does not match any of the subcommands, it must be a subcommand parameter.

3 Subcommands are written in any order

A subcommand is essentially a string, and our heuristic parsing above recognizes any subcommand string, provided that the string is defined in advance. So you associate this string with some function. Such a design makes the parent and child commands logical concepts and irrelevant to the actual code implementation, and all we need to do is tweak the mapping.

Maintaining mappings

echo times => echoTimes
times echo => echoTimes
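Because a subcommand is only a string key that points at a function, changing the order of parent and child is a change to that key and nothing else. A small sketch, assuming the mapping is held in a Go map and using a hypothetical helper named rename:

```go
package main

import "fmt"

// handler is the function a subcommand string is mapped to.
type handler func(args []string) string

func echoTimes(args []string) string {
	return fmt.Sprintf("echoTimes called with %v", args)
}

// rename re-registers a handler under a new subcommand string.
// The handler itself is not touched.
func rename(commands map[string]handler, from, to string) {
	if from == to {
		return
	}
	if h, ok := commands[from]; ok {
		commands[to] = h
		delete(commands, from)
	}
}

func main() {
	commands := map[string]handler{"echo times": echoTimes}

	// Swap parent and child: "echo times" becomes "times echo".
	rename(commands, "echo times", "times echo")

	fmt.Println(commands["times echo"]([]string{"hello"}))
}
```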

Cortana: Implementation based on heuristic command line parsing

To put this into practice, I developed a project called Cortana. Cortana uses a B-tree to maintain the mapping between subcommands and functions. Thanks to the B-tree's prefix search capability, when the user enters any subcommand prefix, the program automatically lists all available subcommands. The heuristic parsing mechanism makes it possible to identify the subcommand before parsing any specific flag or subcommand argument, and thus to look up the function the subcommand maps to. Inside the mapped function, the subcommand's arguments are then parsed and bound to variables. In addition, Cortana takes full advantage of Go's struct tags to simplify variable binding.

We used Cortana to re-implement the cobra code

package main

import (
  "fmt"
  "strings"

  "github.com/shafreeck/cortana"
)

func print() {
  cortana.Title("Print anything to the screen")
  cortana.Description(`print is for printing anything back to the screen.
For many years people have printed back to the screen.`)
  args := struct {
    Texts []string `cortana:"texts"`
  }{}

  cortana.Parse(&args)
  fmt.Println(strings.Join(args.Texts, " "))
}

func echo() {
  cortana.Title("Echo anything to the screen")
  cortana.Description(`echo is for echoing anything back. 
Echo works a lot like print, except it has a child command.`)
  args := struct {
    Texts []string `cortana:"texts"`
  }{}

  cortana.Parse(&args)
  fmt.Println(strings.Join(args.Texts, " "))
}

func echoTimes() {
  cortana.Title("Echo anything to the screen more times")
  cortana.Description(`echo things multiple times back to the user by providing
  a count and a string.`)
  args := struct {
    Times int      `cortana:"--times, -t, 1, times to echo the input"`
    Texts []string `cortana:"texts"`
  }{}
  cortana.Parse(&args)

  for i := 0; i < args.Times; i++ {
    fmt.Println(strings.Join(args.Texts, " "))
  }
}

func main() {
  cortana.AddCommand("print", print, "print anything to the screen")
  cortana.AddCommand("echo", echo, "echo anything to the screen")
  cortana.AddCommand("echo times", echoTimes, "echo anything to the screen more times")
  cortana.Launch()
}

The command is used exactly as with cobra, except that the help messages are generated automatically:

$ ./app
Available commands:

print                         print anything to the screen
echo                          echo anything to the screen
echo times                    echo anything to the screen more times

# The -h, --help option is enabled by default
$ ./app print -h
print anything to the screen

print is for printing anything back to the screen.
For many years people have printed back to the screen.

Usage: print [texts...]

  -h, --help                     help for the command

# echo
$ ./app echo hello world
hello world

# echo times
$ ./app echo times hello world
$ ./app echo --times 3 times hello world
hello world
hello world
hello world

1 Options and default values

args := struct {
    Times int      `cortana:"--times, -t, 1, times to echo the input"`
    Texts []string `cortana:"texts"`
}{}

As you can see, the echo times command has a --times flag plus the content to be echoed. The content is essentially made of command line arguments too, and it may be split into several arguments if it contains spaces.

As mentioned above, a flag essentially binds a value to a variable. The flag name, here --times, is associated with the variable args.Times. The remaining non-flag arguments have no name, so we bind them all to a slice, here args.Texts.

Cortana defines its own struct tag, which specifies the long flag name, the short flag name, the default value, and the description of the option, in this format: cortana:"long, short, default, description"

  • Long name (long): --flagname. Every flag supports the long format; if omitted, the field name is used
  • Short name (short): -f. Can be omitted
  • Default value (default): any value that matches the field type. If omitted, it is empty. A single "-" means that the user must provide a value
  • Description (description): the description of the option, used to generate the help message. It can contain any printable characters (including commas and spaces)

To make it easy to remember, the cortana tag format can be abbreviated as LSDD.

2 Subcommands and aliases

AddCommand can add any subcommand; in essence it maps the subcommand to its handler function.

cortana.AddCommand("echo", echo, "echo anything to the screen")

In this case, the print command and echo command are the same, and we can alias them

cortana.Alias("print", "echo")

The print command actually executes echo

$ ./app print -h
Echo anything to the screen

echo is for echoing anything back.
Echo works a lot like print, except it has a child command.

Available commands:

echo times                    echo anything to the screen more times

Usage: echo [texts...]

  -h, --help                     help for the command

The aliasing mechanism is flexible enough to set aliases for arbitrary commands and arguments. For example, we want to implement the three subcommand to print an arbitrary string three times. This can be done directly through aliases:

cortana.Alias("three", …)

$ ./app three hello world
hello world
hello world
hello world

3 Help flag and help command

Cortana automatically generates help information for any command. This behavior can also be disabled by using cortana.DisableHelpFlag or by using cortana.HelpFlag to set your favorite flag name.

cortana.Use(cortana.HelpFlag("--usage", "-u"))

# customize --usage to print the help information
$ ./app echo --usage
echo anything to the screen

echo is for echoing anything back.
Echo works a lot like print, except it has a child command.

Available commands:

echo times                    echo anything to the screen more times

Usage: echo [texts...]

  -u, --usage                    help for the command

Cortana doesn’t provide the help subcommand by default, but it’s easy to implement the help command ourselves using the alias mechanism.

cortana.Alias("help", "--help")

$ ./app help echo times
echo anything to the screen more times

echo things multiple times back to the user by providing
a count and a string.

Usage: echo times [options] [texts...]

  -t, --times <times>            times to echo the input. (default=1)
  -h, --help                     help for the command

4 Configuration files and environment variables

In addition to binding variables with command-line arguments, Cortana also allows you to customize the binding of configuration files and environment variables. Cortana is not responsible for parsing configuration files or environment variables. You can use third-party libraries to do this. Cortana’s main role here is to combine values from different sources based on priority. They follow the following order of priority:

Default value < configuration file < environment variable < command line parameter
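The priority order can be sketched as a simple overlay: start from the default and let each higher-priority source overwrite the value if it provides one. The sketch assumes every source has already been loaded into a map of strings; resolveValue is a hypothetical helper and not part of Cortana's API.

```go
package main

import "fmt"

// resolveValue returns the value from the highest-priority source that
// provides the key. Sources are passed from lowest to highest priority;
// def is used when no source has the key.
func resolveValue(key, def string, sources ...map[string]string) string {
	value := def
	for _, src := range sources {
		if v, ok := src[key]; ok {
			value = v
		}
	}
	return value
}

func main() {
	config := map[string]string{"limit": "20", "color": "red"}
	env := map[string]string{"limit": "50"}
	flags := map[string]string{} // nothing given on the command line

	fmt.Println(resolveValue("limit", "10", config, env, flags))   // 50: env overrides config
	fmt.Println(resolveValue("color", "blue", config, env, flags)) // red: only config sets it
	fmt.Println(resolveValue("times", "1", config, env, flags))    // 1: default value
}
```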

Cortana is designed to make it easy for users to use configuration in any format, simply by implementing the Unmarshaler interface, for example, using JSON as a configuration file:

cortana.AddConfig("app.json", cortana.UnmarshalFunc(json.Unmarshal))

Cortana leaves the parsing of configuration files and environment variables entirely to third-party libraries, so users are free to define how configuration is bound to variables, for example by using JSON struct tags.

5 No subcommands?

Cortana is designed to decouple command lookup from parameter resolution so that the two can be used independently, such as in the case of no subcommands, directly in the main function:

func main() {
  args := struct {
    Version bool `cortana:"--version, -v, , print the command version"`
  }{}
  cortana.Parse(&args)
  if args.Version {
    fmt.Println("v0.1.1")
    return
  }
  // ...
}

$ ./app --version
v0.1.1

Summary

Command line parsing is a common, but not particularly important function, unless it is focused on command execution tools, we generally do not need to pay much attention to command line parsing, so I would like to express my sincere thanks to readers who are interested in the topic of this article and can read to the end of the article.

The flag package is easy to use and cobra is feature-rich; between them they already meet almost every need. Yet while writing command line programs I always felt that the existing libraries fell a little short. The flag package only solves flag parsing. Cobra supports both subcommands and flags, but it couples the parsing of subcommands with the parsing of flags, which is what separates flag definitions from the functions that use them. The core idea of Cortana is to decouple command lookup from argument parsing. I achieved this by returning to the essence of command line arguments and devising heuristic parsing. This decoupling gives Cortana capabilities comparable to cobra together with the ease of use of flag. Achieving a powerful result with a very simple mechanism through careful design makes me happy, and I hope this article shares some of that happiness with you.


This article is the original content of Aliyun and shall not be reproduced without permission.