This is the 17th day of my participation in the First Challenge 2022

Preface

Last November I applied for the GitHub Copilot beta, and today I finally received the invitation. I spent the whole afternoon with it and came away pleasantly surprised, so I couldn't wait to write up my experience.

GitHub Copilot is an AI programming tool jointly launched by OpenAI and GitHub. It automatically generates complete code from partial code or comments typed by the user.

Warm-up

The first line of code I typed was def GCD(): and, as expected, Copilot filled in a greatest common divisor function for me. It also suggested different templates depending on what I typed.

When the function has no parameters, the completed code prints the result itself; when it has parameters, the greatest common divisor is returned. After several attempts, the AI also produced different versions, such as recursive and non-recursive templates. Mac users can switch between suggestions with a modifier key plus ] or [. I haven't found out how to switch between completion results in PyCharm on Windows.
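For reference, the two kinds of template mentioned above look roughly like this. It is a hand-written sketch rather than Copilot's recorded output, and the function names gcd_recursive and gcd_iterative are hypothetical.

```python
def gcd_recursive(a, b):
    """Euclid's algorithm, recursive form."""
    return a if b == 0 else gcd_recursive(b, a % b)


def gcd_iterative(a, b):
    """Euclid's algorithm, loop form."""
    while b:
        a, b = b, a % b
    return a


print(gcd_recursive(48, 18))  # 6
print(gcd_iterative(48, 18))  # 6
```

Both versions compute the same value; the recursive one is shorter, while the loop avoids recursion depth limits on very large inputs.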

However, template-level completion is not Copilot's main selling point. According to the official introduction, it has several more notable features.

Features

Convert comments to code

Copilot can complete code based on the comments in it. To my surprise, it also supports Chinese comments, which makes it more convenient for Chinese users. I tried it, and the experience was great.

Typing the comment "# get the current time" automatically generates the function get_current_time():

import time  # assumed to be imported at the top of the file

def get_current_time():
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())

Furthermore, you can constrain the generated code to get the result you want. For example, if I change the comment to "# get the current time, 'xx year xx month xx day'" (xx年xx月xx日), the suggested code returns the specified format:

import time  # assumed to be imported at the top of the file

def get_now_time():
    now_time = time.strftime('%Y年%m月%d日', time.localtime(time.time()))
    return now_time

Built on OpenAI Codex, a large-scale pre-trained model descended from GPT-3, Copilot has excellent natural language understanding, and its grasp of comments is impressive. I tried many more examples, and almost all of the results were satisfactory.

# What day of the week is February 28 next year?
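A hand-written sketch of code that answers this comment could look like the following. It is not Copilot's recorded output, and the function name and the sample date are assumptions chosen for illustration.

```python
import datetime

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_of_feb_28_next_year(today=None):
    """Return the weekday name of February 28 in the year after `today`."""
    today = today or datetime.date.today()
    target = datetime.date(today.year + 1, 2, 28)
    return DAY_NAMES[target.weekday()]  # weekday(): Monday is 0


# Example date only: any day in 2022 asks about February 28, 2023.
print(weekday_of_feb_28_next_year(datetime.date(2022, 2, 4)))  # Tuesday
```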

# Transfer one person's warmth to another's chest

Besides saving time by generating code for simple tasks, natural-language-to-code conversion is also handy in another scenario: when we are unfamiliar with a library, the AI can complete functions or provide templates, sparing us from repeatedly reading documentation and searching for information.

For example, if I want to get the position of the mouse but have no idea which library to use, I can let the AI write the code from a single comment. It saves time, and it spares me from searching and experimenting over and over, so I can get started quickly.

# Get the current mouse position

def get_mouse_position():
    import pyautogui
    print(pyautogui.position())

I'm sold!

Autofill for repetitive code

In addition, AI can help you complete repetitive code. The official website shows an example like this:

Needless to say, repetitive work is boring and a waste of time. Copilot eliminates the problem of repetitive code at a stroke and really improves the coding experience.

Tests without the toil

Copilot's website also shows a test-assist feature, "Tests without the toil." Once you import a unit test package, Copilot can suggest tests that match your code:
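As a rough idea of what such suggested tests might look like, here is a hand-written sketch using Python's standard unittest package. The gcd function under test and the test names are hypothetical, not output captured from Copilot or taken from the official website.

```python
import unittest


def gcd(a, b):
    """Greatest common divisor via Euclid's algorithm (the code under test)."""
    while b:
        a, b = b, a % b
    return a


class TestGcd(unittest.TestCase):
    def test_common_divisor(self):
        self.assertEqual(gcd(48, 18), 6)

    def test_coprime(self):
        self.assertEqual(gcd(17, 5), 1)

    def test_zero(self):
        self.assertEqual(gcd(0, 7), 7)


# Run the tests explicitly instead of calling unittest.main(),
# so the script keeps running after the report is printed.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGcd)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```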

Hands-on

Now that I have a good tool, I need a coding task to get a real feel for it. A few days ago, I brute-forced the community's Spring Festival mini-game Hopscotch with code:

🎮 Bored on the weekend? Come play the [IQ detector]. Puzzle taken from the uploader [GM's Secret base] - Juejin (juejin.cn) 🎮 Click to play: Hopscotch - PyCharm (Juejin)

This is a roughly 100-line task with depth-first search as its main logic. I tried to have the AI generate all of the code for it. If the AI were a programmer, my role would be that of a supervisor pair programming with it: I am responsible for expressing ideas in comments and reviewing the code it generates. All of the code was generated by the AI except for the initial data, namely the properties of the board.

Defining and initializing lists is a simple task, but the clever part is that Copilot lets me state the intended usage when defining the data. That gives me the freedom to declare data structures that the AI can refer to when generating later code, so that it gets the code right.

# Define three lists:
# 1. occupied nodes
# 2. safe-state stack, format [[occupied nodes], [all possible next steps]]
# 3. solution
# Initialize each to an empty list
occupied = []
safe_stack = []
solution = []

# Define the list of unused nodes, 0 to 14
unused = list(range(15))

When defining a function, as long as the input, output, and logic are clearly written in the comments, the AI generates the code I want, and I'm really happy with how complete it is. Take the following function for example:

# Define a function that takes a node number and returns all possible next steps
# Pair up all unoccupied nodes and save every pairing to the possible next steps
# For each possible next step, check whether the parameter and the two nodes are on the same line, and whether the parameter is greater than both or smaller than both of the two node numbers. If not, remove it from the possible next steps
# Return all possible next steps
def get_next_step(node):
    next_step = []
    for i in range(len(unused)):
        for j in range(i + 1, len(unused)):
            if check_line(node, unused[i], unused[j]):
                if (node < unused[i] and node < unused[j]) or (node > unused[i] and node > unused[j]):
                    next_step.append((node, unused[i], unused[j]))
    return next_step

Generating a function is not particularly difficult for Copilot, because the logic of a function is simple and easy to describe. The main function is different, especially when its loop logic needs careful design. The AI generates code from predicted probabilities, so even if I provide no description at all, there is always a prediction with the highest probability. Let's look at what the AI does with the main function without any description:

After I typed "# main function", the AI completed the declaration of the main function, followed by about 20 lines of main-function code, comments included. Copilot wrote all of it by guessing my intention from the code above. Although it was not the code I wanted, it was highly complete and did solve part of the problem; it was genuinely decent code.

Since the logic of my main function is not that simple, I need to add comments as the AI writes, giving it finer-grained semantic information so that it completes the code correctly:

# main function
def main():
    # Change node 0 from unoccupied to occupied
    occupied.append(0)
    unused.remove(0)
    # Exit the loop when only one node is unoccupied
    while len(unused) > 1:
        # Create a new safe state [occupied list, [all possible next steps]]
        safe_stack.append([occupied[:], get_all_next_step()])
        # Loop while the list of next steps is empty
        while len(safe_stack[-1][1]) == 0:
            # Pop the current safe state off the safe-state stack
            safe_stack.pop()
            # Remove the first next step of the last element in the safe-state stack
            safe_stack[-1][1].pop(0)
            # Backtrack
            # If backtracking fails, print an error message and exit
            try:
                back_action()
            except IndexError:
                print("No solution")
                exit()
        # Execute the first element of the next-step list
        action(*safe_stack[-1][1][0])
    # Output format: remove 1, append 2, append 3
    if len(unused) == 1:
        print('Solution:')
        for i in range(len(solution)):
            print('remove %d, append %d, append %d' % (solution[i][0], solution[i][1], solution[i][2]))

In the end, it completed the main function. Overall, I used a little over 30 lines of comments to have Copilot write all of the nearly 80 lines of code, which ran smoothly and produced the correct result. I was quite satisfied. After all, if I asked a person to write the code for me, thirty-odd sentences would not necessarily be enough to explain my intentions, so I think the AI did a pretty good job.
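The control flow of the main function above is an iterative depth-first search with backtracking, where each stack frame stores a state together with the moves still to try. Below is a minimal, self-contained sketch of that pattern on a toy problem (picking numbers that sum to a target). The function name dfs_with_stack and the toy problem are hypothetical, and unlike the code above this variant removes a candidate at the moment it is tried rather than after it fails.

```python
def dfs_with_stack(numbers, target):
    """Find a sub-list of `numbers` that sums to `target`, or return None."""
    # Each frame is [chosen indices so far, candidate next indices].
    stack = [[[], list(range(len(numbers)))]]
    while stack:
        chosen, candidates = stack[-1]
        if sum(numbers[i] for i in chosen) == target:
            return [numbers[i] for i in chosen]
        if not candidates:
            stack.pop()          # dead end: backtrack to the previous frame
            continue
        nxt = candidates.pop(0)  # take the first untried candidate
        # Only later indices may follow, so each subset is visited once.
        stack.append([chosen + [nxt], list(range(nxt + 1, len(numbers)))])
    return None


print(dfs_with_stack([3, 5, 8, 2], 10))  # [3, 5, 2]
```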

However, there is room for improvement, both in its limited understanding of complex semantics and in the rigidity that comes from being an algorithm rather than a person. For example, I ran into a few small problems while completing the task above: occasionally there was a detail that it could not get right no matter how I reworded the comment. Given "# a must be greater than both b and c, or less than both b and c", it always interpreted the condition as (a < b < c) or (a > b > c). In the end I lost patience and corrected that spot myself.
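The difference between the two readings is easy to check by hand. In the sketch below, the helper names intended and misread are hypothetical: the first encodes the condition described in the comment, the second the chained comparison that Copilot kept producing.

```python
def intended(a, b, c):
    # a is smaller than both b and c, or larger than both
    return (a < b and a < c) or (a > b and a > c)


def misread(a, b, c):
    # chained comparison: additionally forces an order between b and c
    return (a < b < c) or (a > b > c)


print(intended(1, 5, 3), misread(1, 5, 3))  # True False
print(intended(1, 3, 5), misread(1, 3, 5))  # True True
print(intended(4, 3, 5), misread(4, 3, 5))  # False False
```

The first call shows the problem: 1 is smaller than both 5 and 3, but the chained form rejects it because 5 is not smaller than 3.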

However, this kind of problem is uncommon, and most of the time it is very "obedient". For example, when I found it copying a global variable by direct assignment, I added a parenthetical note in the comment reminding it to make a deep copy. When manipulating a list, you can also remind it to concatenate (+=) instead of append (append).
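Both reminders matter because the alternatives behave differently in Python. The following sketch uses made-up variable names (state, alias, snapshot, steps) to illustrate the two points; it is not taken from the Hopscotch code.

```python
import copy

state = [[0, 1], [2]]

alias = state                    # plain assignment: both names share one object
snapshot = copy.deepcopy(state)  # independent copy, nested lists included

state[0].append(9)
print(alias[0])     # [0, 1, 9]
print(snapshot[0])  # [0, 1]

steps = [1]
steps.append([2, 3])  # adds the whole list as a single element
print(steps)          # [1, [2, 3]]

steps = [1]
steps += [2, 3]       # concatenates element by element
print(steps)          # [1, 2, 3]
```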

Summary

This trial left me with a very good impression. The experience exceeded my expectations and can truly be described as a pleasant surprise.

When I first heard about Copilot, I thought it was nothing more than a "next word prediction" type of AI. After all, it was officially trained on GitHub's public code; give me the data and I could build one too. But once I tried it, I realized it wasn't that simple: it generates code from comments and seems "smart" like a person in almost every respect, which is something I could not build.

Understanding comments and generating code arguably makes this AI cross-lingual, even cross-modal, which is hard enough in my view, and I don't even know how the core code-generation task is tackled. On the other hand, with the huge amount of code on GitHub as the dataset, I assume the training is self-supervised, yet I have no idea how the developers got the model to learn from that data. Given all that, my experience with Copilot turned out to be so satisfying that I was not only amazed by the technology, but also in awe of the development team behind the product.

DeepMind also released its own AlphaCode a few days ago. Although I haven't looked at it in detail, the involvement of these well-known AI research institutions already gives me a sense of the historical shift behind this quiet technological revolution. Like the oncoming spring tide, the era of AI programming is about to blossom.

Today is the Beginning of Spring, so I wish you all a happy spring.