The original address: www.pathsensitive.com/2021/03/dev…

Original author: www.jameskoppel.com/

Published: March 28, 2021

Nine years ago, I started working on advanced development tools. When I started, “programming tools” meant file format viewers, editors, and maybe grep variants. When I mentioned a deeper problem, such as inferring the intent behind a set of changes, I was asked how it compared to find-and-replace.

Times have changed. I am no longer shocked to meet a programmer who has heard of program synthesis or has even tried a verification tool. There are now several popular products based on advanced tools research, and advances in artificial intelligence have changed people’s expectations in general. One company, Facebook, has even deployed automated program repair internally.

Even so, tool research is still light years ahead of what’s deployed. It’s not unusual to read a 20-year-old paper with a tool that has been shown to make programmers four times faster at a task, while the underlying idea remains locked in academia.

I want to give you a sense of what to expect from advanced tools, and of the ways we’re going backwards. Here are three of my favorite tools from the past 30 years, all of which I’ve tried to use and none of which currently run.

Reflexion Models

We often think of software in terms of components. For an operating system, these might be: file system, hardware interface, process manager. An experienced engineer on a project who is asked to make certain file writes reach the disk faster will know exactly where to look in the code; a newcomer will see an amorphous blob of source files.

In 1995, as a young graduate student at the University of Washington, Gail C. Murphy came up with a new way to learn a code base, called reflexion models.

First, you write down a rough hypothesis of what you think the components are and how they interact.

Then, you go through the code and write down which component you think each file belongs to.

Now the tool runs and computes the actual connections between the files (e.g., class inheritance, the call graph). You compare the result to your hypothesis.

With the new evidence, you refine your hypothesis, making your mental model more and more detailed and realistic.
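The comparison in the third step is simple enough to sketch. The Java below is a minimal illustration, not RMTool’s actual implementation: the file names, the component names, and the `lift` helper are all made up for this example, and the file dependencies are hard-coded where a real tool would extract them from the code. The three result categories (convergences, divergences, absences) follow the terminology of the reflexion models paper as I understand it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReflexionSketch {

    /** Lifts file-level dependencies ("a.c->b.c") to component-level edges. */
    public static Set<String> lift(Set<String> fileDeps, Map<String, String> fileToComponent) {
        Set<String> lifted = new TreeSet<>();
        for (String dep : fileDeps) {
            String[] ends = dep.split("->");
            String from = fileToComponent.get(ends[0]);
            String to = fileToComponent.get(ends[1]);
            // Ignore unmapped files and dependencies that stay inside one component.
            if (from != null && to != null && !from.equals(to)) {
                lifted.add(from + "->" + to);
            }
        }
        return lifted;
    }

    /** Edges present in both the hypothesis and the code. */
    public static Set<String> convergences(Set<String> hypothesis, Set<String> actual) {
        Set<String> result = new TreeSet<>(hypothesis);
        result.retainAll(actual);
        return result;
    }

    /** Edges found in the code that the hypothesis did not predict. */
    public static Set<String> divergences(Set<String> hypothesis, Set<String> actual) {
        Set<String> result = new TreeSet<>(actual);
        result.removeAll(hypothesis);
        return result;
    }

    /** Edges the hypothesis predicted that the code does not contain. */
    public static Set<String> absences(Set<String> hypothesis, Set<String> actual) {
        Set<String> result = new TreeSet<>(hypothesis);
        result.removeAll(actual);
        return result;
    }

    public static void main(String[] args) {
        // Step 1: hypothesized component interactions.
        Set<String> hypothesis = new TreeSet<>(Arrays.asList(
                "ProcessManager->FileSystem", "FileSystem->HardwareInterface"));

        // Step 2: hypothesized mapping from files to components.
        Map<String, String> fileToComponent = new HashMap<>();
        fileToComponent.put("sched.c", "ProcessManager");
        fileToComponent.put("inode.c", "FileSystem");
        fileToComponent.put("disk.c", "HardwareInterface");

        // Step 3: dependencies a real tool would extract from the code (invented here).
        Set<String> fileDeps = new TreeSet<>(Arrays.asList(
                "sched.c->inode.c", "sched.c->disk.c"));

        Set<String> actual = lift(fileDeps, fileToComponent);
        System.out.println("Convergences: " + convergences(hypothesis, actual));
        System.out.println("Divergences:  " + divergences(hypothesis, actual));
        System.out.println("Absences:     " + absences(hypothesis, actual));
    }
}
```

For this toy input, the hypothesis correctly predicted that the process manager uses the file system, missed a direct dependency from the process manager on the hardware interface, and predicted a file-system-to-hardware dependency that the extracted data does not show. Surprises like these are what drive the next round of refinement.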

Around this time, a group at Microsoft was running an experiment to see whether they could re-engineer the Excel code base to extract some high-level components. They needed a fairly strong understanding of the code base, but getting it wasn’t easy, because they were a different team, in a different building. One of them saw Gail’s talk on reflexion models and liked it.

In a day, he created a first cut of a reflexion model for Excel. He then spent four weeks refining it as he became more familiar with the code. In doing so, he reached a level of understanding that he estimated would otherwise have taken two years.

Today, Gail’s original RMTool has disappeared from the Internet. Ciao, the AT&T C++ analysis tool on which it was based, is even harder to find. A Java version, jRMTool, was written later, but it only works with an old version of Eclipse that had a completely different API. The code was written in Java 1.4 and is no longer even syntactically correct. I soon gave up trying to make it work.

Software engineering in 2021 is still catching up with 1995.

Whyline

About a decade later, at Carnegie Mellon’s Human-Computer Interaction Institute, Amy Ko was pondering a different question. Debugging is like detective work. Why didn’t the program update the cache after fetching? What is a negative number doing here? Why is it so hard to answer these questions?

Amy’s idea was a tool called the Whyline, an interactive debugger in which you could ask questions like “Why did ___ happen?” She built a prototype for Alice, CMU’s graphical programming tool that lets children create 3D animations. People were impressed.

Buoyed by her success, Amy, now a professor, spent several more years working hard to build the technology to do just that in Java.

They conducted a study. Twenty programmers were asked to fix two bugs in ArgoUML, a 150,000-line Java program. Half of the programmers were given a copy of the Java Whyline. The programmers with the Whyline were four times more successful and worked twice as fast as those without it.

A few years ago, I tried using the Java Whyline. When confronted with modern Java bytecode, it crashed.

MatchMaker

My advisor, Armando Solar-Lezama, arrived at MIT in 2008 and single-handedly revitalized the field of program synthesis. His main focus had been on complex problems in small systems, such as optimizing physics simulations and bit-manipulation routines. Now he wanted to solve simple problems in large systems. Much of programming involves writing “glue code,” taking a large library of standard components and figuring out how to tie them together. It can take weeks of digging through documentation to work out how to do something in a complex framework. Can synthesis technology help? Finding out became the mission of Kuat Yessenov, a Kazakh genius.

Writing glue code is often a game of figuring out which classes and methods to use. Sometimes it’s not hard to guess: in Android, for example, the way you put a widget on the screen is with the container’s addView method. Usually, it’s not that easy. When writing an Eclipse plug-in for syntax highlighting, you need a chain of four classes to connect the TextEditor object to the RuleBasedScanner.

class UserConfiguration extends SourceViewerConfiguration {
  @Override
  public IPresentationReconciler getPresentationReconciler(ISourceViewer sourceViewer) {
    PresentationReconciler reconciler = new PresentationReconciler();
    RuleBasedScanner userScanner = new UserScanner();
    // One object acts as both damager and repairer for the default content type.
    DefaultDamagerRepairer dr = new DefaultDamagerRepairer(userScanner);
    reconciler.setRepairer(dr, IDocument.DEFAULT_CONTENT_TYPE);
    reconciler.setDamager(dr, IDocument.DEFAULT_CONTENT_TYPE);
    return reconciler;
  }
}

class UserEditor extends AbstractTextEditor {
  UserEditor() {
    UserConfiguration userConfiguration = new UserConfiguration();
    setSourceViewerConfiguration(userConfiguration);
  }
}

class UserScanner extends RuleBasedScanner { ... }

Kuat reasoned that if you know the two endpoints of a feature, which class uses it and which class provides it, then you can ask the computer to figure out what goes in between. Other programs already do what you’re looking for. By running them and analyzing the traces, you can find the code responsible for “connecting” the two classes (as a chain of pointer references). You can then boil the reference program down to just the code that implements the feature: voilà, a tutorial! The MatchMaker tool was born.
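To make “a chain of pointer references” concrete, here is a minimal sketch of one ingredient: a breadth-first search for the shortest chain of references between two objects. It is an illustration under heavy assumptions, not MatchMaker’s actual algorithm. The snapshot is a hand-written map rather than something recovered from execution traces, objects are named by their class for brevity, the `findChain` helper is hypothetical, and the object graph is invented and does not claim to match Eclipse’s real one.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class ReferenceChainSketch {

    /**
     * Breadth-first search over a heap snapshot: returns the shortest chain of
     * references from source to target, or an empty list if none exists.
     */
    public static List<String> findChain(Map<String, List<String>> heap,
                                         String source, String target) {
        Map<String, String> parent = new HashMap<>(); // node -> node it was reached from
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                // Walk the parent links back to the source to rebuild the chain.
                LinkedList<String> chain = new LinkedList<>();
                for (String n = current; n != null; n = parent.get(n)) {
                    chain.addFirst(n);
                }
                return chain;
            }
            for (String next : heap.getOrDefault(current, Collections.<String>emptyList())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Invented snapshot: each object maps to the objects it points to.
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("UserEditor", Arrays.asList("SourceViewer", "StatusLine"));
        heap.put("SourceViewer", Arrays.asList("PresentationReconciler"));
        heap.put("PresentationReconciler", Arrays.asList("DefaultDamagerRepairer"));
        heap.put("DefaultDamagerRepairer", Arrays.asList("UserScanner"));

        System.out.println(findChain(heap, "UserEditor", "UserScanner"));
    }
}
```

The classes along the returned chain are the ones a tutorial would need to mention; everything else in the reference program, such as the `StatusLine` branch here, can be dropped.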

In the study, eight programmers were asked to build a simple syntax highlighter for Eclipse, highlighting two keywords in a new language. Half were given MatchMaker and a short tutorial on how to use it. Yes, several tutorials on how to do this already exist, but they are overloaded with information and not much help. The control group floundered, averaging 100 minutes. MatchMaker users quickly found what they were looking for, taking just 50 minutes. That’s not bad, considering that an Eclipse expert with five years of experience took a full 16 minutes.

I actually did use MatchMaker, because during my first month of graduate school I was asked to help develop its follow-up. It was pretty good; I’d love to see it fleshed out and adapted for Android. Instead, we’re sliding back. A few years ago, my advisor hired a summer intern to work on MatchMaker. He immediately hit a snag: it didn’t work on Java 8.

Lessons

The first lesson is that the tools we use are largely determined by the choices of exceptional people. Mylyn is one of the most popular Eclipse plug-ins out there, simply because Gail C. Murphy, the creator of Reflexion Models, decided to go into academia, while her student Mik Kersten, the creator of Mylyn, went into industry.

Programming tools are not a field in which progress is “an idea whose time has come.” That kind of progress happens when many people are working on similar ideas; if one person’s idea isn’t adopted, someone else’s will be a few years later. This kind of competition is rare in programming tools. Let me give you an example. A famous professor took a leave of absence to start a company building a tool for making websites. I asked him why, if his idea beat every earlier tool of its kind, no one had done it before. He said, “Because it requires technology that only I can build.”

The second lesson is that there is something wrong with the way we build programming tools. No other area of computer science seems to have such a chasm between what researchers have achieved and what practitioners use. I used to think this was because the difficulty of building a tool depends more on the complexity of the programming language (programming languages are extremely complex; just look at C++) than on the idea behind the tool, and that, until this changes, no tool can exist unless it has enough sales to pay its huge fixed cost of construction. That’s why my PhD work has been on making tools easier to build. It’s also partly why I’m frustrated by the proliferation of free but not advanced tools: it undercuts the low end of the market and makes those fixed costs harder to recoup.

But the third lesson is that, as developers, we can demand more from our tools. If you’ve ever wanted to build a developer tool, you have plenty of impressive work to draw from. And if you’re hungry for better tools, this is what you have to look forward to.

Sources

  • Software Reflexion Models: Bridging the Gap between Source and High-Level Models
  • Reengineering with Reflexion Models: A Case Study
  • Designing the Whyline: A Debugging Interface for Asking Questions about Program Behavior
  • Finding Causes of Program Output with the Java Whyline
  • Data-Driven Synthesis for Object-Oriented Frameworks