Automated analysis of programs is one of the major success stories in PL. The goal here is to algorithmically infer properties of a program’s runtime behavior without executing the program on concrete inputs. This may be done for many reasons, including optimization and reasoning about correctness. If you are trying to optimize a program, it helps to know that a statement executed within a loop always performs the same update, and can therefore be moved out of the loop. If you want to be certain that your C program doesn’t have buffer overflows, you want to infer bounds on the indices used for array accesses in the program.
Over the years, systems for program analysis have increased in sophistication and entered the mainstream of software development. But how do you know that what your analysis tells you is correct? To be certain that it is, we must relate the program’s semantics – what the program does at runtime – to what the analysis algorithm computes. The framework of abstract interpretation is the gold standard for doing so.
Radhia Cousot, co-inventor of abstract interpretation, passed away earlier this summer after a long struggle with cancer. While her death was tragic, I am consoled that she lived to see her work impact the world in a way that most researchers can only dream of.
Radhia and her husband Patrick first presented abstract interpretation in a POPL 1977 paper. Abstract interpretation is a unified framework for automated, sound generation of program invariants. What this means is that a program analysis grounded in abstract interpretation infers facts that hold at each time point in each program execution, and that these inferred facts are guaranteed to be correct. For example, if a sound analysis tells you that the program never experiences a divide-by-zero error, then this judgment is in fact true. On the other other hand, the analysis is not necessarily complete, meaning that it may not infer all interesting properties that hold at runtime.
In abstract interpretation, a program analysis is seen as computing an approximation to a program’s concrete semantics – a mathematical characterization of what the program does at runtime. For instance, a program’s concrete semantics may be given by the set T of all program executions. A sound program analysis would compute, in general, an overapproximation O of this set. An invariant satisfied by all executions in O (for example, “there are no divide-by-zero errors”) is also guaranteed to hold in all executions in T.
Abstract interpretation characterizes this notion of overapproximation by viewing a class of assertions about programs as forming a lattice: a partially ordered set where every two elements have a least upper bound and a greatest lower bound. The order in the lattice captures precision: higher up in the lattice we have less precise properties. Similarly, the concrete semantics of a program is viewed as a lattice. For example, elements in the concrete lattice can represent sets of program traces, and the order between them can represent set inclusion. The two lattices are related by an abstraction function that maps elements in the concrete lattice to assertions that can be proved about them, and a concretization function that maps each assertion to what it means in terms of program semantics. The elegant observation of abstract interpretation was to frame the notion of soundness of a program analysis can be captured in terms of a special order-theoretic relationship, called a Galois connection, involving the abstraction and concretization functions.
An appeal of this formulation was that it offers a parameterized framework for program analysis. You had to instantiate the abstract and concrete domains in different ways, and you would get program analysis algorithms for different languages and of varying precision. Also, composing different analyses or comparing their precision was now reduced to certain elementary mathematical tasks.
Through the three and a half decades since their 1977 paper, Radhia and Patrick and their research collaborators tirelessly explored the theory and practice of abstract interpretation. On the theoretical end, they showed how a wide range of analysis techniques, for a broad spectrum of programming languages, could be unified under the umbrella of abstract interpretation. Concomitantly, they built systems for program analysis, firmly rooted in the theory of abstract interpretation, for settings ranging from embedded software to web applications. The ASTREE system for the analysis of real-time embedded systems, built under their supervision of Radhia and Patrick, was a watershed in the history of program analysis. In 2003, ASTREE was used to prove, completely automatically, that the primary flight control software of the Airbus A340 fly-by-wire system was free of runtime errors like division by zero, array bound violation, or violation of user-written assertions. In 2008, ASTREE was able to automatically prove that the automated docking software of the Jules Verne automated transfer vehicle, used to transfer payloads to the International Space Station, was free of runtime errors.
Aside from being a great researcher, Radhia was also extraordinarily warm and kind as a person. I first met her a couple of years into my career as an independent researcher. She was patient in listening to where my work was going and offered kind, detailed and insightful feedback. Since then, we talked quite a few times at different conferences. Each of these conversations was a joy. Many others who worked or interacted with Radhia have felt this way.
Radhia created a science out of what was a previously a mishmash of algorithmic methods. The flowering of algorithms and tools for program analysis in the last two decades bears a sizable debt to this science. Her passing is a tremendous loss to our field.