The PLUM reading group recently discussed the paper, DR CHECKER: A Soundy Analysis for Linux Kernel Drivers, which appeared at USENIX Securty’17. This paper presents an automatic program analysis (a static analysis) for Linux device drivers that aims to discover instances of a class of security-relevant bugs. The paper is insistent that a big reason for DR CHECKER’s success (it finds a number of new bugs, several which have been acknowledged to be true vulnerabilities) is that the analysis is soundy, as opposed to sound. Many of the reading group students wondered: What do these terms mean, and why might soundiness be better than soundness?
To answer this question, we need to step back and talk about various other terms also used to describe a static analysis, such as completeness, precision, recall, and more. These terms are not always used in a consistent way, and they can be confusing. The value of an analysis being sound, or complete, or soundy, is also debatable. This post presents my understanding of the definitions of these terms, and considers how they may (or may not) be useful for characterizing a static analysis. One interesting conclusion to draw from understanding the terms is that we need good benchmark suites for evaluating static analysis; my impression is that, as of now, there are few good options. Continue reading
Last week I attended the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). The event was hosted at Paris 6 which is part of the Sorbonne, University of Paris. It was one of the best POPLs I can remember. The papers are both interesting and informative (you can read them all, for free, from the Open TOC), and the talks I attended were generally of very high quality. (Videos of the talks will be available soon—I will add links to this post.) Also, the attendance hit an all-time high: more than 720 people registered for POPL and/or one of its co-located events.
In this blog post I will highlight a few of my favorite papers at this POPL, as well as the co-located N40AI event, which celebrated 40 years of abstract interpretation. Disclaimer: I do not have time to describe all of the great stuff I saw, and I could only see a fraction of the whole event. So just because I don’t mention something here doesn’t mean it isn’t equally great.
I just got back from CCS (ACM’s conference on Computer and Communications Security) in Vienna, Austria. I really enjoyed the conference and the city.
At the conference we presented our Build it, Break it, Fix it paper. In the same session was the presentation on a really interesting paper called Practical Detection of Entropy Loss in Pseudo-Random Number Generators, by Felix Dörre and Vladimir Klebanov. This paper presents a program analysis that aims to find entropy-compromising flaws in PRNG implementations; such flaws could make the PRNG’s outputs more predictable (“entropy” is basically a synonym for “uncertainty”). When a PRNG is used to generate secrets like encryption keys, less entropy means those secrets can be more easily guessed by an attacker.
A PRNG can be viewed as a deterministic function from an actually-random seed to a pseudo-random output. The key insight of the work is that this function should not “lose entropy” — i.e., its outputs should contain all of the randomness present in its seed. Another way of saying this is that the PRNG must be injective. The injectivity of a function can be checked by an automatic program analysis.
Using the tool they developed, the paper’s authors were able to find known bugs, and an important unknown bug in libgcrypt’s PRNG. Their approach was essentially a variation of prior work checking noninterference; I found the connection to this property surprising and insightful. As discussed in a prior post, security properties can often be challenging to state and verify; in this case the property is pleasingingly simple to state and practical to verify.
Both injectivity and noninterference are examples of K-safety properties, which involve relations between multiple (i.e., K) program executions. New technology to check such properties is being actively researched, e.g., in DARPA’s STAC program. This paper is another example of the utility of that general thrust of work, and of using programming languages techniques (automated program analysis) for proving security properties.
Matt at the White House, Jan 2015
This post is the second part of my March 10th interview of Matt Might, a PL researcher and Associate Professor in the Department of Computer Science at the University of Utah.
In Part I, we talked about Matt’s academic background, his PL research (including his favorite among the papers he’s written), and his work on understanding and treating rare disease, which began with the quest to diagnose his son Bertrand, and has led to a role in the President’s Initiative on Precision Medicine.
In this post, our conversation continues, covering the topics of blogging, privacy, managing a crazy schedule, and looking ahead to promising PL research directions. Continue reading
This post presents an interview I did on March 10th, 2015, with Matt Might, a PL researcher who is an Associate Professor in the School of Computing at the University of Utah.
Matt has made strong scientific contributions to the field of programming languages, and he has done much more. He maintains an incredibly popular blog on wide-ranging topics (13 million pageviews since 2009 on topics from abstract interpretation to how to lose weight to how to be more productive). He has also become deeply committed to supporting people with rare diseases, including his own son, Bertrand, who was the first person diagnosed with NGLY1 deficiency. His work on rare disease propelled him to the White House: He met the President on January 31st, 2015, and he took a position in the Executive Office of the President to accelerate the implementation of the Precision Medicine Initiative on March 21st.
We had an engaging conversation covering all of these topics. It is too long for one post, so this post is the first of two. Continue reading
Last week I attended a multi-day meeting for the DARPA STAC program; I am the PI of a UMD-led team. STAC supports research to develop “Space/time Analysis for Cybersecurity.” More precisely, the goal is to develop tools that can analyze software to find exploitable side channels or denial-of-service attacks involving space usage or running time.
In general, DARPA programs focus on a very specific problem, and so are different from the NSF style of funded research that I’m used to, in which the problem, solution, and evaluation approach are proposed by each investigator. One of STAC’s noteworthy features is its use of engagements, during which research teams use their tools to find vulnerabilities in challenge problems produced by an independent red team. Our first engagement was last week, and I found the experience very compelling. I think that both the NSF style and the DARPA style have benefits, and it’s great that both styles are available.
This post talks about my experience with STAC so far. I discuss the interesting PL research challenges the program presents, the use of engagements, and the opportunities STAC’s organizational structure offers, when done right.
In this post, I interview Peter O’Hearn, programming languages professor, researcher, and evangelist. Peter now works at Facebook on the Infer static analyzer, which was publicly released back in June 2015. In this interview we take a brief tour of Peter’s background (including his favorite papers) and the path that led him and Infer to Facebook. We discuss how Infer is impacting mobile application development at Facebook, and what Peter hopes it can achieve next. Peter also shares some lessons he’s learned at Facebook regarding PL research and the sometimes surprising impact PL researchers can and are having on industrial software development.
In this post I discuss a program security property called noninterference. I motivate why you might like it if your program satisfied noninterference, and show that the property is fundamental to many areas beyond just security. I also explore some programming languages and tools that might help you enforce noninterference, noting that while tainting analysis won’t get you the whole way there, research tools that attempt to do better have their own problems. I conclude with some suggestions for future research, in particular with the idea that testing noninterference may be a practical approach.
In my last post, I summarized some of the topics and problems considered at a recent Dagstuhl seminar I co-organized on the Synergy between Programming Languages and Cryptography. The post surveyed how programming languages often interface with cryptography in the construction of secure systems, and in particular how they can make it easier to implement cryptography, use it, or verify its correctness.
Beyond using PLs as a tool for easier/safer use of Crypto, there is an opportunity for certain kinds of thinking, or reasoning, to cross over fruitfully between the PL an Crypto communities. In particular, both communities are interested in formalizing systems and proving properties about them but they often use different methods, either due to cultural differences, or because the properties and systems of interest are simply different. During the workshop we identified both analogous, similar styles of reasoning in two communities and connection points between the different styles of reasoning. In this post I briefly highlight a few examples of each, and point to future research opportunities.
In this post, I introduce the emerging area of probabilistic programming, showing how probabilistic programs will hopefully make it easier to perform Bayesian-style machine learning, among other applications. Probabilistic programming is an exciting, and growing, area of research, with fantastic people in both AI/ML and PL working together and making big strides. PL methods — including formal semantics, optimization techniques, and forms of static analysis — have proven very useful in advancing this area forward.