What is memory safety?

I am in the process of putting together a MOOC on software security, which goes live in October. At the moment I’m finishing up material on buffer overflows, format string attacks, and other sorts of vulnerabilities in C. After presenting this material, I plan to step back and say, “What do these errors have in common? They are violations of memory safety.” Then I’ll state the definition of memory safety, say why these vulnerabilities are violations of memory safety, and conversely say why memory safety, e.g., as ensured by languages like Java, prevents them.

No problem, right? Memory safety is a common technical term, so I expected its definition would be easy to find (or derive). But it’s much trickier than I thought.

My goal with this post is to work out a definition of memory safety for C that is semantically clean, rules out code that seems intuitively unsafe, but does not rule out code that seems reasonable. The simpler, and more complete, the definition, the better. My final definition is based on the notion of defined/undefined memory and the use of pointers as capabilities. If you have better ideas, I’d love to know them!

Continue reading

69 Comments

Filed under PL in practice, Semantics, Software Security

Program verification in the undergraduate CS curriculum

[This is the first part of a two-post series. The second part is here.]

In previous posts on this blog, I have talked about how research on programming languages and tools can contribute to K-16 education. In this post, I will share with you some of my experiences while trying to use a PL tool — specifically, a semi-automated program verifier — in an undergraduate algorithms class. The experience was a success in many ways. A program verifier gives students a hands-on understanding of how algorithms are mathematical objects, and verification tools are now mature enough for productive classroom use. I think every CS program would benefit students by introducing them to these tools at some point. At the same time, the experience exposed me to some serious limitations of current-day program verification, and opened up a number of directions for new research in this area. Continue reading

5 Comments

Filed under Education, Formal verification

Spotlight: Aws Albarghouthi

Our “New Scientists on the Block” series features up-and-coming PL researchers who are about to begin independent research careers in universities or research laboratories. This week, we  interview Aws Albarghouthi, who is starting as an assistant professor at the University of Wisconsin, Madison, in January 2015.
Continue reading

3 Comments

Filed under Interviews, New scientists

IEEE posts its top list of languages

In a funny coincidence following my last post on language adoption, the IEEE Spectrum in my mailbox today contained a pointer to IEEE’s version of the top N languages.

They use a variety of data sources, some of which overlap with the SocioPLT’s, e.g., counting projects on GitHub. Some sources are different, e.g., using Google and Twitter to count occurrences of “X programming” for a language X. The site allows you to create your own rankings to weigh the various data sources differently, e.g., to focus only on GitHub projects or to increase the weight of Google Trends.

Comparing the IEEE top twenty with the SocioPLT top twenty shows some interesting differences.

Continue reading

6 Comments

Filed under Uncategorized

Why are some languages adopted and others aren’t?

In my last article I discussed how the failure to find the Heartbleed bug sooner was in some sense a failure to refine or deploy what is otherwise effective technology for static analysis. In particular, commercial static analysis tools purposely will ignore potential bugs so as to avoid reporting too many false alarms, i.e., favoring completeness over soundness. The companies that make these tools aim to provide a profitable service to a broad market, and their own investigations indicate soundness is not important for sales. Instead, to be viable, tools must help developers find real, important bugs efficiently, and not necessarily every bug. A challenge to researchers is to find ways to push the business proposition back toward soundness while retaining efficiency (and other desirable criteria); Andy Chou’s POPL’14 keynote outlines other useful challenges.

While Heartbleed is ostensibly about the adoption and improvement of static analysis, in this article I explore the related question of fostering the adoption of programming languages. I summarize impressive research by Leo Meyerovich and Ariel Rabkin on adoption research questions and adoption practices that appeared at OOPSLA’12 and OOPSLA’13, respectively. I think there are some interesting results here, with implications for improving the adoption of languages. Their results also raise new questions for further research (but too late for yesterday’s POPL deadline — good luck to all submitters!).

Continue reading

26 Comments

Filed under Language adoption, PL in practice

Programming language research for K-16 education (Part II)

In this post, I’ll continue our ongoing discussion of applications of PL research in computer-assisted education. Specifically, I’ll summarize a talk that Loris D’Antoni of Penn gave at this year’s Workshop on Programming Language Technology for Massive Open Online Courses (PLOOC). I was intrigued by this work, and I think a lot of you may be too.

Continue reading

Leave a Comment

Filed under Education, MOOCs

How did Heartbleed remain undiscovered, and what should we do about it?

It was recently reported that the Heartbleed OpenSSL bug has still not been patched on over 300,000 servers. This is down from 600,000 that were discovered two months ago, when the bug was first publicized. Add to this delay the nearly two years that the bug went undetected, and that’s a lot of exposure.

In this blog post, I wonder about how programming languages and tools might have played a role in finding Heartbleed before it got into production code. Then I explore possible approaches to improving this state of the affairs. I welcome your thoughts and ideas! Continue reading

14 Comments

Filed under Formal verification, Software Security

Spotlight: Emina Torlak

From time to time, the PL Enthusiast will publish interviews of “new scientists on the block”: prominent PL researchers who are about to start, or just started, independent research careers in universities or research laboratories. This week, we feature Emina Torlak, who is starting as an assistant professor at the University of Washington in Fall 2014.

Continue reading

2 Comments

Filed under Interviews, New scientists

ADS, generically, part II

This post is the second part of a post on using ideas from programming languages to assist the design of authenticated data structures (ADSs). I will describe a small programming language extension for building ADSs that I co-developed with Andrew Miller, Jonathan Katz, and Elaine Shi, called LambdaAuth. A paper on this language was presented at POPL’14 and the implementation is freely available.

As a word of warning, this post goes into some technical detail. The goal of doing so is to make, in detail, a general point: programming language researchers’ focus on ways of programming (often embodied as language features, patterns, transformations, or analyses), rather than particular programs, yields broadly applicable results, extending outside of the motivating examples or domain. In this case, LambdaAuth embodies a simple mechanism for programming any authenticated data structure, not just one particular kind.

Continue reading

Leave a Comment

Filed under Secure computation, Software Security, Uncategorized

Programming language research for K-16 education

At the recent PLDI conference, Armando Solar-Lezama and I organized a workshop called PLOOC: “Programming Language Tools for Massive Open Online Courses.” The high-level goal of the workshop was to discuss ways in which tools coming out of PL research can be used in K-16 education. Over the years, PL researchers have developed many techniques for automating and simplifying the design and analysis of programs. For the most part, these techniques have targeted the professional programmer. However, techniques developed for industrial code can also be applied to student-written programs in computer science courses.

Continue reading

1 Comment

Filed under Education, MOOCs