Category Archives: Software Security

Software security is a branch of cybersecurity whose focus is on making software secure, so that it meets its security goals despite adversarial influence.

Evaluating Empirical Evaluations (for Fuzz Testing)

How do we know what we know? That question is the subject of study for the field of epistemology. Per Wikipedia, “Epistemology studies the nature of knowledge, justification, and the rationality of belief.” Science is one powerful means to knowledge. Per the linked Wikipedia article, “Science is viewed as a refined, formalized, systematic, or institutionalized form of the pursuit and acquisition of empirical knowledge.”

Gathering Empirical Evidence

Most people are familiar with the basic scientific method: Pose a hypothesis about the world and then carry out an experiment whose empirical results can either support or falsify the hypothesis (and, inevitably, suggest additional hypotheses).  In PL research we frequently rely on empirical evidence. Compiler optimizations, static and dynamic analyses, program synthesizers, testing tools, memory management algorithms, new language features, and other research developments each depend on some empirical evidence to demonstrate their effectiveness.

A key question for any experiment is: What is the standard of empirical evidence needed to adequately support the hypothesis?

This post is a summary of a paper, “Evaluating Fuzz Testing,” co-authored by George T. Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and me, which will appear this fall at the Conference on Computer and Communications Security (CCS). (I also presented our work at the ISSISP’18 summer school, though the work has matured a bit since then.) It starts to answer the above question for research on fuzz testing (or simply, fuzzing), a process whose goal is to discover inputs that cause a program to crash. We studied the empirical evaluations carried out by 32 research papers on fuzz testing. Looking critically at the evidence gathered by these papers, we find that no paper adheres to a sufficiently high standard of evidence to justify general claims of effectiveness (though some papers get close). We carry out our own experiments to illustrate why failure to meet the standard can produce misleading or incorrect conclusions.
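
To make the setting concrete, below is a minimal sketch (in C, using POSIX process primitives) of the core loop of a black-box fuzzer. It is illustrative only, not any real tool's implementation: real fuzzers such as AFL add input mutation, coverage feedback, crash triage, and much more, and the target path "./target" is a placeholder.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    for (int trial = 0; trial < 1000; trial++) {
        /* Write a random input to a file. */
        FILE *f = fopen("input.bin", "wb");
        if (!f) return 1;
        int len = rand() % 1024;
        for (int i = 0; i < len; i++)
            fputc(rand() % 256, f);
        fclose(f);

        /* Run the target on the input; a crash surfaces as a signal. */
        pid_t pid = fork();
        if (pid == 0) {                      /* child: become the target */
            freopen("input.bin", "r", stdin);
            execl("./target", "target", (char *)NULL);
            _exit(127);                      /* exec failed */
        }
        int status;
        waitpid(pid, &status, 0);
        if (WIFSIGNALED(status))             /* e.g., SIGSEGV or SIGABRT */
            printf("trial %d: crash (signal %d)\n", trial, WTERMSIG(status));
    }
    return 0;
}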

Why have researchers systematically missed the mark here? I think the answer owes in part to the lack of an explicit standard of evidence. Our paper can be a starting point for such a standard for fuzz testing. More generally, several colleagues and I have been working on a checklist for empirical evaluations in PL/SE-style research. We welcome your feedback and participation!

Continue reading

11 Comments

Filed under Process, Science, Software Security, Software Testing

Software Security is a Programming Languages Issue

This is the last of three posts on the course I regularly teach, CS 330, Organization of Programming Languages. The first two posts covered programming language styles and mathematical concepts. This post covers the last 1/4 of the course, which focuses on software security and, related to that, the programming language Rust.

This course topic might strike you as odd: Why teach security in a programming languages course? Doesn’t it belong in, well, a security course? I believe that if we are to solve our security problems, then we must build software with security in mind right from the start. To do that, all programmers need to know something about security, not just a handful of specialists. Security vulnerabilities are both enabled and prevented by various language (mis)features and programming (anti)patterns. As such, it makes sense to introduce these concepts in a programming (languages) course, especially one that all students must take.

This post is broken into three parts: the need for security-minded programming, how we cover this topic in 330, and our presentation of Rust. The post came to be a bit longer than I’d anticipated; apologies!

Security is a programming (languages) concern

Continue reading

45 Comments

Filed under Education, Software Security, Types

POPL’17 in Paris: Some highlights

Last week I attended the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). The event was hosted at Paris 6, which is part of the Sorbonne, University of Paris. It was one of the best POPLs I can remember. The papers are both interesting and informative (you can read them all, for free, from the Open TOC), and the talks I attended were generally of very high quality. (Videos of the talks will be available soon—I will add links to this post.) Also, attendance hit an all-time high: more than 720 people registered for POPL and/or one of its co-located events.

In this blog post I will highlight a few of my favorite papers at this POPL, as well as the co-located N40AI event, which celebrated 40 years of abstract interpretation. Disclaimer: I do not have time to describe all of the great stuff I saw, and I could only see a fraction of the whole event. So just because I don’t mention something here doesn’t mean it isn’t equally great. (I also attended PLMW just before POPL, and gave a talk; I may discuss that in another blog post.)

Continue reading

14 Comments

Filed under Program Analysis, Research, Scientists, Semantics, Software Security, Types

PRNG entropy loss and noninterference

I just got back from CCS (ACM’s conference on Computer and Communications Security) in Vienna, Austria. I really enjoyed the conference and the city.

At the conference we presented our Build it, Break it, Fix it paper. In the same session was the presentation of a really interesting paper called Practical Detection of Entropy Loss in Pseudo-Random Number Generators, by Felix Dörre and Vladimir Klebanov. (You can download the paper for free from the conference’s OpenTOC.) This paper presents a program analysis that aims to find entropy-compromising flaws in PRNG implementations; such flaws could make the PRNG’s outputs more predictable (“entropy” is basically a synonym for “uncertainty”). When a PRNG is used to generate secrets like encryption keys, less entropy means those secrets can be more easily guessed by an attacker.

A PRNG can be viewed as a deterministic function from an actually-random seed to a pseudo-random output. The key insight of the work is that this function should not “lose entropy” — i.e., its outputs should contain all of the randomness present in its seed. Another way of saying this is that the PRNG must be injective. The injectivity of a function can be checked by an automatic program analysis.
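
To make "losing entropy" concrete, here is a hypothetical flawed seed-to-output function in C (an invented illustration, not libgcrypt's actual bug): it discards the high half of its seed, so it is not injective.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical seed-to-output function that loses entropy: it mixes only
 * the low 32 bits of its 64-bit seed into the state, so any two seeds
 * agreeing on those bits yield identical outputs. The function is not
 * injective: its output carries at most 32 of the seed's 64 bits of
 * uncertainty. */
uint64_t prng_output(uint64_t seed) {
    uint32_t state = (uint32_t)seed;         /* BUG: high 32 bits discarded */
    state = state * 1664525u + 1013904223u;  /* one LCG step */
    return state;
}

int main(void) {
    uint64_t s1 = 0x0000000012345678ULL;
    uint64_t s2 = 0xDEADBEEF12345678ULL;     /* differs only in high bits */
    /* Prints the same value twice: distinct seeds, identical "random" output. */
    printf("%llu\n%llu\n", (unsigned long long)prng_output(s1),
                           (unsigned long long)prng_output(s2));
    return 0;
}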

Using the tool they developed, the paper’s authors were able to find known bugs, as well as an important, previously unknown bug in libgcrypt’s PRNG. Their approach was essentially a variation of prior work on checking noninterference; I found the connection to this property surprising and insightful. As discussed in a prior post, security properties can often be challenging to state and verify; in this case the property is pleasingly simple to state and practical to verify.

Both injectivity and noninterference are examples of K-safety properties, which involve relations between multiple (i.e., K) program executions. New technology to check such properties is being actively researched, e.g., in DARPA’s STAC program. This paper is another example of the utility of that general thrust of work, and of using programming languages techniques (automated program analysis) for proving security properties.
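
To sketch that connection concretely: injectivity mentions two program runs, and the standard trick of self-composition reduces checking it to checking an assertion in a single program built from two copies of the code. In the sketch below, prng_output is the hypothetical flawed function from the earlier example.

#include <assert.h>
#include <stdint.h>

uint64_t prng_output(uint64_t seed);   /* the function under analysis */

/* Self-composition: injectivity relates TWO executions (it is 2-safety),
 * but composing two copies of the code into one program turns it into an
 * ordinary single-run assertion. A verifier tries to prove the assertion
 * holds for all pairs of seeds; a test can only probe particular pairs. */
void check_injective(uint64_t seed1, uint64_t seed2) {
    uint64_t out1 = prng_output(seed1);    /* first "execution"  */
    uint64_t out2 = prng_output(seed2);    /* second "execution" */
    if (seed1 != seed2)
        assert(out1 != out2);              /* no entropy lost */
}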

Continue reading

5 Comments

Filed under Program Analysis, Software Security

Expressing Security Policies

[This blog post was conceived by Steve Chong, at Harvard, and co-authored with Michael Hicks.]

Enforcing information security is increasingly important as more of our sensitive data is managed by computer systems. We would like our medical records, personal financial information, social network data, etc. to be “private,” which is to say we don’t want the wrong people looking at it. But while we might have an intuitive idea about who the “wrong people” are, if we are to build computer systems that enforce the confidentiality of our private information, we have to turn this intuition into an actionable policy.

Defining what exactly it means to “handle private information correctly” can be subtle and tricky. This is where programming language techniques can help us, by providing formal semantic models of computer systems within which we can define security policies for private information. That is, we can use formal semantics to precisely characterize what it means for a computer system to handle information in accordance with the security policies associated with sensitive information. Defining security policies is still a difficult task, but using formal semantics allows us to give security policies an unambiguous interpretation and to explicate the subtleties involved in “handling private information correctly.”
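
As a tiny, invented illustration of why precision matters: the following C function never prints the secret directly, and yet an observer of its public output can still learn something about the secret. A formal policy such as noninterference pins down exactly why this counts as a leak.

#include <stdio.h>

/* Toy policy violation, with invented names: suppose diagnosis_code is
 * secret ("high") and anything printed is public ("low"). The secret
 * influences the branch, and the branch influences the public output: an
 * implicit flow. Noninterference is violated, since varying the secret
 * changes what an observer of public output sees. */
void bill_patient(int diagnosis_code, int base_fee) {
    int fee = base_fee;
    if (diagnosis_code == 42)         /* secret steers control flow... */
        fee += 500;
    printf("amount due: %d\n", fee);  /* ...and the bill leaks the secret */
}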

In this blog post, we’ll focus on the intuition behind some security policies for private information. We won’t dig deeply into formal semantics for policies, but we provide links to relevant technical papers at the end of the post. Later, we also briefly mention how PL techniques can help enforce policies of the sort we discuss here.

Continue reading

9 Comments

Filed under Semantics, Software Security

SecDev: Bringing Security Innovation Into Design & Development

The IEEE Cybersecurity Development (SecDev) Conference is a new conference focused on designing and building systems to be secure. It will be offered for the first time in Boston, MA, on November 3-4, 2016. This event was conceived, and is being organized, by Rob Cunningham; I’m pleased to be the PC Chair.

As stated in the call for papers, this first iteration of the conference is seeking short (5-page) papers, extended (1-page) abstracts, and tutorial proposals. The submission deadline is June 21, 2016 — if you have new results, old results you’d like to repackage, a tool, a process, a vision, or an idea you’d like to share with those working to make systems more secure, please consider submitting a paper!

This blog post explains why I think we need this conference, what I expect the first year to look like, and what sort of papers we hope to get, in question & answer format. Continue reading

Leave a Comment

Filed under Process, Research, Software Security

Interview with Matt Might, Part 2

Matt at the White House, Jan 2015

This post is the second part of my March 10th interview of Matt Might, a PL researcher and Associate Professor in the Department of Computer Science at the University of Utah.

In Part I, we talked about Matt’s academic background, his PL research (including his favorite among the papers he’s written), and his work on understanding and treating rare disease, which began with the quest to diagnose his son Bertrand, and has led to a role in the President’s Initiative on Precision Medicine.

In this post, our conversation continues, covering the topics of blogging, privacy, managing a crazy schedule, and looking ahead to promising PL research directions. Continue reading

Leave a Comment

Filed under Bioinformatics, Interviews, Language adoption, Probabilistic programming, Program Analysis, Scientists, Software Security, Types

DARPA STAC: Challenge-driven Cybersecurity Research

Last week I attended a multi-day meeting for the DARPA STAC program; I am the PI of a UMD-led team. STAC supports research to develop “Space/time Analysis for Cybersecurity.” More precisely, the goal is to develop tools that can analyze software to find exploitable side channels or denial-of-service attacks involving space usage or running time.
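
For a concrete sense of what such a vulnerability looks like, here is the classic early-exit comparison in C (an illustration of the general class, not an actual STAC challenge problem).

#include <stddef.h>

/* Classic timing side channel: the comparison returns as soon as a byte
 * mismatches, so running time reveals how long a correct prefix the guess
 * shares with the secret, letting an attacker recover the secret byte by
 * byte. A constant-time version would accumulate the differences over all
 * bytes instead of exiting early. */
int check_secret(const char *guess, const char *secret, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (guess[i] != secret[i])
            return 0;    /* early exit: time depends on the secret */
    return 1;
}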

In general, DARPA programs focus on a very specific problem, and so are different from the NSF style of funded research that I’m used to, in which the problem, solution, and evaluation approach are proposed by each investigator. One of STAC’s noteworthy features is its use of engagements, during which research teams use their tools to find vulnerabilities in challenge problems produced by an independent red team. Our first engagement was last week, and I found the experience very compelling. I think that both the NSF style and the DARPA style have benefits, and it’s great that both styles are available.

This post talks about my experience with STAC so far. I discuss the interesting PL research challenges the program presents, the use of engagements, and the opportunities STAC’s organizational structure offers, when done right.

Continue reading

1 Comment

Filed under Process, Program Analysis, Research, Science, Software Security

Software Security Ideas Ahead of Their Time

[This post was conceived and co-authored by Andrew Ruef, Ph.D. student at the University of Maryland, working with me. –Mike]

As researchers, we are often asked to look into a crystal ball. We try to anticipate future problems so that work we begin now will help address those problems before they become acute. Sometimes, a researcher guesses the problem and its possible solution, but chooses not to pursue it. In a sense, she has found, and discarded, an idea ahead of its time.

Recently, a friend of Andrew’s pointed him to a 20-year-old email exchange on the “firewalls” mailing list that blithely suggests, and discards, problems and solutions that are now quite relevant, and on the cutting edge of software security research. The situation is both entertaining and instructive, especially in that the ideas are quite squarely in the domain of programming languages research, but were not considered by PL researchers at the time (as far as we know).

Continue reading

18 Comments

Filed under PL in practice, Research, Research directions, Software Security

From ‘Penetrate and Patch’ to ‘Building Security In’

This year I was pleased to be named one of U. Maryland’s Distinguished Scholar-Teachers (DSTs). This recognition, awarded to a few UMD faculty each year, is given to those who have shown success both in teaching and research. I put a lot of energy into both of these activities, so it was a great feeling to be recognized as a DST.

One of the consequences of accepting the award is that you must give a lecture about your research/interests to a general audience. I gave my talk, titled From ‘Penetrate and Patch’ to ‘Building Security In’, on Monday.

My Department Chair, Samir Khuller, a DST himself, told me that I should aim the talk at an eighth-grade level, i.e., an audience with only a cursory understanding of computer science. But of course it’s not quite that simple: only some people who attend will be at that level; many will have a stronger background, because they will be interested in the topic. So as I was preparing my talk last week, I tried to make it so the generalists would not get lost and the specialists would not get bored.

The point of my talk is that our cybersecurity woes are often (but not always) due to vulnerable software. While firewalls, anti-virus, and other security products stem the tide of attacks, these products do not address the root problem. Once software vulnerabilities are discovered they can be patched, but this “penetrate and patch” approach is not working: unpatched systems remain vulnerable, and even when they are patched, other latent vulnerabilities probably remain. “Penetrate and patch” also doesn’t address the new vulnerabilities that are introduced as the software evolves.

So we need to shift our mentality to building security in: We should aim to build software that is free of vulnerabilities (or far more likely to be free of them) right from the start.

Building bridges: the Fuling Wujiang Bridge

To get this idea across to a general audience I used bridge-building as a motivation: We use the best designs, methods, and tools to build bridges that stand up to heavy use and extreme conditions. Then I talked about what software is — basically how it works — and how some software bugs can be exploited to deleterious effect. I showed, at least at a high level, how a buffer overflow works. Then I showed how language design and other PL-style research products are analogous to the best tools and methods of bridge-building, and can therefore help us avoid buffer overflows and other problems. I also described how, through my Coursera software security class and the build-it, break-it, fix-it contest, I am trying to encourage this mentality of building secure software from the start, not just leaving security to the last. (The next iteration of the contest starts Thursday, October 1; it's not too late to sign up!)
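
For readers who want slightly more detail than the talk gave, here is a minimal C example (with invented names) of the kind of buffer overflow the talk illustrated at a high level.

#include <stdio.h>
#include <string.h>

/* A minimal stack buffer overflow. strcpy() performs no bounds check, so
 * a name longer than 15 characters writes past the end of buf, corrupting
 * adjacent stack memory such as the saved return address; a crafted input
 * can thereby hijack control flow. Memory-safe languages rule this class
 * of bug out by construction. */
void greet(const char *name) {
    char buf[16];
    strcpy(buf, name);      /* BUG: unbounded copy into buf[16] */
    printf("Hello, %s!\n", buf);
}

int main(int argc, char **argv) {
    if (argc > 1)
        greet(argv[1]);     /* attacker-controlled input */
    return 0;
}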

I am pretty pleased with how it turned out. Because I had to account for a broad audience, I spent a lot of time on the talk — probably as much as I did on my tenure/promotion talk! My in-laws were in attendance, and they told me they understood things pretty well, and that the talk put the trajectory of recent security breaches in perspective.

A video of the talk and slides is available here (the talk proper starts at about the 3-minute mark):

https://www.cs.umd.edu/event/2015/09/penetrate-and-patch-building-security

The audio isn’t great, and the slides are a little hard to see (but there’s a link to the PDF), but I think it’s watchable. I’d be very curious for your feedback. I hope you will share the link with friends, tech-savvy or not, who might wonder what this cybersecurity stuff is all about, and how PL research and methods can play a important role in addressing it.

8 Comments

Filed under Education, Software Security