My colleague Emery Berger recently pointed me to the paper Single versus Double Blind Reviewing at WSDM 2017. This paper describes the results of a controlled experiment to test the impact of hiding authors’ identities during parts of the peer review process. The authors of the experiment—PC Chairs of the 2017 Web Search and Data Mining (WSDM’17) conference—examined the reviewing behavior of two sets of reviewers for the same papers submitted to the conference. They found that author identities were highly significant factors in a reviewer’s decision to recommend the paper be accepted to the conference. Both the fame of an author and the author’s affiliation were influential. Interestingly, whether the paper had a female author or not was not significant in recommendation decisions. [Update: a different look at the data found a penalty for female authors; see addendum to this post.]
Fairness is blind
I find this study very interesting, and incredibly useful. Many people I have talked to have suggested that we scientifically compare single- with double-blind reviewing (SBR vs. DBR, for short). A common idea is to run one version of a conference as DBR and compare its outcomes to a past version of the conference that used SBR. The problem with this approach is that both the papers under review and the people reviewing them would change between conference iterations. These are potentially huge confounding factors. While the WSDM’17 study is not perfect, it gets past some of these big issues.
In the rest of the post I will summarize the details of the WSDM’17 study and offer some thoughts about its strengths and weaknesses. I think we should attempt more studies like this for other conferences. Continue reading
Last week I attended the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). The event was hosted at Paris 6 which is part of the Sorbonne, University of Paris. It was one of the best POPLs I can remember. The papers are both interesting and informative (you can read them all, for free, from the Open TOC), and the talks I attended were generally of very high quality. (Videos of the talks will be available soon—I will add links to this post.) Also, the attendance hit an all-time high: more than 720 people registered for POPL and/or one of its co-located events.
In this blog post I will highlight a few of my favorite papers at this POPL, as well as the co-located N40AI event, which celebrated 40 years of abstract interpretation. Disclaimer: I do not have time to describe all of the great stuff I saw, and I could only see a fraction of the whole event. So just because I don’t mention something here doesn’t mean it isn’t equally great.
The IEEE Cybersecurity Development (SecDev) Conference is a new conference focused on designing and building systems to be secure. It will be offered for the first time in Boston, MA, on November 3-4, 2016. This event was conceived, and is being organized, by Rob Cunningham; I’m pleased to be the PC Chair.
As stated in the call for papers, this first iteration of the conference is seeking short (5-page) papers, extended (1-page) abstracts, and tutorial proposals. The submission deadline is June 21, 2016 — if you have new results, old results you’d like to repackage, a tool, a process, a vision, or an idea you’d like to share with those working to make systems more secure, please consider submitting a paper!
This blog post explains why I think we need this conference, what I expect the first year to look like, and what sort of papers we hope to get, in question & answer format. Continue reading
[This guest post is by David Walker, a professor at Princeton, and recent winner of the SIGPLAN Robin Milner Young Researcher Award. –Mike]
Every once in a while it is useful to take a step back and consider where fruitful new research directions come from. One such place is from the confluence of two independent streams of thought. This is an idea that I picked up from George Varghese, who gave a wonderful talk on the topic at ACM SIGCOMM 2014 and summarized the ideas in a short paper for CCR. This blog post considers confluences in the context of programming languages research, reflects upon the role such confluences have played in my own research, and suggests some things we might learn from the process. My keynote talk from POPL 2016 touches on many of these same themes.
Last week I attended a multi-day meeting for the DARPA STAC program; I am the PI of a UMD-led team. STAC supports research to develop “Space/time Analysis for Cybersecurity.” More precisely, the goal is to develop tools that can analyze software to find exploitable side channels or denial-of-service attacks involving space usage or running time.
In general, DARPA programs focus on a very specific problem, and so are different from the NSF style of funded research that I’m used to, in which the problem, solution, and evaluation approach are proposed by each investigator. One of STAC’s noteworthy features is its use of engagements, during which research teams use their tools to find vulnerabilities in challenge problems produced by an independent red team. Our first engagement was last week, and I found the experience very compelling. I think that both the NSF style and the DARPA style have benefits, and it’s great that both styles are available.
This post talks about my experience with STAC so far. I discuss the interesting PL research challenges the program presents, the use of engagements, and the opportunities STAC’s organizational structure offers, when done right.
[This post was conceived and co-authored by Andrew Ruef, Ph.D. student at the University of Maryland, working with me. –Mike]
As researchers, we are often asked to look into a crystal ball. We try to anticipate future problems so that work we begin now will help address those problems before they become acute. Sometimes, a researcher guesses the problem and its possible solution, but chooses not to pursue it. In a sense, she has found, and discarded, an idea ahead of its time.
Recently, a friend of Andrew’s pointed him to a 20-year-old email exchange on the “firewalls” mailing list that blithely suggests, and discards, problems and solutions that are now quite relevant, and on the cutting edge of software security research. The situation is both entertaining and instructive, especially in that the ideas are quite squarely in the domain of programming languages research, but were not considered by PL researchers at the time (as far as we know).
Consider this claim
Quality is more important than quantity
I expect few people would disagree with it, and yet we do not always act as if it were true. In Academia, when considering candidates to hire or promote, we count their papers, their citations, their funding, their software download rates, their graduated students, the number of their committee memberships or journal editorships, and more.
Researchers are getting the message: quantity matters. Ugo Bardi proposes the economic underpinnings of this apparent trend, cleverly arguing that scientific papers are currency, subject to phenomena like inflation (more papers!), assaying (peer review validates papers, which support funding proposals, which finance more papers), and counterfeiting (papers published without review by unscrupulous publishers). Moshe Vardi, in a recent blog post, concurs that “we have slid down the slippery path of using quantity as a proxy for quality” and that “the inflationary pressure to publish more and more encourages speed and brevity, rather than careful scholarship.”
In this post I consider the problem of incentivizing, and assessing, research quality, starting with a recent set of guidelines put out by the CRA. I conclude with a set of questions—I hope you will share your opinion. Continue reading
If you are in the world of programming languages research, the announcement that UW had hired Ras Bodik away from Berkeley was big news. Quoting UW’s announcement:
Ras’s arrival creates a truly world-class programming languages group in UW CSE that crosses into systems, databases, security, architecture, and other areas. Ras joins recent hires Emina Torlak, Alvin Cheung, Xi Wang, and Zach Tatlock, and senior faculty members Dan Grossman and Mike Ernst.
And there’s also Luis Ceze, a regular publisher at PLDI, who ought to be considered as part of this group. With him, UW CSE has 8 out of 54 faculty with strong ties to PL. Hiring five PL-oriented faculty in three years, thus making PL a significant fraction of the faculty’s expertise, is (highly) atypical. What motivated UW CSE in its decision-making? I don’t know for sure, but I suspect they see that PL-oriented researchers are making huge inroads on important problems, bringing a useful perspective to unlock new results.
In this post, I argue why studying PL (for your PhD, Masters, or just for fun) can be interesting and rewarding, both because of what you will learn, and because of the increasing opportunities that are available, e.g., in terms of impactful research topics and funding for them.
Anyone familiar with American academia will tell you that the US News rankings of academic programs play an outsized role in this world. Among other things, US News ranks graduate programs of computer science, by their strength in the field at large as well as certain specialties. One of these specialties is Programming Languages, the focus of this blog.
The US News rankings are based solely on surveys. Department heads and directors of graduate studies at a couple of hundred universities are asked to assign numerical scores to graduate programs. Departments are ranked by the average score that they receive.
It’s easy to see that much can go wrong with such a methodology. A reputation-based ranking system is an election, and elections are meaningful only when their voters are well-informed. The worry here is that respondents are not necessarily qualified to rank programs in research areas that are not their own. Also, it is plausible that respondents would give higher scores to departments that have high overall prestige or that they are personally familiar with.
In this post, I propose using publication metrics as an input to a well-informed ranking process. Rather than propose a one-size-fits-all ranking, I provide a web application to allow users to compute their own rankings. This approach has limitations, which I discuss in detail, but I believe it’s a reasonable start to a better system.
Last month, Steven J. Vaughan-Nichols of ZDNet alerted us that Linux 4.0 will provide support for “no-reboot patching.” The gist: When a security patch or other critical OS update comes out, you can apply it without rebooting.
While rebootless patching is convenient for everyone, it’s a game changer for some applications. For example, web and cloud hosting services normally require customers to experience some downtime while the OS infrastructure is upgraded; with rebootless patching, upgrades happen seamlessly. Or, imagine upgrades to systems hosting in-memory databases: Right now, you have to checkpoint the DB to stable storage, stop the system, upgrade it, restart it, read the data from stable storage, and restart service. Just the checkpointing and re-reading from disk could take tens of minutes. With rebootless patching, this disruption is avoided; cf. Facebook’s usage of a modified memcached that supports preserving state across updates.
I’m particularly excited by this announcement because I’ve been working on the general problem of updating running software, which I call dynamic software updating (DSU), for nearly 15 years. In this post, co-authored with my PhD student Luís Pina, I take a closer look at the challenge that DSU presents, showing that what Linux will support is still quite far from what we might hope for, but that ideas from the research community promise to get us closer to the ideal, both for operating systems and hopefully for many other applications as well.