Category Archives: Research

by Michael Hicks | June 30, 2021 · 5:00 am

BullFrog: Online Schema Migration, On Demand

I did my PhD on a topic I called dynamic software updating (DSU), a process by which a running application is updated with new functionality, whether to add features or fix bugs, without shutting it down. As a faculty member, I supervised several PhD students on DSU projects. These considered the semantics of DSU and ways of reasoning about and/or testing a dynamic update’s correctness (including while it’s deployed), and ways of implementing DSU, using compilation, libraries, and/or code rewriting. All of this work resulted in what are, as far as I’m aware, still the most full-featured and efficient implementations of DSU for C and Java, to date.

While DSU handles the update of long-running, single-process applications, many long-running applications also involve a database management system (DBMS) to store persistent application data. For example, an online market will have a front end to present the user interface, but the market’s inventory, purchase log, user reviews, etc. will be stored in the back-end database. As such, a single logical change to an application could well involve individual changes to both the front-end code and the contents and format of the database. Maybe our upgraded market now provides access to an item’s price history, which is implemented by extending the DB schema and by adding front-end functionality to query/access this information. To realize this upgrade dynamically, we need to change the application and the database, in one logical step.

In a paper presented at SIGMOD this month, we describe Bullfrog, a new DBMS that supports online schema updates in a way that enables whole-application upgrades. An application change can be applied by DSU for the front-end instances and by Bullfrog for the back-end DB. A key feature of Bullfrog is that on the one hand, the schema change is immediate, which simplifies front-end/back-end coordination of the update, especially when schema changes are backward incompatible. On the other hand, data migration to the new schema is lazy, as the application demands it. Lazy migration avoids a potentially lengthy update-time pause, which would result in loss of availability, defeating the
whole point of DSU. There were lots of challenges to realizing the lazy updating model. I give a flavor of the approach here; the paper has the details. Continue reading →

5 Comments

Filed under Research, Systems

Tagged as database management system, dynamic software updating, online schema migration

by Michael Hicks | July 29, 2020 · 8:00 am

Increasing the Impact of PL Research

[This article is cross-posted on PL Perspectives, the SIGPLAN blog.]

Programming languages research has been going on since (at least) the first general-purpose language compilers were developed in the 1950s and it still has a lot to offer to today’s pressing problems. Indeed, we might right now be in another golden age of programming language design, what with Rust poised to become a major systems programming language; with TypeScript reimagining yet co-existing with JavaScript; with WebAssembly trending towards a high-performant yet safe mobile code platform; and even classic languages like C++ undergoing many major design revisions. Even libraries like TensorFlow and PyTorch and languages like Julia are turning machine learning specialists into language designers and compiler writers.

Even in this vibrant environment, my sense is that the size of the PL research community, and the impact of its research, is lower than it should be. My social network tells me that grad school applications signaling PL interest are declining, and while many researchers have won Turing awards for PL ideas these are becoming fewer and further between. Compare this state of affairs to that in the machine learning and security communities, which are growing rapidly in size and stature. Is there anything the PL community can do to increase the impact of its great work?

My recommendation is a concentrated effort to diversify PL research enthusiasts, and through them broaden the impact of PL-minded work.

We in PL can expand our tent. Education and outreach can help others to see that PL—its problems, methods, and ethos—is different and more exciting than they realized. We can lower the barrier to entry by engaging in a little housecleaning around expectations of core knowledge. We can also venture outside our tent, taking our knowledge and ideas to join other communities and address their problems. All of these steps will follow naturally from a focus on collaborative efforts attacking substantial problems, such as deployable AI or a quantum programming stack, the solution to which involves PL techniques, but many others besides. Continue reading →

3 Comments

Filed under Process, Research

Tagged as collaboration, impact, Publication process

by Michael Hicks | February 4, 2019 · 1:00 pm

“What is PL Research?” The Talk

I was invited to give a talk at the Programming Languages Mentoring Workshop (PLMW) colocated with POPL’19 in January. The talk topic was What is programming languages research? I was excited to give this talk. It’s a topic I’ve thought a lot about over the years; in 2015 I wrote a blog post about it. Shortly thereafter I was elected SIGPLAN Chair, and over the ensuing three years came to know the exciting depth and breadth of the field even more deeply.

Like the blog post, the talk presents what I view as the goals, ethos, and benefits of PL research. Because the PLMW audience is senior undergraduates and early graduate students, the talk also presents an overview of PL as a field. In particular, it presents a tutorial of sorts of the areas and methods that PL researchers often develop and employ. To capture what these are, I skimmed a sampling of the conference proceedings of PLDI and POPL from the last 30 years. Doing so, I abstracted the “shape” of a PL research paper, and identified the broad areas PL researchers tend to focus on. The talk presents a flavor of these areas. Because the talk took place just before POPL, I focused most on topics that appear in POPL-published research; the talk highlights particular POPL’19 papers as examples.

Several people afterward told me that they enjoyed the talk and asked about whether a video of the talk might be available. Unfortunately, the talks were not recorded. SIGPLAN main conference talks are regularly video-recorded, but workshops and co-located events are hit and miss. I certainly understand the financial reasons for this situation. Nevertheless, it’s really too bad that PLMW talks are not recorded. In my experience, PLMW speakers put an exceptional amount of time and care into their talks, so they are often very well done. The talks also target a general audience, so they are potentially valuable to many more people than just those attending the actual event.

In the hopes that others might find it useful, I decided to video-record myself giving the PLMW talk. The recording is not great, but I hope that fact doesn’t get in the way of the conveying the content. If it does, maybe just the slide deck will prove useful. If you have comments or thoughts, I’m glad to hear them!

Many thanks to the organizers of PLMW@POPL’19 for a great event, and the opportunity to speak!

The video. The slides.

12 Comments

Filed under Abstract interpretation, Process, Program Analysis, Research, Semantics, Types

Tagged as PL research

by Michael Hicks | February 10, 2017 · 2:00 pm

Measuring Single vs. Double-blind Reviewing

My colleague Emery Berger recently pointed me to the paper Single versus Double Blind Reviewing at WSDM 2017. This paper describes the results of a controlled experiment to test the impact of hiding authors’ identities during parts of the peer review process. The authors of the experiment—PC Chairs of the 2017 Web Search and Data Mining (WSDM’17) conference—examined the reviewing behavior of two sets of reviewers for the same papers submitted to the conference. They found that author identities were highly significant factors in a reviewer’s decision to recommend the paper be accepted to the conference. Both the fame of an author and the author’s affiliation were influential. Interestingly, whether the paper had a female author or not was not significant in recommendation decisions. [Update: a different look at the data found a penalty for female authors; see addendum to this post.]

Fairness is blind

I find this study very interesting, and incredibly useful. Many people I have talked to have suggested that we scientifically compare single- with double-blind reviewing (SBR vs. DBR, for short). A common idea is to run one version of a conference as DBR and compare its outcomes to a past version of the conference that used SBR. The problem with this approach is that both the papers under review and the people reviewing them would change between conference iterations. These are potentially huge confounding factors. While the WSDM’17 study is not perfect, it gets past some of these big issues.

In the rest of the post I will summarize the details of the WSDM’17 study and offer some thoughts about its strengths and weaknesses. I think we should attempt more studies like this for other conferences. Continue reading →

5 Comments

Filed under Process, Research, Science

Tagged as double-blind review, empirical study

by Michael Hicks | January 26, 2017 · 1:30 pm

POPL’17 in Paris: Some highlights

Last week I attended the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). The event was hosted at Paris 6 which is part of the Sorbonne, University of Paris. It was one of the best POPLs I can remember. The papers are both interesting and informative (you can read them all, for free, from the Open TOC), and the talks I attended were generally of very high quality. (Videos of the talks will be available soon—I will add links to this post.) Also, the attendance hit an all-time high: more than 720 people registered for POPL and/or one of its co-located events.

In this blog post I will highlight a few of my favorite papers at this POPL, as well as the co-located N40AI event, which celebrated 40 years of abstract interpretation. Disclaimer: I do not have time to describe all of the great stuff I saw, and I could only see a fraction of the whole event. So just because I don’t mention something here doesn’t mean it isn’t equally great.[ref]I also attended PLMW just before POPL, and gave a talk. I may discuss that in another blog post.[/ref]

Continue reading →

14 Comments

Filed under Program Analysis, Research, Scientists, Semantics, Software Security, Types

by Michael Hicks | April 26, 2016 · 12:40 pm

SecDev: Bringing Security Innovation Into Design & Development

The IEEE Cybersecurity Development (SecDev) Conference is a new conference focused on designing and building systems to be secure. It will be offered for the first time in Boston, MA, on November 3-4, 2016. This event was conceived, and is being organized, by Rob Cunningham; I’m pleased to be the PC Chair.

As stated in the call for papers, this first iteration of the conference is seeking short (5-page) papers, extended (1-page) abstracts, and tutorial proposals. The submission deadline is June 21, 2016 — if you have new results, old results you’d like to repackage, a tool, a process, a vision, or an idea you’d like to share with those working to make systems more secure, please consider submitting a paper!

This blog post explains why I think we need this conference, what I expect the first year to look like, and what sort of papers we hope to get, in question & answer format. Continue reading →

Confluences in Programming Languages Research

[This guest post is by David Walker, a professor at Princeton, and recent winner of the SIGPLAN Robin Milner Young Researcher Award. –Mike]

Every once in a while it is useful to take a step back and consider where fruitful new research directions come from. One such place is from the confluence of two independent streams of thought. This is an idea that I picked up from George Varghese, who gave a wonderful talk on the topic at ACM SIGCOMM 2014 and summarized the ideas in a short paper for CCR.[ref]George Varghese. Life in the Fast Lane: Viewed from the Confluence Lens. ACM SIGCOMM Computer Communication Review 45 (1), pp 19-25, January 2015. (link)[/ref] This blog post considers confluences in the context of programming languages research, reflects upon the role such confluences have played in my own research, and suggests some things we might learn from the process. My keynote talk from POPL 2016[ref]David Walker. Confluences in Programming Languages Research (Keynote). ACM SIGPLAN Symposium on Principles of Programming Languages. pp. 4-4, January 2016. (abstract, video, slides)[/ref] touches on many of these same themes.

Continue reading →

DARPA STAC: Challenge-driven Cybersecurity Research

Last week I attended a multi-day meeting for the DARPA STAC program; I am the PI of a UMD-led team. STAC supports research to develop “Space/time Analysis for Cybersecurity.” More precisely, the goal is to develop tools that can analyze software to find exploitable side channels or denial-of-service attacks involving space usage or running time.

In general, DARPA programs focus on a very specific problem, and so are different from the NSF style of funded research that I’m used to, in which the problem, solution, and evaluation approach are proposed by each investigator. One of STAC’s noteworthy features is its use of engagements, during which research teams use their tools to find vulnerabilities in challenge problems produced by an independent red team. Our first engagement was last week, and I found the experience very compelling. I think that both the NSF style and the DARPA style have benefits, and it’s great that both styles are available.

This post talks about my experience with STAC so far. I discuss the interesting PL research challenges the program presents, the use of engagements, and the opportunities STAC’s organizational structure offers, when done right.

Continue reading →

1 Comment

Filed under Process, Program Analysis, Research, Science, Software Security

Tagged as complexity attack, side channel, timing channel

by Michael Hicks | February 1, 2016 · 2:00 pm

Software Security Ideas Ahead of Their Time

[This post was conceived and co-authored by Andrew Ruef, Ph.D. student at the University of Maryland, working with me. –Mike]

As researchers, we are often asked to look into a crystal ball. We try to anticipate future problems so that work we begin now will help address those problems before they become acute. Sometimes, a researcher guesses the problem and its possible solution, but chooses not to pursue it. In a sense, she has found, and discarded, an idea ahead of its time.

Recently, a friend of Andrew’s pointed him to a 20-year-old email exchange on the “firewalls” mailing list that blithely suggests, and discards, problems and solutions that are now quite relevant, and on the cutting edge of software security research. The situation is both entertaining and instructive, especially in that the ideas are quite squarely in the domain of programming languages research, but were not considered by PL researchers at the time (as far as we know).

Continue reading →

18 Comments

Filed under PL in practice, Research, Research directions, Software Security

Tagged as ASLR, bounds checking, buffer overflow, CFI, randomization, software diversity, stack canary

by Michael Hicks | November 5, 2015 · 2:00 pm

Promoting Research Quality

Consider this claim

Quality is more important than quantity

I expect few people would disagree with it, and yet we do not always act as if it were true. In Academia, when considering candidates to hire or promote, we count their papers, their citations, their funding, their software download rates, their graduated students, the number of their committee memberships or journal editorships, and more.

Researchers are getting the message: quantity matters. Ugo Bardi proposes the economic underpinnings of this apparent trend, cleverly arguing that scientific papers are currency, subject to phenomena like inflation (more papers!), assaying (peer review validates papers, which support funding proposals, which finance more papers), and counterfeiting (papers published without review by unscrupulous publishers). Moshe Vardi, in a recent blog post, concurs that “we have slid down the slippery path of using quantity as a proxy for quality” and that “the inflationary pressure to publish more and more encourages speed and brevity, rather than careful scholarship.”[ref]Update 8/21/2016: As more evidence of the problem, here’s a great retrospective from the editor of a top journal in sociology points to quantity greatly devaluing quality.[/ref]

In this post I consider the problem of incentivizing, and assessing, research quality, starting with a recent set of guidelines put out by the CRA. I conclude with a set of questions—I hope you will share your opinion. Continue reading →

24 Comments

Filed under Process, Research, Science

Tagged as citations, CRA, research quality

Category Archives: Research

BullFrog: Online Schema Migration, On Demand

Increasing the Impact of PL Research

“What is PL Research?” The Talk

Measuring Single vs. Double-blind Reviewing

POPL’17 in Paris: Some highlights

SecDev: Bringing Security Innovation Into Design & Development

Confluences in Programming Languages Research

DARPA STAC: Challenge-driven Cybersecurity Research

Software Security Ideas Ahead of Their Time

Promoting Research Quality

Recent Posts

Subscribe to Blog via Email

Recent Comments

Archives

Categories

Wordpress.org

Category Archives: Research

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Recent Posts

Subscribe to Blog via Email

Recent Comments

Archives

Categories

Wordpress.org