In this post I interview Russ Cox and Sameer Ajmani, who work at Google on the Go programming language. They share with me their path to working on the language, what they find unique and valuable about it, and plans for it going ahead.
This continues our series on PhDs in industry working on programming languages (Avik Chaudhuri was the first). Thanks to Russ and Sameer for taking the time to share their experiences!
What is your academic background?
Russ: I completed bachelor's and master's degrees in computer science at Harvard, and then I completed a PhD at MIT, working in the Parallel and Distributed Operating Systems group led by Frans Kaashoek and Robert Morris. When I arrived at MIT, the distributed systems community had started focusing on peer-to-peer networks, and so I spent a few years on related topics. For my PhD thesis I looked into the problem of how to design an extension mechanism for compilers so that extension plugins could reasonably coexist and be composed.
Sameer: I completed my bachelor's in Computer Science at Cornell (1994-1998), and completed a Master's and PhD at MIT between 1998 and 2004. My advisor at MIT was Barbara Liskov. My Master's research considered A Trusted Execution Platform for Multiparty Computation, and for my PhD I worked on Automatic Software Upgrades for Distributed Systems. I joined Google in 2004, and the Go team in 2012.
Russ: I grew up near Bell Labs, and during high school and college I had the good fortune to be able to spend time there hanging out in the computer science department, the birthplace of Unix and C but also lex, yacc, and the Dragon Book. At some level, knowing how to put together a language (little or big) and compiler was just part of the culture. That almost certainly led to my interest in making nicer programming environments for myself.
What is the origin story for Go, and your involvement in its development?
Russ: The first problem is that networked systems are getting larger, with more interaction between more pieces, and existing languages were not making those interactions as smooth to express as they could be. The solution was to adopt the model of Hoare’s Communicating Sequential Processes (CSP). That may sound risky, but Rob and Ken had experience with a sequence of languages done at Bell Labs that used CSP to good effect.
The second problem is that programs are getting larger and development more distributed. A language that works very well for a small group may not work as well for a large company with thousands of engineers working on many millions of lines of code. When you’ve got that many engineers working together, you want to make sure the language doesn’t have dusty corners that only a few people know how to use well. A code base that size is so large that even maintenance requires mechanical help, and existing languages weren’t designed with that in mind. Even mundane issues like how many files must be read in order to build a program matter at that scale. All of those considerations, and more, led to the idea of hosting CSP in a new language instead of trying to bolt it onto an existing one.
I was finishing my PhD in spring 2008 and visited Google. I had worked with Rob at Bell Labs, and both he and Ken told me about Go, and I was hooked. When I joined the team in August, the language was still just a prototype, with almost no library. I took over the compiler and runtime, and I got to help develop the standard library and all the revisions and refinements to the language prompted by that experience. Today, Rob and I lead the overall Go project at Google together.
Sameer: In early 2011, I attended a workshop given by Rob Pike about Go at Google NYC and was particularly impressed with the language’s concurrency features. I was working on the indexing pipeline for Google Maps at the time and writing code that dealt with replicated storage systems. I had a library in C++ to issue storage operations to F+1 replicas and fail over to additional replicas as operations failed or timed out; it was about 700 lines. I converted this code to Go and it was under 100 lines and far more comprehensible. I was sold. I started spending 20% of my time improving Go’s libraries within Google. The team noticed, and in Fall of 2011 invited me to join Go full time. I joined the team in January 2012. At that point, the language was almost finalized; Go 1.0 was released March 28, 2012. My role was to make Go useful for building production systems inside Google.
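To give a feel for the kind of code Sameer describes, here is a rough sketch of one possible reading of "issue operations to F+1 replicas and fail over as they fail or time out". It is not Sameer's library: the `op` type, the `writeQuorum` function, its parameters, and `errUnavailable` are all hypothetical names made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// op is a hypothetical storage operation against one replica.
type op func() error

// errUnavailable is a stand-in failure used only by the demo below.
var errUnavailable = errors.New("replica unavailable")

// writeQuorum runs operations until `need` of them succeed (need = F+1 to
// tolerate F failures). It starts `need` operations at once and starts one
// more replica each time an operation fails or times out.
func writeQuorum(replicas []op, need int, timeout time.Duration) error {
	// Buffered so that goroutines finishing late never block.
	results := make(chan error, len(replicas))
	next := 0

	// start launches the next unused replica, if any remain.
	start := func() bool {
		if next >= len(replicas) {
			return false
		}
		r := replicas[next]
		next++
		go func() {
			done := make(chan error, 1)
			go func() { done <- r() }()
			select {
			case err := <-done:
				results <- err
			case <-time.After(timeout):
				results <- errors.New("timed out")
			}
		}()
		return true
	}

	inFlight := 0
	for i := 0; i < need; i++ {
		if start() {
			inFlight++
		}
	}

	succeeded := 0
	for inFlight > 0 {
		err := <-results
		inFlight--
		if err == nil {
			succeeded++
			if succeeded == need {
				return nil
			}
			continue
		}
		// Fail over to an additional replica.
		if start() {
			inFlight++
		}
	}
	return fmt.Errorf("only %d of %d operations succeeded", succeeded, need)
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errUnavailable }
	// Tolerate F = 1 failure: two operations must succeed.
	err := writeQuorum([]op{ok, bad, ok}, 2, time.Second)
	fmt.Println("error:", err)
}
```

The failover logic lives in a single loop that receives from one channel, which is roughly why this style of code tends to stay short.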
In your view, what are some key things that distinguish Go from prior languages?
Russ: The most obvious thing that distinguishes Go is the focus on CSP, with lightweight threads of control (we call them goroutines) communicating by passing messages on channels. That was not a common model at all when Go launched. Erlang is the closest, but even in Erlang there are no explicit channels (it is more like the original CSP paper than Hoare’s followup work).
Sameer: Most mainstream languages provide concurrency support using libraries. Having it built into the language makes concurrent code easier to read and write (for example, see the use of closures in the Digesting a tree example), and the compiler and runtime (scheduler, GC) can make concurrent programs execute well.
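As a small illustration of the model Russ and Sameer describe, the sketch below starts one goroutine per input using a closure and collects the results over a channel. The function name `sumSquares` is made up for this example and does not come from the interview.

```go
package main

import "fmt"

// sumSquares starts one goroutine per number. Each goroutine sends its
// result on a shared channel, and the caller receives exactly one value
// per goroutine it started.
func sumSquares(nums []int) int {
	results := make(chan int)
	for _, n := range nums {
		// The closure takes n as an argument so each goroutine
		// works on its own copy.
		go func(n int) {
			results <- n * n
		}(n)
	}

	total := 0
	for range nums {
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // prints 30
}
```

No locks appear in the code: the channel both transfers each value and synchronizes the sender with the receiver.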
Relatedly: What kind of application is a great fit for Go, but may not be for other languages?
Sameer: Go is a great fit for concurrent programs, especially servers. Request handlers in servers are mostly independent and so are naturally expressed as separate threads of execution. Go provides goroutines, which are extremely lightweight threads (4KB stacks that grow as needed). It’s common to have programs with hundreds of thousands of goroutines. Go provides channels to allow goroutines to communicate (synchronize and exchange data values) and a select statement to allow goroutines to wait on multiple communication events.
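The select statement Sameer mentions can be sketched as follows. This is a minimal example, assuming a hypothetical helper `firstOf` that waits on two channels and a timer at the same time; none of these names come from the interview.

```go
package main

import (
	"fmt"
	"time"
)

// firstOf waits on several communication events at once: a value on a,
// a value on b, or the timeout expiring. It reports whether a value
// arrived before the timeout.
func firstOf(a, b <-chan string, timeout time.Duration) (string, bool) {
	select {
	case v := <-a:
		return v, true
	case v := <-b:
		return v, true
	case <-time.After(timeout):
		return "", false
	}
}

func main() {
	a := make(chan string, 1)
	b := make(chan string, 1)

	// A goroutine standing in for a request handler that replies on b.
	go func() { b <- "reply from b" }()

	v, ok := firstOf(a, b, time.Second)
	fmt.Println(v, ok)

	// Nothing is sent this time, so the timeout case is chosen.
	v, ok = firstOf(a, b, 10*time.Millisecond)
	fmt.Println(v, ok)
}
```

If several cases are ready at once, select picks one of them at random, so code should not depend on the order in which the cases are written.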
Russ: I agree with Sameer: Because of the strong support for concurrency, I think Go is a great fit for network clients or servers that are dealing with many different input sources or other events happening all at once. And because of the focus on software engineering scale, I also think Go is a great fit for any program that’s going to be worked on by more than a handful of engineers or grow to more than a few thousand lines of code. Now that the Go compiler has been moved from C to Go, I can finally say that I use Go for essentially all my day-to-day programming. It’s wonderful.
How has the research literature in programming languages influenced Go’s design (both positively and negatively), if at all?
Go is more an engineering project than a pure research project. Like most engineering, it is fundamentally conservative, using ideas that are proven and well understood and will work well together. The research literature’s influence comes mainly through experience with its application in earlier languages. For example, the experience with CSP applied in a handful of earlier languages—Promela, Squeak, Newsqueak, Alef, Limbo, even Concurrent ML—was just as crucial as Hoare’s original paper to bringing that idea to practice in Go. Programming language researchers are sometimes disappointed that Go hasn’t picked up more of the recent ideas from the literature, but those ideas simply haven’t had time to pass through the filter of practical experience.
What are some of the features of Go that are important in practice but undervalued in the research literature?
Russ: I think programming language researchers sometimes underestimate the practical importance of simplicity and predictability in a language feature, especially in incorrect programs.
To take one example, Hindley-Milner type inference is very well understood and I think generally accepted in the programming language community as a good feature. But new ML programmers inevitably hit the situation where changing one part of a program causes a mysterious type error in a seemingly unrelated part. The solution is that in practice one writes type signatures for most functions. Hindley-Milner type inference is beautiful research, but it’s not as beautiful in practice. In Go, the only type inference is that if you say var x = e, the type of x comes from e. This rule is simple and predictable and doesn’t suffer from the kinds of “spooky action at a distance” of more complex type inference algorithms. I miss having more complex inference sometimes, but Go’s simple rule handles at least 90% of the cases you care about, with none of the unpredictability or mystery.
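The rule Russ describes is easy to show in a few lines. In this sketch (the function name `inferredTypes` is made up for illustration), each variable's type comes only from the expression on the right-hand side of its own declaration.

```go
package main

import "fmt"

// inferredTypes declares variables without type annotations and reports
// the type the compiler gave each one, using the %T formatting verb.
func inferredTypes() []string {
	var n = 42                   // int
	var ratio = 1.5              // float64
	var name = "gopher"          // string
	var pending = make(chan int) // chan int

	return []string{
		fmt.Sprintf("%T", n),
		fmt.Sprintf("%T", ratio),
		fmt.Sprintf("%T", name),
		fmt.Sprintf("%T", pending),
	}
}

func main() {
	fmt.Println(inferredTypes()) // prints [int float64 string chan int]
}
```

Nothing elsewhere in the program can change these types, which is the predictability Russ is pointing to.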
Sameer: The focus on software engineering aspects is just as important and often overlooked. For example, Go is amenable to machine transformation. We’ve used this to enforce automatic formatting and automatically update the import lines for a source file. As a result, Go programmers just write their code, then a tool updates their import lines as needed to pull in whatever packages are needed. We are working on new tools to automatically simplify code and change function signatures to plumb request-scoped data. These features allow Go to scale to large code base sizes and large teams and allow us to continue improving the quality of existing code even after the developers have moved on to other projects.
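A small taste of the machine transformation Sameer mentions: the standard library's go/format package exposes the same formatting that the gofmt tool applies. The sketch below covers formatting only, not import rewriting, and the helper name `tidy` is made up for illustration.

```go
package main

import (
	"fmt"
	"go/format"
)

// tidy parses Go source text and returns it in the standard format.
// It returns an error if the source does not parse.
func tidy(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	messy := "package main\nfunc  add(a int,b int)int{return a+b}\n"
	out, err := tidy(messy)
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	fmt.Print(out)
}
```

Because the formatter works from the parsed program rather than the raw text, running it on its own output changes nothing, which is what makes it safe to apply mechanically across a large code base.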
What’s the current state of the language, and what are some short term and long term goals?
Russ: The language is basically done for now. We’ve committed to stability and backward compatibility as language features.
Our short-term focus is on improving the implementation and making Go run in more places. We recently converted the compiler from C to Go (mechanically), and now we’re starting to think about adding SSA-based optimizations. In the runtime, the main focus is on implementing a concurrent garbage collector with bounded pauses. We’re also looking at making Go work for all the places that people run code today, from networked servers to mobile devices and everything in between.
In the long term, we want to make sure that Go continues to be a stable, reliable platform for people to get work done. If we keep doing that, we should keep attracting new programmers and growing the community.
What is your view of the value of a PhD when working at a company like Google? Why should people pursue a PhD if they are not going to end up at a University or research lab?
Sameer: Companies like Google value individuals who are self-directed and, in particular, who can think carefully about an underspecified problem, do independent research to find a solution, then create and execute a plan to implement that solution (usually using a team of people). The process of working through a PhD demonstrates many of these skills, but not all of them. In particular, PhD students may not know how to divide up a plan and delegate to teammates, and they might not know how to take a program that works well for small test cases and make it work well at scale in a production environment. These latter skills are often learned on the job or through side projects.
Russ: Fundamentally, learning how to do research is learning how to identify, develop, evaluate, and present new ideas. Most technology companies start with new ideas, so that’s a great fit. I don’t think it’s a coincidence that there are so many companies started by grad school dropouts.
But the new ideas don’t stop there. Computer hardware improves at such an incredible rate that, especially at a company like Google, software engineers have to be exploring new ideas to keep up and make the best use of that hardware. Much of the software development at Google has a research component, and papers Google has published about what were first and foremost development efforts have nonetheless had significant impact on the research literature. For more about that, I suggest Alfred Spector, Peter Norvig, and Slav Petrov’s paper, Google’s Hybrid Approach to Research. And there is significant research at companies across the technology industry, not just at Google.
It is true, however, that research in the technology industry is inherently focused on ideas with practical applications, which makes it somewhat narrower in scope than academic research. There are positive and negative aspects to that.
What are some key lessons you’ve learned (about anything at all!) while working on Go?
Sameer: A small, highly-skilled team with a few talented designers is better than a large team with lots of opinions. A small, useful language with a few powerful features is better than a large language with lots of features. There’s great value in keeping a language small: it makes code much easier to read, write, and maintain, since there are fewer reasonable ways to accomplish a task.
Russ: The most important thing I learned is that a successful programming language is about far more than the language itself. We spent a lot of time on the language definition and implementation, but we’ve spent even more on making it easy to get started with Go, trying to write good documentation, making sure that the right tools are in place for people to collaborate and share code, and cultivating active but respectful discussion lists for users and developers. I’m particularly grateful to all the excellent developers who have joined us in using and working on Go. The contributions from the open source community have been truly amazing!