Alfred Spector Google Inc. azs@google.com
Peter Norvig Google Inc. pnorvig@google.com
Slav Petrov Google Inc. slav@google.com
1 Introduction
In this paper, we describe how we organize Computer Science (CS) research at Google. We focus on how we integrate research and development (R&D) and discuss the benefits and risks of our approach.
The challenge in organizing R&D is great because CS is an increasingly broad and diverse field. It combines aspects of mathematical reasoning, engineering methodology, and the empirical approaches of the scientific method. The empirical components are clearly on the upswing, in part because the computer systems we construct have become so large that analytic techniques cannot properly describe their properties, because the systems now dynamically adjust to the hard-to-predict needs of a diverse user community, and because the systems can learn from vast data sets and large numbers of interactive sessions that provide continuous feedback. We have also noted that CS is an expanding sphere: the core of the field (Theory, Operating Systems, etc.) continues to grow in depth while the field keeps expanding into neighboring application areas.
Research results come not only from universities, but also from companies, large and small. The way that research results are disseminated is also evolving, and the peer-reviewed paper is under threat as the dominant dissemination method. Open source releases, standards specifications, data releases, and novel commercial systems that set new standards upon which others then build are increasingly important.
Comparing our approach to research with that of other companies is beyond the scope of this paper. For reference, however, we note that, in the terminology of Pasteur’s Quadrant [1], we do “use-inspired basic” and “pure applied” (CS) research. References [2] and [3] discuss information technology research generally, pointing out the movement in industrial
References
[1] Donald E. Stokes. Pasteur’s Quadrant: Basic Science and Technological Innovation. Brookings Institution Press, 1997.
[2] Robert Buderi. Engines of Tomorrow: How the World’s Best Companies Are Using Their Research Labs to Win the Future. Simon & Schuster, 2000.
[3] Mark Dodgson, David Gann, and Ammon Salter. The Management of Technological Innovation: Strategy and Practice. Oxford University Press, 2008.
[4] Richard Leifer, Gina O’Connor, and Mark Rice. Implementing radical innovation in mature firms: The role of hubs. The Academy of Management Executive, 15, 2001.
[5] Ellen Enkel, Oliver Gassmann, and Henry Chesbrough. Open R&D and open innovation: Exploring the phenomenon. R&D Management, 39, 2009.
[6] Jakob Uszkoreit, Jay Ponte, Ashok Popat, and Moshe Dubiner. Large scale parallel document mining for machine translation. In Proc. of COLING, 2010.
[7] Charles Reis, Adam Barth, and Carlos Pizano. Browser security: Lessons from Google Chrome. ACM Queue, 7, 2009.
[8] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. In Proc. of OSDI, 2004.
[9] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. Google file system. In Proc. of ACM SIGOPS, 2003.
[10] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured data. In Proc. of OSDI, 2006.
[11] Johan Schalkwyk, Doug Beeferman, Francoise Beaufays, Bill Byrne, Ciprian Chelba, Mike Cohen, Maryam Garrett, and Brian Strope. “Your word is my command”: Google search by voice: A case study. In Amy Neustein, editor, Advances in Speech Recognition. Springer, 2010.
[12] Shumeet Baluja and Michele Covell. Waveprint: Efficient wavelet-based audio fingerprinting. Pattern Recognition, 11, 2008.
Additional references can be found at http://research.google.com/pubs/papers.html