In this one-hour lecture (available as streaming video in Windows Media or Real Video format), Jeff Dean talks about Google’s architecture, why they do things the way they do, and the frameworks they developed for their applications. He also gives a few examples of Google’s current and future projects.
Google is usually pretty secretive about the inner workings of their products; apart from the original papers on PageRank and some engineering papers about the Google File System, little is known about their software. On the other hand, the number of successful projects that have come out of Google in the last few years is astonishing: Groups, Images, Gmail, Blogger, Froogle, Desktop Search, localized search, Scholar, and so on.
Furthermore, these are difficult things to do. Research Index has been trying to do for years what Google Scholar is doing now: an exhaustive search aid for scientific papers, finding every single place where a certain paper is published, extracting authors, dates, references, etc. Research Index always struggled for lack of funding and computing power. For Google, it seems to work quite well. (I find myself using Google Scholar more often than Research Index, because it’s more reliable.)
So what is the secret behind Google’s success? I haven’t been able to work that out. They hire top-notch people (the list of papers by Google employees is very impressive), they have enormous amounts of computing resources (though Google has never published any numbers, estimates range from 70,000–100,000 CPUs) and access to huge datasets (their whole web index, basically), and the employees are encouraged to actually use those resources for their research projects: every employee gets to devote 20% of their time to research, and apparently it’s not unheard of to requisition a cluster of a couple of thousand CPUs for a research project.
Do I want to work there? Hell yes.