I've been working at Google this summer as an intern, which is part of the reason why there haven't been any updates to any of my things (this blog, MicDroid, etc.). They say Google is a place that engineers disappear into and are not heard from again, and that seems somewhat true for me this summer. The other part is simply that my life has become busy, and I've had to make some sacrifices in what I spend my time on.
If there's one thing I've noticed fairly frequently, it's that people are interested in what goes on at Google. Today I'll attempt to talk about some of the things I found interesting.
Please note: these are my personal views. They are not meant to be official announcements of any sort, nor are they in any way endorsed by Google.
Let's begin with what I've learned this summer at Google.
Python: My projects this summer were primarily Python based, with a small amount of C++ on the side. I estimate I wrote around 3 KLOC of Python and around 500 LOC of C++. Needless to say, it's great to learn another language, and for the Python lessons alone, this summer was worth it. One of the nicer things about learning Python at Google is that "correct" behaviors are enforced by the company style guide as well as code reviews. The best part about Python for my project was that it reduced the logic of the program to data structure manipulation.
Scripting ability: Since my project involves managing a fairly large fleet of machines, being able to script fluently using the usual bash tools (awk, grep, sed, etc.) is important to doing my job successfully. A related lesson is to realize when your scripting has gotten out of control. After a certain level of complexity, it is probably better to fold the script functionality into a full-blown program.
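As a purely hypothetical illustration of that tipping point: a pipeline along the lines of `grep | awk | sort | uniq -c` that counts machines per state can be folded into a small Python function once it needs error handling or reuse. The `hostname state` input format and the name `count_states` are assumptions made up for this sketch, not anything from the actual project.

```python
from collections import Counter


def count_states(lines):
    """Count machines per state from 'hostname state' lines.

    The input format is hypothetical and chosen only for illustration.
    Blank lines and lines starting with '#' are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed line: skip it instead of crashing
        counts[fields[1]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "# hostname state",
        "web01 up",
        "web02 down",
        "db01 up",
        "",
    ]
    for state, n in sorted(count_states(sample).items()):
        print(state, n)
```

The point is less the counting itself than that malformed input, comments, and testing now have an obvious place to live.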
Dependencies matter in building large distributed systems: I accidentally triggered a DDoS using the entire fleet of machines because of a set of changes that introduced a dependency on an external service. As it turns out, that external service was unable to handle the load, because a request stream from the entire fleet was beyond its designed usage. Since then I've been a lot more careful in invoking library functionality that may depend on external services.
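One common way to soften this kind of accidental stampede is to spread the fleet's calls out in time and to back off with random jitter on failure. The sketch below is a general technique, not what was actually done for my project; `splay_seconds`, `call_with_backoff`, and all of their parameters are hypothetical names picked for illustration.

```python
import hashlib
import random
import time


def splay_seconds(hostname, window=300):
    """Deterministic per-host start offset in [0, window) seconds.

    Hashing the hostname spreads the fleet's first calls across the
    window instead of having every machine call at the same instant.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call func(), retrying on any exception with jittered backoff.

    The delay cap doubles on each failed attempt (up to max_delay) and
    the actual wait is a random fraction of that cap ("full jitter").
    The last failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)  # random delay in [0, cap)
```

A caller would typically sleep for `splay_seconds(hostname)` once at startup and then wrap each request to the external service in `call_with_backoff`. Neither helps if the service simply cannot serve the fleet's steady-state load, which is a capacity question to settle with the service owners.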
Next let's talk about how things are done at Google.
As expected of the engineering-driven culture at Google, the toolchain for working in the main Google source tree is pretty heavily developed.
Google's build system is fast. This blog post series provides a more in-depth look at it, but the short description is that it caches build output across the entirety of the main source tree to avoid unnecessary compilation. Because of this, the large majority of build outputs are actually retrieved from cache, leaving the builder very little actual work to do. Rebuilding my summer project typically takes less than 10 seconds.
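The caching idea can be sketched as a content-addressed cache: key each build action by a hash of its command and input contents, and reuse the stored output whenever the key matches. This is a toy model to show the principle, not Google's actual build system; `BuildCache` and its methods are hypothetical names, and a real system would also have to account for things like toolchain versions and environment.

```python
import hashlib


class BuildCache:
    """Toy content-addressed cache mapping action keys to outputs."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def action_key(command, inputs):
        """Hash the command plus every input's name and content.

        `inputs` maps file names (str) to file contents (bytes).
        """
        h = hashlib.sha256()
        h.update(command.encode("utf-8"))
        for name in sorted(inputs):
            h.update(b"\0" + name.encode("utf-8") + b"\0")
            h.update(inputs[name])
        return h.hexdigest()

    def build(self, command, inputs, run):
        """Return the cached output if nothing changed, else run the action."""
        key = self.action_key(command, inputs)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        output = run(command, inputs)
        self._store[key] = output
        return output
```

If the cache is shared by everyone building from the same tree, most actions have already been run by someone else, which is how a rebuild can end up being mostly lookups.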
Code search is very powerful. This blog post introducing public Google Code Search mentions that public Code Search is based on an internal tool that is very widely used. Internal Code Search is several orders of magnitude more powerful than the public variant, so much so that I typically browse through code in Chrome rather than the terminal.
Code reviews are required for any code that goes into the main tree (especially for interns!). Google has actually opened up a variant of their code review tool on Google Code. There is also an article and tech talk by Guido van Rossum available. I actually liked Google's tool quite a bit, since I basically tracked my tasks and TODOs through it.
Code style is enforced through code reviews, style guides, and the presubmit process, as mentioned earlier. This leads to code of relatively uniform quality--quite a contrast to many other places I've worked before.
Git at Google. Google is known for being a Perforce company. It turns out there is quite a bit more to the Perforce system at Google than just that though. Also, the engineers have written a Git interface around Perforce, and it works relatively well. This lets me feature branch as much as I want while allowing me to make bugfix branches when necessary. Very useful.
Infrastructure at Google is actually quite interesting too. Having a lot of machines to deal with also means having a good number of tools to deal with them.
Google's VM management system is called Ganeti. It's been open-sourced, and the project page details that it's built around Xen/KVM. Ganeti is a layer above Google's vast datacenter resources. There's a fairly well developed set of tools around making Ganeti programmable as well.
Building and deploying to our machines is done through a combination of a continuous integration system that produces debs, APT, Puppet, and Slack.
A good sysadmin not only needs to know the system inside and out, but also needs to be able to script and program their tools. This is something I really need to work on.
In addition to being known for engineering prowess, Google is also known for being an interesting place to work.
It's typically said the company is opaque to those on the outside, but transparent to those on the inside. Engineers are given access to almost everything, including source, documentation, information (including non-engineering related info), etc about nearly all Google projects. Additionally, there's no particular embargo on talking to engineers working on other projects, so it's not uncommon at all to hear a lot of details about other projects over lunch. There's just a general openness regarding information inside the company.
There's just as much complaining about Google products inside the company as there is outside. Especially with the recent launch of Google+ and the inconsistent real name policy that accompanied it, there has been just as much debate inside the company as there has been outside. So to those of you who have fallen victim to the real name policy, we feel you.
Along with the openness of information, internal public discourse about what the company is doing has many avenues, from mailing lists to groups in the company to all-company meetings. It's great to be a part of the larger discussion concerning what the company is doing.
Google is a very strong proponent of dogfooding. Google employees can (and do) test early builds of almost any Google product before general release. Along with this comes the discussion of future features and products, of which there is quite a bit going on.
At least for the teams I was working with, it felt like Google was very bottom-up, where we work on a small area of the team mission each day, rather than take direction on a specific product or feature. I'm not sure if this is due to my team not being a strict product team, but it sure does feel like each day you get to choose to devote your time to something that furthers the team mission. Personal ownership of what you're working on is also pretty high, at least in the team that I work with. It's definitely not on the same level as a startup, but it's quite a bit nicer than some of the other places I've worked at. Word across the internet says things are changing across Google in this regard, but it didn't affect the teams I was in touch with.
There is a HUGE amount of information inside Google. This information is about nearly everything Google related, from product information, code documentation, tutorials, internal tools, what's for lunch, parking, traveling, personal projects, etc. Not all of it may be up to date, but the information is definitely there. One could waste days just learning about Google history, how Google works, etc by reading documentation and sites inside.
There is a slide in one of the Google Mountain View buildings. A slide. I couldn't find it. Maybe next time.
I'm definitely going to miss working at Google (and not just for the free food!).
I am going to miss the (relatively) consistent code style. Having worked at several places where there was no real style guide, being able to actually READ other people's code was quite pleasant.
The tools (those mentioned above, and some that weren't) make life quite a bit easier, especially the code search. I am seriously going to miss how useful the code search system was. In fact I'm already thinking about a possible system that hooks into Git repositories...
Working on a large distributed system is something I haven't done before (the systems at Supercomputer Center were quite a bit smaller). It's got its own unique issues, but overall it is quite fun to work on.
It's easy to find information at Google (who would have guessed?). Even if what you find is not current, at least it gives you a starting point to get more information. Additionally, there is a LOT of really fascinating information lying around internally. I just wish I had more time to ingest more of it.
There are a few things that I did find irritating while working at Google though. Here are some.
Some common usages of infrastructure require approval. This takes time away from useful implementation time and slows down the overall pace of work. I can understand why they require approval, but sometimes it seems rather arbitrary.
Not all the information is up to date. It's very easy to find old documentation, or deprecated systems. Sometimes there isn't even new documentation yet. This can also make it difficult to find the RIGHT information. Usually the solution here is to ask, but nothing breaks up your workflow like finding out you were using a deprecated or undesired feature.
Not Invented Here syndrome is supposedly one of the big problems Google faces. However, I didn't actually run into it too much (or maybe I don't know of enough outside systems to know I'm running into it). I do think that the existing code/build/repository model makes it difficult to integrate 3rd party code, though.
Overall, Google has been great to me this summer, and I really think I would enjoy working there in the future (provided this post doesn't disqualify me, oops).