July 16th, 2004


Working at Google Labs

There was an ad in the latest Linux Journal for Google Labs, with a puzzle involving a vending machine. The bottom of the page said that if you submit the answer with a resume, your resume goes to the front of the queue. I solved it with a few lines of Perl and submitted the answer that day:
Date: Fri, 16 Jul 2004 11:54:34 -0700
From: labjobs@google.com
To: Brad Fitzpatrick <brad@danga.com>
Subject: Re: [#12106644] vending machine problem

Dear Problem Solver,

Thanks for taking the time to play with our little brain-teaser and for sending us your solution. While you won't find a confirmation of the answer in this email, we did want to let you know that your message was received and that we'll be taking a look at it. If you submitted a resume, we'll be spending some time with that as well. If it seems like there might be a fit with a position we have open, we'll contact you within the next few days.

And as for the answer to the puzzle, we'll post the ads and the solution on our site within a couple of months, after those who are a little slower than you have had a chance to work on it a bit longer. Thanks again for your interest in Google.

The Google Labs Engineering Team
Heh... too bad I didn't send them a resume.

MogileFS and FotoBilder

xb95 is a machine. I outlined how I wanted MogileFS's HTTP transport support to work and he chugged through it all week and it's pretty much complete.

Check out these paths for different actions:

First, some definitions:

Client: web browser or fotobilder client
Perlbal: front-end reverse proxy load balancer & web server
mod_perl: application server
Mogile Tracker: daemons that handle MogileFS "where is this file?" or "where can I put this file?" requests
Mogile Database: the database cluster that holds the Mogile Tracker info (will be MySQL Cluster in production)
mogstored: MogileFS's HTTP interface to the actual files. This is just a tiny, tiny wrapper around Perlbal, so it's really Perlbal with a static configuration (web server mode, allows GET, HEAD, PUT, and DELETE requests).

Uploading a file:

1: client sends to perlbal
2: perlbal sends to a mod_perl
3: mod_perl to app database
4: app database to mod_perl
5: mod_perl asks tracker where to put it
6: tracker contacts database cluster
7: database cluster responds
8: tracker replies to mod_perl
9: mod_perl does a streaming PUT to mogstored while doing a streaming read from perlbal (which in turn does a streaming read from the client, apart from a buffer of 250k or so)
10: mogstored replies HTTP OK once file is safely on disk
11: mod_perl sends a fotobilder protocol OK or an HTTP HTML OK page to perlbal
12: perlbal to client
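
Step 9 is the interesting one: nobody in the chain holds the whole file in memory. Here's a rough Python sketch of that idea, not the actual MogileFS or Perlbal code. The relay function and the 64k chunk size are made up for illustration, and in-memory streams stand in for the perlbal and mogstored sockets.

```python
import io

# Hypothetical chunk size; the real buffer sizes are not stated in this post.
CHUNK_SIZE = 64 * 1024


def relay(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink one chunk at a time.

    Only one chunk is held in memory at any moment, which is the point
    of streaming the upload instead of buffering the whole file.
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for "read from perlbal" and "PUT to mogstored".
upload = io.BytesIO(b"x" * 200_000)
stored = io.BytesIO()
copied = relay(upload, stored)
```

In the real thing the source and sink would be network connections, but the loop has the same shape.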

Downloading a file: (Picture)

1: client to perlbal
2: perlbal to mod_perl
3: mod_perl to app database
4: app database to mod_perl
5: mod_perl to tracker
6: tracker to database
7: database to tracker
8: tracker to mod_perl (with 1-3 internal URLs to the resource)
9: mod_perl to perlbal (with 1-3 internal URLs to resource)
10: perlbal tries 1-3 internal URLs (mogstored) until it finds somebody alive.
11: mogstored streaming to perlbal
12: perlbal streaming to client
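
Step 10 is plain failover: walk the list of internal URLs and use the first one that answers. A minimal Python sketch of that logic, assuming a hypothetical fetch callable that raises OSError when a node is down (the URLs and names below are made up, and this is not how Perlbal is actually written):

```python
def fetch_first_alive(urls, fetch):
    """Try each internal URL in order and return (url, body) from the first live node.

    `fetch` is a hypothetical callable that returns the body for a URL
    or raises OSError (ConnectionError is a subclass) if the node is dead.
    """
    last_error = None
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next replica
    raise RuntimeError("no mogstored node reachable") from last_error


def fake_fetch(url):
    # Test double: any host with "dead" in its name refuses the connection.
    if "dead" in url:
        raise ConnectionError("node down")
    return b"data from " + url.encode()


url, body = fetch_first_alive(
    ["http://dead1/file", "http://alive/file", "http://dead2/file"],
    fake_fetch,
)
```

Since the tracker hands back 1-3 URLs, a single dead storage node just costs one failed connection attempt before the next replica is tried.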

Everything is redundant, and we never tie up the mod_perls, which are memory-heavy and need to do the hard work. Wasting their time pushing bytes around is pointless.