The basic gist of his talk was that there are too many abstraction layers under a modern database and they don't communicate well enough, which costs performance.
Instead of DB : Disk, as it more or less was in the past, we now have DB : OS : Controller : Disk. And each layer does its own caching, scheduling, etc., often accidentally hurting another layer.
So he started off by simply showing how Linux and a single disk react (both latency and throughput) to both sequential and random IO with anywhere from 0 to 4k outstanding requests (using Linux async IO).
The results, as you might imagine, are:
Sequential: constant throughput, regardless of outstanding requests, but increasing latency with more requests
Random: throughput increases with more outstanding requests, but latency increases as well (and quite a bit more)
So there's a sweet spot where you can get better throughput without unacceptable latency. The key is keeping enough requests outstanding that the operating system's block scheduler (which has a halfway accurate view of the disk geometry) can make smart batching decisions.
Next he analyzed MySQL (InnoDB) vs Oracle and showed that InnoDB wasn't aggressive enough at submitting IO to the operating system to keep the disk busy all the time (often the disk was doing nothing).
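To get a feel for why more outstanding random requests help, here's a toy simulation. It's my own sketch, not anything from the talk: it assumes a one-dimensional "disk" of block addresses, a scheduler that always serves whichever pending request is closest to the head, and an application that immediately replaces each finished request with a new random one. The name mean_seek and all the numbers are made up for illustration.

```python
import random

def mean_seek(queue_depth, n_requests=5000, disk_size=1_000_000, seed=1):
    """Average head travel per request in a toy disk model where the
    scheduler always serves the closest of `queue_depth` pending requests."""
    rng = random.Random(seed)
    # Start with `queue_depth` random requests already outstanding.
    pending = [rng.randrange(disk_size) for _ in range(queue_depth)]
    head = 0
    total = 0
    for _ in range(n_requests):
        # Pick the pending request nearest to the current head position.
        i = min(range(len(pending)), key=lambda k: abs(pending[k] - head))
        total += abs(pending[i] - head)
        head = pending[i]
        # The application submits a new random request right away,
        # so the number of outstanding requests stays constant.
        pending[i] = rng.randrange(disk_size)
    return total / n_requests

if __name__ == "__main__":
    for depth in (1, 4, 16, 64):
        print(depth, round(mean_seek(depth)))
```

Shorter average head travel per request is a stand-in for higher random throughput here. The model doesn't measure latency directly, but the flip side is visible in how it works: a request far from the head can sit in the pending list for a long time while closer ones keep getting served first.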
So how to improve InnoDB? The places where InnoDB writes:
-- synchronous requests: log writes (sequential) on transaction commit
-- async writes on page writeback.
But if writeback were more aggressive, it'd increase latency on the synchronous log writes.
In conclusion, he's working to bring back IO priorities, which used to exist in mainframe days (and are in the POSIX async IO interfaces) but are unused in Linux. So he's modifying Linux 2.6's deadline scheduler so that the different async IO priorities get different deadlines, then making InnoDB write more aggressively, but having the async page writeback threads submit low priority IO.
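Here's a rough sketch of the "different priorities get different deadlines" idea. To be clear, this is plain earliest-deadline-first on a heap, not the real Linux deadline scheduler (which, as far as I know, also keeps requests sorted by sector and only jumps the queue when a deadline expires). The class name, the priority names and the deadline values are all hypothetical; the talk's actual numbers aren't in my notes.

```python
import heapq
import itertools

# Hypothetical deadlines (in ms) per priority class, purely illustrative.
DEADLINE_MS = {"high": 5, "low": 500}

class DeadlineQueue:
    """Toy IO queue: each request gets a deadline based on its priority,
    and the request with the earliest deadline is dispatched first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO among equal deadlines

    def submit(self, now_ms, priority, name):
        deadline = now_ms + DEADLINE_MS[priority]
        heapq.heappush(self._heap, (deadline, next(self._seq), name))

    def dispatch(self):
        _deadline, _seq, name = heapq.heappop(self._heap)
        return name

def demo_order():
    q = DeadlineQueue()
    # Writeback threads dump a batch of low priority page writes at t=0...
    for i in range(3):
        q.submit(0, "low", f"writeback-{i}")
    # ...and a transaction commit needs its log write at t=10.
    q.submit(10, "high", "log-write")
    return [q.dispatch() for _ in range(4)]

if __name__ == "__main__":
    print(demo_order())
```

Even though the log write shows up after the writeback requests, its short deadline puts it at the front, so writeback can be submitted aggressively without the commit getting stuck behind it.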
Fucking cool, eh?