After years of running its AdWords PPC advertising engine on MySQL, Google recently replaced MySQL with a new in-house database called F1. AdWords serves many thousands of users, who all share a 100 TB database that handles hundreds of thousands of requests per second and runs SQL queries that scan tens of trillions of data rows per day. The other day I had the pleasure of sitting down to read Google’s research paper (pdf) on F1. The abstract of the paper states:
F1 is a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency. Synchronous replication implies higher commit latency, but we mitigate that latency by using a hierarchical schema model with structured data types and through smart application design. F1 also includes a fully functional distributed SQL query engine and automatic change tracking and publishing.
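To make the "hierarchical schema model" in the abstract a little more concrete, here is a minimal sketch of the idea as I understand it: a child table's primary key starts with its parent's primary key, so sorting by key stores each customer's rows next to each other. This is a toy illustration, not F1's actual storage code; the `Row` type, the helper names, and all the ids are made up for the example, and the table names are only meant to evoke an AdWords-style hierarchy.

```python
from typing import NamedTuple


class Row(NamedTuple):
    table: str
    key: tuple  # primary key; a child key starts with its parent's key


# Hypothetical rows, deliberately out of order.
rows = [
    Row("Campaign", (1, 3)),
    Row("Customer", (2,)),
    Row("AdGroup", (1, 3, 6)),
    Row("Customer", (1,)),
    Row("Campaign", (1, 4)),
    Row("AdGroup", (2, 5, 8)),
    Row("Campaign", (2, 5)),
    Row("AdGroup", (1, 3, 7)),
]


def clustered(rows):
    """Order rows by primary key, which interleaves children under parents."""
    return sorted(rows, key=lambda r: r.key)


def subtree(rows, root_key):
    """Return the root row and all of its descendants, in clustered order."""
    n = len(root_key)
    return [r for r in clustered(rows) if r.key[:n] == root_key]


if __name__ == "__main__":
    for r in clustered(rows):
        # Indent by depth in the hierarchy to show the nesting.
        print("  " * (len(r.key) - 1) + f"{r.table}{r.key}")
```

Because everything belonging to one customer ends up in one contiguous key range, an update that touches a customer and its campaigns stays within that range, which is the kind of locality that helps keep the cost of synchronous commits down.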
F1 could have significant ramifications for database technology in general and big data in particular. I expect F1 to trigger new database initiatives, much as the NoSQL databases did. Technical issues remain, as with any new technology, but the outlook for what F1 can mean for big data and beyond seems bright.
Following Google’s lead with F1, we may start to see F1-style databases appear in the months to come. What if a vendor succeeds with this effort and releases an open source equivalent of F1? What will happen to the NoSQL movement? How will Hadoop technologies be affected? These questions will only get curiouser and curiouser as time in the big data industry rolls on. I’ll keep an eye on this and let you know as it happens!