Google Bigtable. This article gives an overview of Bigtable, Google's solution for storing data.

1. Bigtable

Google uses a storage system called Bigtable. Bigtable is a distributed, persistent, multidimensional sorted map; it is not a relational database. In Bigtable you store values under an index that consists of a row key, a column key and a timestamp. This index points to an uninterpreted array of bytes (a string). Row keys are arbitrary strings of up to 64 KB in size.

(row key: string, column key: string, timestamp: int64) → string

The timestamp part of the key can be assigned by Bigtable or by the application.
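The mapping above can be sketched in a few lines of Python. This is only an illustration of the data model, assuming a plain in-memory dictionary; the class `TinyTable` and its methods `put`, `get` and `scan` are hypothetical names and not part of any Bigtable API.

```python
import time
from typing import Dict, List, Optional, Tuple

# (row key, column key, timestamp) as described in the text
Key = Tuple[str, str, int]


class TinyTable:
    """Hypothetical in-memory stand-in for one table; not Google's API."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: str, column: str, value: bytes,
            timestamp: Optional[int] = None) -> int:
        # The timestamp is either supplied by the application
        # or assigned by the store (here: current time in microseconds).
        if timestamp is None:
            timestamp = time.time_ns() // 1000
        self._cells[(row, column, timestamp)] = value
        return timestamp

    def get(self, row: str, column: str) -> Optional[bytes]:
        # Return the most recent version stored under (row, column), if any.
        versions = [(ts, value) for (r, c, ts), value in self._cells.items()
                    if r == row and c == column]
        return max(versions)[1] if versions else None

    def scan(self) -> List[Tuple[Key, bytes]]:
        # Sorted map: order by row key, then column key, newest version first.
        return sorted(self._cells.items(),
                      key=lambda kv: (kv[0][0], kv[0][1], -kv[0][2]))
```

Each value is an opaque `bytes` object, which mirrors the uninterpreted byte array: the store never looks inside it, and the application decides how to encode and decode it.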

For example, in the Google Webtable (used for Google search) the reversed URL is used as the row key, the column keys name different attributes of the web page, and the timestamp indicates when the data was fetched. The value this key points to is some content from the web page.
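A reversed URL row key can be built as in the following sketch. The helper name `reversed_host_row_key` is hypothetical, and the exact key layout (reversed host name followed by the path) is an assumption for illustration. The usual motive for reversing the host name is that, in a map sorted by row key, pages from the same domain end up next to each other.

```python
from urllib.parse import urlsplit


def reversed_host_row_key(url: str) -> str:
    """Hypothetical helper: reverse the host labels and append the path."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"URL has no host name: {url!r}")
    # "maps.google.com" becomes "com.google.maps"
    host = ".".join(reversed(parts.hostname.split(".")))
    return host + parts.path


if __name__ == "__main__":
    urls = [
        "http://www.example.org/",
        "http://maps.google.com/index.html",
        "http://www.google.com/",
    ]
    # Sorting the keys groups the two google.com pages together.
    for key in sorted(reversed_host_row_key(u) for u in urls):
        print(key)
```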

Bigtable is built on the Google File System, and its data is stored in an immutable data structure called SSTable. The application can define how many versions of an entry, ordered by timestamp, should be kept. Alternatively, the application can specify how long entries should be kept. Bigtable cleans up obsolete data by deleting the SSTables that contain only irrelevant data, using a mark-and-sweep algorithm.
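The two retention rules can be sketched as follows. This only illustrates which versions survive; it is not how Bigtable implements its clean-up, and the function name `collect_garbage` and its parameters `max_versions` and `min_timestamp` are hypothetical. The input dictionary is left untouched and a new one is returned, loosely mirroring the idea that immutable data is rewritten rather than modified in place.

```python
from typing import Dict, List, Optional, Tuple

# (row key, column key, timestamp)
Key = Tuple[str, str, int]


def collect_garbage(cells: Dict[Key, bytes],
                    max_versions: Optional[int] = None,
                    min_timestamp: Optional[int] = None) -> Dict[Key, bytes]:
    """Return a new map that keeps only the versions allowed by the rules."""
    # Group the timestamps of all versions by (row, column).
    by_cell: Dict[Tuple[str, str], List[int]] = {}
    for row, column, ts in cells:
        by_cell.setdefault((row, column), []).append(ts)

    kept: Dict[Key, bytes] = {}
    for (row, column), stamps in by_cell.items():
        stamps.sort(reverse=True)  # newest first
        if max_versions is not None:
            stamps = stamps[:max_versions]  # keep only the newest n versions
        for ts in stamps:
            # Drop versions older than the cut-off, if one is given.
            if min_timestamp is None or ts >= min_timestamp:
                kept[(row, column, ts)] = cells[(row, column, ts)]
    return kept
```

For example, `collect_garbage(cells, max_versions=2)` keeps the two newest versions of every cell, while `collect_garbage(cells, min_timestamp=t)` keeps only versions written at or after `t`.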

For more information on Bigtable, see the Google white paper on Bigtable.

2. Links and Literature