MEMSQL - insideBIGDATA

MemSQL combines lock-free data structures with just-in-time (JIT) compilation to process highly volatile workloads. More specifically, MemSQL implements lock-free hash tables and lock-free skip lists in memory for fast random access to data. SQL queries sent to the MemSQL server are converted into byte code and compiled through LLVM into machine code.^[5] Queries are then stripped of their parameters, and the resulting query template is stored as a shared object that is matched against subsequent incoming queries. Executing pre-compiled query plans removes interpretation from hot code paths, which minimizes the number of central processing unit (CPU) instructions required to process SQL statements.
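The parameter-stripping step can be illustrated with a short Python sketch. This is not MemSQL's implementation: the regular expression, the `parameterize` function, and the `PlanCache` class are hypothetical names chosen for illustration, and the "compiled plan" is a plain Python closure standing in for generated machine code.

```python
import re

# Hypothetical literal pattern: single-quoted strings or standalone numbers.
# A real SQL parser handles far more cases; this is only an illustration.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with '?' and return (template, extracted parameters)."""
    params = []

    def _swap(match):
        params.append(match.group(0))
        return "?"

    template = _LITERAL.sub(_swap, sql)
    return template, params


class PlanCache:
    """Caches one 'compiled' plan per query template (illustrative only)."""

    def __init__(self):
        self._plans = {}
        self.compilations = 0

    def _compile(self, template):
        # Stand-in for code generation: count the work and return a callable.
        self.compilations += 1
        return lambda params: (template, tuple(params))

    def execute(self, sql):
        template, params = parameterize(sql)
        plan = self._plans.get(template)
        if plan is None:
            # First time this query shape is seen: compile and remember it.
            plan = self._compile(template)
            self._plans[template] = plan
        return plan(params)
```

In this sketch, two queries that differ only in their literal values map to the same template, so the second call reuses the cached plan instead of compiling again.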

MemSQL is wire-compatible with MySQL.^[6] This means that applications can connect to MemSQL through MySQL clients and drivers, as well as standard Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) connectors.^[7]

In addition to MySQL syntax and functionality, MemSQL can store columns in JSON format and supports geospatial data types and operations.

MemSQL can store database tables either as rowstores or columnstores. The format used is determined by the user at DDL time (i.e. when the table is created). Data for all rowstore tables is stored completely in-memory, with snapshots and transaction logs persisted to disk. Data for all columnstore tables is stored on-disk, using a rowstore-like structure to handle incoming inserts into the columnstore.

Rowstore and columnstore tables differ in more than just the storage medium used. Rowstores, as the name implies, store information in row format, which is the traditional data format used by relational database management systems (RDBMS). Rowstores are optimized for singleton or small insert, update, or delete queries and are most closely associated with OLTP (transactional) use cases. Columnstores are optimized for complex select queries, typically associated with OLAP (analytics) use cases. As an example, a large clinical data set for data analysis is best stored in columnar format, since queries run against it will typically be ad hoc queries where aggregates are computed over large numbers of similar data items.
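The difference between the two layouts can be sketched in a few lines of Python. The clinical field names and values below are invented for illustration, and plain Python lists and dictionaries stand in for the actual storage formats.

```python
# Row layout: each record is stored together, keyed by its identifier.
# Good for fetching or updating one record at a time (OLTP-style access).
row_store = {
    1: {"patient_id": 1, "age": 34, "systolic": 120},
    2: {"patient_id": 2, "age": 58, "systolic": 140},
    3: {"patient_id": 3, "age": 46, "systolic": 130},
}


def to_columns(rows):
    """Pivot an iterable of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column layout: all values of one attribute are stored together.
# Good for aggregates that touch one column across many records (OLAP-style).
column_store = to_columns(row_store.values())


def average(columns, name):
    """Aggregate over a single column without touching the other columns."""
    values = columns[name]
    return sum(values) / len(values)


single_patient = row_store[2]                           # one lookup, whole record
mean_systolic = average(column_store, "systolic")       # one column, all records
```

The point lookup reads one complete record, while the aggregate reads only the `systolic` values and never touches `age` or `patient_id`.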

Durability for the in-memory rowstore is implemented with a write-ahead log and snapshots, similar to checkpoints. With default settings, as soon as a transaction is acknowledged in memory, the database will asynchronously write the transaction to disk as fast as the disk allows.^[9]
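A minimal Python sketch of this scheme follows, assuming a simple key-value model. The `DurableStore` class and its method names are hypothetical, in-memory lists and dictionaries stand in for the log and snapshot files on disk, and an explicit `flush` call stands in for the asynchronous background writer.

```python
class DurableStore:
    """Illustrative write-ahead log plus snapshot; not MemSQL's actual design."""

    def __init__(self):
        self.memory = {}         # live in-memory table
        self.pending = []        # acknowledged writes not yet on "disk"
        self.disk_log = []       # stand-in for the transaction log file
        self.disk_snapshot = {}  # stand-in for the latest snapshot file

    def commit(self, key, value):
        # The write is acknowledged once it is applied in memory.
        self.memory[key] = value
        self.pending.append((key, value))

    def flush(self):
        # Stand-in for the asynchronous writer draining pending log records.
        self.disk_log.extend(self.pending)
        self.pending.clear()

    def snapshot(self):
        # Persist the full state, after which older log records are not needed.
        self.flush()
        self.disk_snapshot = dict(self.memory)
        self.disk_log.clear()

    def recover(self):
        # Rebuild state after a restart: load the snapshot, then replay the log.
        state = dict(self.disk_snapshot)
        for key, value in self.disk_log:
            state[key] = value
        return state
```

Under these assumptions, a write that was acknowledged but not yet flushed is absent from the recovered state, which is the trade-off implied by writing to disk asynchronously.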

The on-disk columnstore is fronted by an in-memory rowstore-like structure (a skip list). This structure has the same durability guarantees as the MemSQL rowstore. Apart from that, the columnstore is durable because all of its data is stored on disk.
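The idea of a row-oriented buffer in front of columnar storage can be sketched as follows. This is a simplified illustration rather than MemSQL's design: the `BufferedColumnstore` class, the `flush_threshold` parameter, and the segment format are assumptions, and a plain list replaces the skip list.

```python
class BufferedColumnstore:
    """Row-oriented insert buffer in front of column-oriented segments."""

    def __init__(self, column_names, flush_threshold=3):
        self.column_names = list(column_names)
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.buffer = []    # recent inserts, stored row by row in memory
        self.segments = []  # stand-in for on-disk columnar segments

    def insert(self, row):
        # Small inserts are cheap: append one row to the buffer.
        self.buffer.append(tuple(row))
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Pivot the buffered rows into one list per column and store the segment.
        if not self.buffer:
            return
        segment = {
            name: [row[i] for row in self.buffer]
            for i, name in enumerate(self.column_names)
        }
        self.segments.append(segment)
        self.buffer = []

    def scan(self, name):
        # A column scan reads flushed segments first, then the buffered rows.
        index = self.column_names.index(name)
        for segment in self.segments:
            yield from segment[name]
        for row in self.buffer:
            yield row[index]
```

Readers see both flushed and buffered data, so an insert is visible to scans before it has been converted to columnar form.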

Address

MemSQL Headquarters

534 Fourth St.
San Francisco, CA 94107
