Making memory and databases work together - it's a tough ask

There's a debate going on about the best way to handle databases in the cloud - could we end up going down the wrong path?

In-memory database analytics is a hot topic right now. This statement doesn't need an 'arguably' or an 'allegedly' caveat to preface it; we can take it as gospel truth.

So then, the question on everybody's lips is whether in-memory database analytics technology (usually targeted at big data sets) can migrate to the cloud, still integrate with a level of disk-based systems and compute alongside a variety of next-generation computing methodologies (from Agile programming to parallelism and concurrency).

OK, if it's not on your lips, then why isn't it? In-memory meets multi-core processing with cloud and on-premise disk integration challenges... what's not to like?

But will all these elements gel together or will there always be a distinction (and therefore a choice to be made) based on what types of data processing we need to execute and where? Will more traditional DataBase-as-a-Service (DBaaS) layers fail to integrate productively with on-premise in-memory database engines – or, even, in-memory in the cloud?

The potential disconnect here arises if the cloud DBaaS vendors go ploughing ahead with their offerings without considering every data type and every data processing need inside this equation.

Consider ObjectRocket then. The firm recently launched a MongoDB database-as-a-service specifically architected so that each instance is backed by a pure solid-state disk. This (says the company) is in order to cater for “massive I/O” and it resides on multiple redundant pieces of infrastructure.

So that’s massive I/O, as a service, in the cloud, from disk (albeit solid state and therefore with less latency than traditional disks) but not as an in-memory proposition then.

An optional existence 
In database land this is known as “database sharding”, where data sets are broken down into separate chunks and dispatched off to reside on individual distributed servers. Databases here are replicated and optionally exist in multiple geo-diverse datacentres. In the case of ObjectRocket, the firm uses AWS Direct Connect to ensure that low latency and free bandwidth exist for AWS customers.
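
For readers who have not met sharding before, here is a minimal sketch of the idea in Python. It assumes a simple hash-based placement scheme; the names `shard_for`, `distribute` and the server labels are invented for illustration, and nothing here is claimed to reflect how ObjectRocket or MongoDB actually assign data to shards.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash (illustrative scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(records: dict, servers: list) -> dict:
    """Break a data set into chunks, one chunk per server."""
    placement = {server: {} for server in servers}
    for key, value in records.items():
        server = servers[shard_for(key, len(servers))]
        placement[server][key] = value
    return placement


# Hypothetical server names and sample records, for illustration only.
servers = ["shard-a", "shard-b", "shard-c"]
records = {f"user:{i}": {"id": i} for i in range(12)}
placement = distribute(records, servers)

for server, chunk in placement.items():
    print(server, sorted(chunk))
```

Each key always lands on the same server, so a lookup only has to touch one chunk. Real systems layer replication and rebalancing on top of this, which the sketch leaves out.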

But as much as these solutions provide the much-lauded benefits of so-called “effortless scaling” and useful efficiency services such as automatic detection of long-running queries, there appears to be confusion (if not consternation) as to what form of database we should use where and for what.

Specifically, I am talking here about the choice we can now make: whether we place OnLine Analytical Processing (OLAP) into in-memory environments for magnificent speed and/or whether we keep OnLine Transactional Processing (OLTP) closer to disk-based systems, sharded or otherwise.

In-memory processing is arguably the precision-engineered formula of choice when presented on the right chipset architecture and directed at big data analytics. But memory is still a lot more expensive than disk, so where transactional workloads (or more mundane tasks like hosting a portal) are concerned, the option to compute more cost-efficiently on disk remains attractive.
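
The trade-off just described can be written down as a rough placement rule. The Python sketch below is only one way to frame it: the `Workload` type, the `choose_tier` function, the workload names and every size and budget figure are hypothetical placeholders, not vendor guidance or measured data.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    kind: str       # "OLAP", "OLTP", or anything else (e.g. "portal")
    size_gb: float  # working data set size; placeholder figures below


def choose_tier(workload: Workload, memory_budget_gb: float) -> str:
    """Send analytics to memory when it fits the budget; everything else to disk."""
    if workload.kind == "OLAP" and workload.size_gb <= memory_budget_gb:
        return "in-memory"
    return "disk"


# Illustrative workloads and an illustrative memory budget.
workloads = [
    Workload("sales-analytics", "OLAP", 200),
    Workload("order-entry", "OLTP", 50),
    Workload("clickstream-archive", "OLAP", 2000),
    Workload("staff-portal", "portal", 5),
]

for w in workloads:
    print(w.name, "->", choose_tier(w, memory_budget_gb=512))
```

A rule this blunt would send every transactional workload to disk, which is exactly the assumption that vendors pitching in-memory for both OLAP and OLTP are challenging.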

Quite apart from the above trade-off, we then need to analyse which elements of database workload we place in the cloud (public or private).

As we know, SAP is pushing hard for in-memory adoption across the board and says that both OLAP and OLTP have a place at the table when it comes to the HANA platform. ObjectRocket, on the other hand, is also all about speed (they put the word “rocket” in the name, remember) and the firm’s technology is engineered to isolate core CPU, memory and I/O so that individual users’ performance is never impacted, even during levels of peak network activity.

We could sit on the fence here and say that it’s early days and that customers and their adoption curves will tell us more in eighteen months’ time. But that may be too late if the wrong data types have been directed at the wrong processing engines.

Cloud database processing is supposed to be getting easier, but it may feel like it’s more complicated in the short term.
