Hot data needs kid cloud oven gloves

It's an interesting metaphor, but should we really be talking about data in terms of temperature?

It’s like Complex Event Processing (CEP) and Artificial Intelligence (AI) never happened. Yes, we’ve had fast-moving sensitive data before, but this is something deeper and somewhat closer to the motherboard by all accounts.

What am I talking about? It’s the rise of vendor chatter (or VC, if you like the acronym suggestion) relating to real-time data analytics in the cloud. That chatter is almost always followed by the term “operational insight” and often allied to the benefits of in-memory processing, and now we start to hear companies referring to data at different temperature grades.

But this is still early days: a web search for the term “hot data” will mostly provide you with a list of hot topics, top tens or indeed something less savoury.

So what do we mean by hot data and how does it differ from warm and cold data?

Yes, hot data is current: perhaps less than 30 days old, or even less than 24 hours old, depending on the defined environment. But it goes further than that, as hot data management solutions should also be able to determine whether secondary levels of analysis are required, based upon the first round of sampling and “insight” (oh dear, there’s that word again) carried out.
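
To make the age-based part of that definition concrete, here is a minimal sketch in Python. The 24-hour and 30-day cut-offs are borrowed from the figures above, but how they map onto "hot", "warm" and "cold" is purely an assumption for illustration, as is the `temperature` function name; a real environment would define its own tiers.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cut-offs: each environment would set its own.
HOT_MAX_AGE = timedelta(hours=24)
WARM_MAX_AGE = timedelta(days=30)


def temperature(last_updated, now=None):
    """Classify a record as hot, warm or cold purely by its age."""
    now = now or datetime.now(timezone.utc)
    age = now - last_updated
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"
```

Age alone is only the starting point, of course; the second half of the definition (deciding when further analysis is needed) is not covered by this sketch.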

Achieving this live hot data Analytics-as-a-Service in the public cloud requires two things.

  1. First, we must have the programming intelligence to understand what metrics we need to place upon our hot data, i.e. what frequency and percentage change we should be analysing for, and what actions we should take when data changes by the prescribed amount. We have this know-how, but it’s far from a given.
  1. Secondly, we need the integration link. We have tools to run analytics in the public cloud, so this is great news. We also have tools to move selected chunks of hot data into the cloud when we want to. But (and this is a view put forward by Nicos Vekiarides who is CEO of data protection company TwinStrata) we do not yet have the ability to deliver both as an integrated service.
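
The first of those two requirements can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anyone's product: the function names, the 10% default threshold and the "escalate"/"ignore" actions are all hypothetical stand-ins for whatever metrics and actions a team actually defines.

```python
def percent_change(previous, current):
    """Percentage change from the previous reading to the current one."""
    if previous == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (current - previous) / abs(previous) * 100.0


def check_metric(previous, current, threshold_pct=10.0):
    """Return the action to take for one pair of readings.

    threshold_pct is a hypothetical 'prescribed amount'; "escalate" stands
    in for handing the data on to a second round of analysis.
    """
    change = percent_change(previous, current)
    if abs(change) >= threshold_pct:
        return "escalate"
    return "ignore"
```

How often `check_metric` runs is the "frequency" half of the question, and it is deliberately left out here: that depends on the scheduler or stream processor in use.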

But this integration is coming, maybe as soon as 2013. For developers, this means a good deal of hand-dirtying tinkering with workload definitions, architecting of metrics and a healthy dose of ‘buffer pool management’ as we move (once again) to the in-memory analytics side of the street.

For chief information officers (if we get this process right) this means that the duck appears serene on the surface… while the legs truly are thrashing away beneath the water. CIOs not only get the insight that they crave, they also get a chance to correlate contextual information across network paths (inside any given system installation) and use the resulting data to address issues such as application vulnerability management.

Indeed, you wouldn’t expect a dangerous sounding term like hot data not to be hazardous would you?

Skybox Security VP Justin Coker now predicts that we will see a “dramatic expansion of the attack surface” in this new world of big data, mobile data and, above all, hot data manipulation.

“[We need] to take a big data approach to security assessment – collecting huge amounts of data and applying new predictive analysis tools to identify risks and breach traces in real time. In 2013 and later years, this approach will become more methodological.”

The question today may hinge on who takes responsibility for handling hot data, if it does exist. Should it be the development team leader, the IT asset manager, the chief security officer, the lead database administrator or some contextual analysis specialist?

Whoever it is, they need to wear kid cloud oven gloves, and we must avoid a too-many-cooks scenario at all costs.
