
The Road to AIOps is Built on Intelligent Infrastructure

As technology professionals, we hardly need to be told that artificial intelligence (AI) is reshaping our world and the ways in which we interact with it.

In the area of infrastructure operations, AI promises to increase agility, reduce overhead, and free us from the burden of watching lights turn from green to amber to red and the feverish tumult of figuring out how to make them green again.

A new and promising application of AI in the datacenter, AIOps, aims to solve these universal challenges.

The premise for AIOps is relatively simple, in theory…

  • We collect all of the log data generated by the various components in our technology stack(s) and we put it all in one place.
  • We can then arrange that data so as to understand it contextually.
  • And then we apply some machine learning techniques to that data so that over time and with a lot of repetition, we can build models or signatures for conditions that threaten those wonderful green lights that we love so much.

Building and then recognizing these signatures will allow us to resolve conditions, at first by triggering escalations to human operators and eventually by plugging directly into infrastructure automation frameworks to remediate the conditions without a human in the loop.
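To make the three steps and the first-stage escalation a little more tangible, here is a minimal Python sketch. Everything in it is an assumption for illustration: the record layout, the component names, the sample numbers, and the function names are all hypothetical, and a simple per-component baseline (mean and standard deviation of errors per minute) stands in for the "machine learning" step. A real AIOps platform would use far richer models.

```python
from collections import defaultdict
from statistics import mean, pstdev

def errors_per_minute(records):
    """Steps 1 and 2: pool log records and arrange them by component and minute.

    Each record is a hypothetical (minute, component, level) tuple.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for minute, component, level in records:
        if level == "ERROR":
            counts[component][minute] += 1
        else:
            counts[component][minute] += 0  # still register a quiet minute
    return {c: [per_min[m] for m in sorted(per_min)] for c, per_min in counts.items()}

def learn_baseline(history):
    """Step 3 (stand-in): learn what 'normal' looks like for each component."""
    return {c: (mean(v), pstdev(v)) for c, v in history.items()}

def detect(baseline, current, threshold=3.0):
    """Flag components whose current error count is far above their baseline."""
    alerts = []
    for component, count in current.items():
        mu, sigma = baseline.get(component, (0.0, 0.0))
        # Assumption: floor sigma at 1.0 so very quiet components do not over-alert.
        score = (count - mu) / max(sigma, 1.0)
        if score > threshold:
            alerts.append((component, score))
    return alerts

def escalate(alerts):
    """First stage: hand the condition to a human operator."""
    return [f"ESCALATE {c}: {s:.1f} sigma above baseline" for c, s in alerts]

# Hypothetical training window: six quiet minutes for two components.
records = []
for minute, (storage_errs, network_errs) in enumerate([(1, 0), (0, 1), (2, 0), (1, 0), (0, 1), (1, 0)]):
    records += [(minute, "storage", "ERROR")] * storage_errs
    records += [(minute, "network", "ERROR")] * network_errs
    records += [(minute, "storage", "INFO"), (minute, "network", "INFO")]

baseline = learn_baseline(errors_per_minute(records))
alerts = detect(baseline, {"storage": 12, "network": 1})
for line in escalate(alerts):
    print(line)
```

In the later, autonomous stage described above, `escalate` would be replaced by a call into an automation framework rather than a message to an operator.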

The result will be autonomous resolution of infrastructure issues in the datacenter with minimal human intervention.

As Field CTO at Tintri by DDN, I find the prospects of the AI-driven datacenter exciting, and they highlight what I think is a very important gap.

The AI-driven datacenter will require smarter infrastructure.

Before we delve into the specific challenges of the AI-driven datacenter, I want to contextualize the need for smarter infrastructure by exploring another space where AI is pushing the limits of legacy technology: autonomous vehicles.

Without diving too deep into all of the facets that make up the autonomous vehicle space, I don’t think it’s a stretch to say that autonomous vehicles are engineered as appliances: software engineered to take advantage of purpose-built hardware and vice-versa.

The result is a tightly coupled marriage through which AI-driven autonomous transportation is possible.

Without this tight relationship between software and purpose-built hardware, the goal is, quite simply, unattainable.

For example, as much as you love your 1990 Ford F-150 pickup, no amount of software will allow it to drive itself. Your trusty pickup is old tech that was never engineered with self-driving as a possibility, and any attempt to augment its outdated technology would pale in comparison to a purpose-built pickup from…let’s say Tesla.

So…as we return to the datacenter, this example still holds true.

While virtualization has forced compute and networking technologies to abstract and gain the benefit of granular telemetry, insights, and actions, storage has been the laggard.

The dominant storage technologies in the datacenter still revolve around LUNs and volumes. While they’ve upped their game in instrumentation and telemetry, new frameworks like AIOps require that those insights become directly actionable.

The problem with LUNs is that while the rest of the infrastructure understands what a VM is and can derive data points about VMs and impose policies on them, a LUN has no such understanding, so it cannot provide data at the right level of granularity. As a result, any insight derived from that data is incomplete and unactionable.

At Tintri, we say: “If you had a big tent with 49 average Joes and 1 Bill Gates, and if you could only measure their wealth from the outside, you could only assume there were 50 billionaires in that tent.”

Limited insight is NOT a pillar on which we can build the AI-driven datacenter.
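The tent analogy can be worked through with a few lines of Python. The numbers and names here are made up purely for illustration: 50 hypothetical VMs share one LUN, 49 of them doing 100 IOPS each and one doing 50,000. The function names and the "10x the median" rule are also assumptions, not a description of any product's logic.

```python
from statistics import median

# Hypothetical workload: 49 quiet VMs and one very busy VM on the same LUN.
vm_iops = {f"vm-{i:02d}": 100 for i in range(1, 50)}
vm_iops["vm-50"] = 50_000

def lun_view(per_vm):
    """What LUN-level telemetry can report: one total and an implied average."""
    total = sum(per_vm.values())
    return total, total / len(per_vm)

def vm_view(per_vm, factor=10):
    """What VM-level telemetry can report: which VMs stand out from their peers."""
    typical = median(per_vm.values())
    return [name for name, iops in per_vm.items() if iops > factor * typical]

total, average = lun_view(vm_iops)
print(f"LUN view: {total} IOPS total, {average} IOPS per VM on average")
print(f"VM view: outliers = {vm_view(vm_iops)}")
```

From the outside of the tent, every VM appears to be doing about 1,098 IOPS, which describes none of them. Only the per-VM view identifies the single workload that an operator, or an AIOps engine, would actually need to act on.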

There’s some irony in the fact that software-defined compute allowed physical machines to be abstracted into virtual machines, and software-defined networking abstracted physical ports into virtual ports that allowed networking policies to follow those virtual machines across a virtual fabric…and then software-defined storage just lets you build old-school LUNs from commodity hardware. No virtual machine context. No better telemetry. No better insights. No better actionability.

In closing, I believe the path forward is clear. AI is coming for the amber and red lights in your datacenter and we’ll need smarter infrastructure to empower this advancement in operational efficiency.

Faster LUNs are not the answer. More telemetry at a LUN level is not the answer. Cloud-based AI to crunch through trillions of LUN-level data points will not be good enough.

We need storage that natively understands what virtual machines are, and that also natively understands objects like databases, container volumes, and whatever other stateful constructs support the enterprise datacenter.

Understanding these objects will empower intelligent infrastructure to react to the needs of the application in real time without the complexity and variance of human operators.

The future demands nothing less.

 
