IDC

March 2022

End-to-End AI Is Within Reach

Eric Burgener

Research Vice President, Infrastructure Systems, Platforms, and Technologies

Peter Rutten

Research Vice President, Infrastructure Systems, Platforms, and Technologies Group, Performance-Intensive Computing Solutions Global Research Lead

IDC: White Paper
Sponsored by: Dell Technologies

Delivering critical insights at scale requires end-to-end, real-time intelligence across an organization and its ecosystem. Data needs to stream freely, and where it naturally settles in a storage environment after having been leveraged for insights, it then needs to be joined by compute to perform more analysis, not the other way round.   

IDC Opinion

Over 90% of enterprises have already embarked on digital transformation, the transition to more data-centric business models. While information technology (IT) organizations will maintain some legacy workloads when they make this transition, they will also be adding many more next-generation applications that are being developed and deployed specifically to meet the requirements of the new digital era. Big data analytics applications leveraging artificial intelligence (AI) will drive better business insights, fueled by the massive amounts of data that enterprises will be collecting from their products and services, employees, internal operations, and partners going forward. These data volumes, delivered with an increasingly real-time orientation, demand performance, availability, and scalability that legacy IT infrastructure will be hard-pressed to meet.

AI is made up of several different workloads, each of which generates a different I/O profile and has different storage requirements. To make the most effective use of AI-driven big data analytics, enterprises will need to create an “end to end” AI strategy that is well integrated across three different deployment models — from edge to core datacenter to cloud. Because of the many new requirements of this hybrid, multicloud strategy, almost 70% of IT organizations will be modernizing their IT infrastructure over the next two years. IDC has released the “artificial intelligence plane” model to help customers better understand how to create the right ecosystem to maximize the contribution AI-driven workloads deliver. The underlying storage infrastructure is a key component in that model, and it is already clear from end-user experiences over the past several years that legacy architectures will generally not provide the right foundation for long-term AI success.

To help its customers succeed with AI, Dell Technologies has put together Dell Technologies Validated Designs for AI. These engineering-validated stacks make it easy for enterprises to buy, deploy, and manage successful AI projects, offering not only the underlying IT infrastructure but also the expertise to create optimized solutions that drive real business value. With its broad IT infrastructure portfolio, including compute, storage and networking resources, and AI ecosystem partnerships, the vendor can bring the right resources together with an end-to-end AI focus that drives competitive differentiation for its customers.

Compute Requirements

Different AI use cases require different compute approaches. For example, image recognition and natural language processing (NLP) are very compute intensive, both when training the AI model and when putting it into production. A compute approach that is gaining a lot of traction is the convergence of three workloads (data analytics, AI, and modeling and simulation) onto one infrastructure design, which IDC calls performance-intensive computing (PIC). PIC borrows many elements from what is generally known as high-performance computing (HPC), but it puts more emphasis on parallelism: AI workloads must be partitioned into many smaller chunks and distributed in parallel across compute resources within the server, between servers in the form of clusters, and between clusters across datacenters or clouds.
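The partition-and-distribute pattern described above can be sketched in a few lines of Python. This is a minimal illustration only, not a PIC reference design: the thread pool stands in for cores, servers, or clusters, and `partition`, `process_chunk`, and `run_parallel` are hypothetical names introduced for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, num_chunks):
    """Split items into num_chunks contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Hypothetical per-chunk work: a sum of squares stands in for real analysis."""
    return sum(x * x for x in chunk)


def run_parallel(items, num_workers=4):
    """Partition the workload, process the chunks concurrently, combine the results."""
    chunks = partition(items, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


total = run_parallel(list(range(1000)))
print(total)
```

The same shape (split, process independently, combine) applies whether the workers are accelerator cores inside one server or clusters in different datacenters; what changes is the cost of moving each chunk to its worker.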

There are roughly five categories of compute for running AI: workstations, servers (scale-up systems, converged systems, and hyperconverged systems), clusters, supercomputers, and quantum computing. Except for quantum computing (which we exclude here because of its nascency), these systems can be accelerated with graphics processing units (GPUs), intelligence processing units (IPUs), field-programmable gate arrays (FPGAs), or application-specific integrated circuits (ASICs) that deliver in-processor parallelization with thousands of cores, or they can depend solely on the host CPUs. While most data processing today takes place on server CPUs, there are opportunities to offload some tasks to data processing units (DPUs) or SmartNICs as well.

Data Persistence Requirements

Storage requirements vary with the type of AI workload and with the stage in the AI data pipeline. Each stage in the pipeline has a different I/O profile, and this wide range of storage requirements may drive the need for a different storage system or “silo” for each stage. A more flexible storage platform that can accommodate a wider range of I/O profiles may allow enterprises to consolidate storage workloads in the AI data pipeline on fewer systems, a capability that minimizes data movement requirements and can save considerable costs in both infrastructure and administration. Note that most data driving AI workloads over the next five years will be unstructured, so storage platforms that support file- and/or object-based storage are most relevant.

The AI Plane

To address AI requirements and dimensions, server OEMs, storage OEMs, and cloud service providers have architected ever more sophisticated AI infrastructure stacks. IDC has studied these various efforts and synthesized a formalized AI infrastructure plane from them, also referred to as the “AI plane” (“AIP”). The AIP can serve as a framework for organizations to assess the completeness of their AI environment based on commercial and/or open-source components, particularly if they aim to deploy an end-to-end environment for AI. A generic AIP consists of three layers — the application plane, the data plane, and the control plane:

  • The Application Plane. Below the AI workloads layer are tools that include IDE/workflow tools for easier code and model development (a popular workflow tool is Jupyter); application deployment frameworks for building AI applications, including libraries, SDKs, or reference code; application optimizer tools such as Kubeflow that help deploy, scale, and manage AI models; and model libraries that serve as a marketplace for prebuilt ML models (e.g., NVIDIA GPU Cloud [NGC] or Google ML Marketplace).
  • The Data Plane. This is the infrastructure layer that provides storage resources to enable the AI pipeline. It consists of data management, persistent data access, and data stores/repositories.
  • The Control Plane. This refers to the compute resources that enable the AI pipeline; it includes computing platform selection, compute resources, and software-defined infrastructure.
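One way to use the AIP as an assessment framework is to treat it as a checklist and look for gaps. The sketch below is illustrative only: the component identifiers paraphrase the three planes listed above, and `AI_PLANE`, `find_gaps`, `coverage`, and the example environment are hypothetical names and data, not part of any IDC tool.

```python
# Component names paraphrase the three planes described above; the
# identifiers themselves are hypothetical labels chosen for this sketch.
AI_PLANE = {
    "application": [
        "ide_workflow_tools",
        "deployment_frameworks",
        "optimizer_tools",
        "model_libraries",
    ],
    "data": [
        "data_management",
        "persistent_data_access",
        "data_stores",
    ],
    "control": [
        "platform_selection",
        "compute_resources",
        "software_defined_infrastructure",
    ],
}


def find_gaps(deployed):
    """Return, per plane, the listed components that are not in `deployed`."""
    gaps = {}
    for plane, components in AI_PLANE.items():
        missing = [c for c in components if c not in deployed]
        if missing:
            gaps[plane] = missing
    return gaps


def coverage(deployed):
    """Fraction of all listed components that are present (0.0 to 1.0)."""
    total = sum(len(components) for components in AI_PLANE.values())
    present = sum(
        1
        for components in AI_PLANE.values()
        for c in components
        if c in deployed
    )
    return present / total


# Hypothetical environment used only to exercise the functions.
example_environment = {
    "ide_workflow_tools",
    "model_libraries",
    "data_stores",
    "compute_resources",
}
print(find_gaps(example_environment))
print(coverage(example_environment))
```

A real assessment would weigh components by importance and maturity rather than counting them equally; the flat checklist is simply the smallest structure that makes missing pieces visible.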

End-to-End AI

Because of the increasing maturity of the AI plane for all locations — edge, core, and cloud — there is an emerging opportunity to not only develop and deploy AI near the data but also connect these locations with low-latency, high-bandwidth networks and only move data that is used for reads/writes by an AI model across these networks. This way, the bulk of the data can always remain in place, regardless of where it is being processed.

The benefits of leaving data in place are substantial. Data movement is minimized, reducing infrastructure, administrative, network bandwidth, and potential egress costs (when public clouds are in use). The time to move through the stages of the AI data pipeline to get to better business decisions is shorter. And the data security and/or governance implications of moving the data are minimized or entirely avoided.
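Rough arithmetic makes the data movement argument concrete. The sketch below compares the time to move an entire data set with the time to move only the portion a model actually reads or writes. Every figure in it (data set size, working set size, link speed) is a hypothetical value chosen for illustration, and `transfer_hours` is a name introduced here; protocol overhead and contention are ignored.

```python
def transfer_hours(size_bytes, link_bits_per_second):
    """Hours needed to push size_bytes over a link, ignoring protocol overhead."""
    return (size_bytes * 8) / link_bits_per_second / 3600


# All figures below are hypothetical, chosen only to illustrate the ratio.
TB = 10**12
full_dataset = 500 * TB   # everything held at the site where data settled
working_set = 2 * TB      # only what the model reads or writes
link = 10 * 10**9         # a 10 Gbit/s network link

move_everything = transfer_hours(full_dataset, link)
move_working_set = transfer_hours(working_set, link)
print(f"full data set: {move_everything:.1f} h, working set: {move_working_set:.1f} h")
```

Under these assumed figures the transfer time scales directly with the ratio of the two sizes, which is why keeping the bulk of the data in place shortens the path through the pipeline.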

Outlook

IDC expects that within the next few years, AI will start to permeate business processes for most enterprises. In general, more data will drive better products and services, improved customer experience, and more relevant business insights. There will be a proliferation of data capture points as enterprises glean data from edge devices, their own products and services, employees, supply chain partners, and customers.

All these data sets are generated by the thousands of operations that together constitute a business, but they are not yet integrated or understood in relation to each other. What does a message from a sensor in a bottling machine in the United States mean for a soft drink truck driver in Japan? What do social media messages about a U.S. airline’s political stance mean for the fares on a domestic route?

Delivering critical insights like these at scale requires end-to-end, real-time intelligence across an organization and its ecosystem. Data needs to stream freely, and where it naturally settles in a storage environment after having been leveraged for insights, it then needs to be joined by compute to perform more analysis, not the other way round. This is the kind of environment that businesses will start to build in the coming years.
