In recent years, demand for artificial intelligence (AI) workloads has surged, prompting companies to expand their AI infrastructure offerings. A prime example of this trend is NVIDIA CEO Jensen Huang's concept of AI factories: advanced data centers purpose-built for AI workloads. These modern AI factories encapsulate the compute, storage, and networking infrastructure that businesses need to take AI from development into production. Huang coined the term "AI factory" to broaden NVIDIA's market perception beyond GPU manufacturing, repositioning the company as a builder of factories for AI, as noted by Rohit Tandon, managing director of AI and Insights at Deloitte.
During the GTC 2022 keynote, Huang elaborated on the concept, emphasizing that AI data centers are designed to continuously process and refine vast amounts of data, a process essential for training and fine-tuning AI models. He likened the work of AI factories to a traditional factory, where raw materials are refined into finished products; companies using these AI factories are essentially producing "manufactured intelligence" at scale. To simplify the idea, Tandon compares an AI factory to a race car engine, with the supporting software and user interfaces acting as the tires and chassis that let the engine operate at its full potential.
The Evolution of AI Factories
Michael Dell, during the 2024 Dell Technologies World conference in Las Vegas, extended the analogy to the historical evolution of factories. Early factories harnessed mechanical power from waterwheels and wind before transitioning to direct use of electricity for more efficient production. Dell argued that in AI, businesses can skip the intermediary steps and move directly to efficient, large-scale AI production, a shortcut he sees as accelerating the deployment and operationalization of AI technologies.
AI factories are typically offered as subscription services, with costs based on compute and data usage, often plus an additional consulting fee. The AI Factory as a Service (AI-FaaS) model is comprehensive, giving organizations everything they need to run their AI workloads efficiently: managed AI infrastructure, model deployment, orchestration services, and the consulting expertise to keep operations running smoothly. By packaging these services, businesses can adopt advanced AI technologies without the substantial upfront investment typically required to build in-house infrastructure.
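The pay-as-you-go structure described above can be sketched as a simple cost function. The rates, the flat consulting fee, and the function name below are purely illustrative assumptions, not published AI-FaaS pricing:

```python
def monthly_cost(gpu_hours: float, data_tb: float,
                 gpu_rate: float = 3.50,        # assumed $ per GPU-hour
                 data_rate: float = 90.0,       # assumed $ per TB processed
                 consulting_fee: float = 5000.0) -> float:
    """Estimate a monthly AI-FaaS bill: compute usage + data usage + flat consulting fee.

    All rates are hypothetical placeholders for illustration only.
    """
    return gpu_hours * gpu_rate + data_tb * data_rate + consulting_fee

# Example: a month with 2,000 GPU-hours of training and 10 TB of data processed
estimate = monthly_cost(2_000, 10)  # 7,000 + 900 + 5,000 = 12,900
```

The key point the model illustrates is that the variable terms scale with usage while the consulting layer is a fixed overhead, which is why the subscription becomes more economical as workloads grow.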
Dell Technologies’ Role in AI Infrastructure
Dell Technologies has been making significant strides in fortifying AI infrastructure to meet the escalating demands of AI workloads. Recently, the company expanded its AI Factory to support AMD environments, alongside launching new server models, such as the PowerEdge XE7745. This model is specifically designed for AI inferencing and model tuning purposes. Additionally, Dell’s PowerEdge R6725a and R7725 servers are engineered to handle robust data analytics and AI workloads, offering scalable configurations that can be tailored to specific organizational needs. The expansion of Dell’s AI infrastructure aligns with its broader strategy to address the increasing demands for AI processing power and efficiency.
A significant addition to Dell’s portfolio came on October 15 with the introduction of the PowerEdge XE9712 platform. The platform integrates NVIDIA Grace CPUs and Blackwell GPUs, providing the high-performance capabilities required for training large language models (LLMs) and running real-time AI inferencing tasks. These advancements matter for organizations striving to stay competitive in the rapidly evolving AI landscape. The PowerEdge XE9712’s modular, flexible, and efficient design suits modern data centers, which require scalability and robust performance to manage complex AI-driven applications effectively.
Hewlett Packard Enterprise’s AI Factory Initiative
Parallel to Dell’s advancements, Hewlett Packard Enterprise (HPE) has made a notable entry into the AI infrastructure space with its NVIDIA-powered AI factory solution called Private Cloud AI. This cutting-edge solution leverages NVIDIA microservices to deliver an integrated AI infrastructure, streamlining the path from development to deployment of AI capabilities. By harnessing the power of NVIDIA technology, HPE aims to provide businesses with a highly efficient and cohesive AI environment that can be customized to specific needs and scaled as required.
Adding to this momentum, Deloitte and NVIDIA have collaborated to introduce AI Factory as a Service, a turnkey generative AI solution announced in September. The service, designed by Rohit Tandon and Nitin Mittal, focuses on simplifying the deployment of AI infrastructure. By combining NVIDIA's hardware and software stack with Oracle's software platforms, Deloitte aims to offer a seamless, end-to-end AI solution. The managed services span AI infrastructure deployment, orchestration, and a consulting layer to help businesses integrate AI technologies into their operations effectively.
Deloitte’s Comprehensive AI Factory as a Service
Deloitte’s AI Factory as a Service integrates various components, including NVIDIA’s AI Enterprise, NIM Agent Blueprints, and Oracle’s extensive AI technology stack. This robust framework allows organizations to leverage NVIDIA’s NeMo framework to accelerate the development and deployment of generative AI applications, taking advantage of Oracle’s flexible IaaS, PaaS, and database services for either custom-built or pre-existing applications. This integrated approach is designed to make AI infrastructure deployment more cost-effective and straightforward, reducing setup times from what could traditionally be up to three months to as little as four weeks.
Within this model, organizations can adjust their AI infrastructure usage, focusing resources where needed and scaling operations as required. Tandon explains that the modular, scalable nature of the AI Factory as a Service model lets clients deliver AI solutions more efficiently and cost-effectively, minimizing upfront investment while providing an adaptable infrastructure that keeps pace with evolving business needs.