With enterprise AI spending expected to exceed $1 trillion by 2029, driven largely by agentic AI applications, organizations are looking to shift their strategies toward high-density agentic workflows and address the resulting demands on AI inference and infrastructure. To help organizations keep pace, Red Hat AI Factory with NVIDIA enables IT operations teams to streamline the management of both traditional infrastructure and the evolving AI stack. Red Hat AI Factory with NVIDIA accelerates the path to production AI and delivers the software platform for AI factories, running on accelerated computing infrastructure that delivers higher performance for the NVIDIA models and GPUs that drive the inference stack. The platform supports the AI factory infrastructure of leading systems manufacturers, including Cisco, Dell Technologies, Lenovo, and Supermicro. This allows IT administrators and operations teams to scale and maintain AI deployments with the same operational rigor and predictability as any other enterprise workload. The co-designed software platform combines the open source collaboration, engineering, and support expertise of both Red Hat and NVIDIA to deliver a reliable enterprise solution.

Red Hat AI Factory with NVIDIA provides a highly scalable foundation for AI deployments in any environment, whether on-premises, in the cloud, or at the edge. It includes essential capabilities for high-performance AI inference, model tuning, customization, and agent deployment and management, with a focus on security.

This allows organizations to maintain architectural control from the data center to the public cloud, which translates into:

● Shorter time to value: This solution facilitates the move to production AI with streamlined workflows and instant access to pre-configured models, including the legally backed IBM Granite family, NVIDIA Nemotron, and open NVIDIA Cosmos models, delivered as NVIDIA NIM microservices. Furthermore, it enables organizations to better align models with business data using NVIDIA NeMo, reducing tuning time and costs.

● Optimized performance and cost: Optimize infrastructure utilization and boost inference performance with a unified, high-performance service stack. Red Hat AI Factory with NVIDIA delivers integrated observability capabilities and leverages Red Hat AI inference capabilities powered by vLLM, NVIDIA TensorRT-LLM, and NVIDIA Dynamo to meet stringent AI service-level objectives. This helps organizations reduce the total cost of ownership (TCO) for AI by optimizing the connection between models and NVIDIA GPUs.

● Intelligent GPU orchestration: Provide on-demand access to GPU resources through intelligent orchestration and pooled infrastructure, with automatic checkpointing to protect long-running jobs and keep compute costs more predictable in dynamic environments.

● Enhanced enterprise security: Built on the flexible and stable foundation of Red Hat Enterprise Linux, the platform gives organizations advanced security and compliance capabilities from the ground up, helping to reduce risk, save time, and limit downtime. This provides a robust security foundation for mission-critical AI workloads that require isolation and continuous verification. NVIDIA DOCA microservices build on this foundation to create a zero-trust architecture and deliver AI runtime security across the entire infrastructure.