Leveraging Arm CPUs for Outstanding AI Inference
The AI landscape is evolving rapidly, with new and diverse AI applications emerging every day and the large-scale deployment of AI on the horizon. Arm CPUs are the foundation for AI everywhere and at the center of the most pervasive AI compute platform in the world. AI solutions in every segment, from cloud to the edge, thrive on our blend of high performance, efficiency, security, and scalability. Arm CPUs are ideally placed to match evolving AI workload demands, from the large language models (LLMs) that power generative AI to smaller, domain-specific AI models. Arm CPUs enable popular toolchains in every sector and integrate with all major operating systems and AI frameworks to ensure frictionless development experiences and seamless acceleration.
Guide to Understanding CPU Inference
This comprehensive guide provides a deep dive into processing AI workloads on CPUs and the use cases for which the CPU may be the most practical choice. Explore the industries already benefiting from CPU inference and learn from real-world examples.
Which AI Workloads Run Best on the CPU?
Traditional ML and Deep Learning
Applications that combine machine learning (ML) and deep learning, such as cinematic photography, media and audio processing, automotive workloads (including digital cockpit and ADAS L2+), real-time voice assistants, real-time analytics and recommendations (for social media and e-commerce), and natural language processing (including sentiment analysis).
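To make this concrete, here is a minimal sketch of a traditional ML inference task of the kind that runs comfortably on a CPU, using scikit-learn. The model choice, feature count, and random data are illustrative placeholders, not taken from any specific deployment.

    # Train a small classifier, then serve single-sample predictions on the CPU.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(seed=0)
    X_train = rng.normal(size=(1000, 16))    # 1,000 samples, 16 features (toy data)
    y_train = rng.integers(0, 2, size=1000)  # binary labels

    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X_train, y_train)

    # Inference: one low-latency prediction, no accelerator required.
    sample = rng.normal(size=(1, 16))
    print(clf.predict(sample))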
Generative AI
Generative AI use cases, including those that use small language models (SLMs) and LLMs: for example, conversation summarization, virtual assistants, customer UX (including agents and chatbots), content translation and summarization, and content generation.
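As a hedged sketch of CPU-only generative AI inference, the snippet below uses the Hugging Face transformers pipeline API. The specific model name is an assumption for illustration; any small language model with a compatible checkpoint would work.

    # Run a small language model entirely on the CPU for a summarization-style prompt.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="Qwen/Qwen2.5-0.5B-Instruct",  # assumed example SLM; substitute your own
        device="cpu",                        # force CPU inference
    )

    prompt = ("Summarize this conversation: Alice asked about shipping times; "
              "Bob confirmed delivery within three days.")
    result = generator(prompt, max_new_tokens=64)
    print(result[0]["generated_text"])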
Benefits of the Arm CPU for AI
Flexible, Power-Efficient Performance Without Security Compromise
Power efficiency is the foundation for all our products, from microcontrollers to hyperscale data centers. Arm CPUs offer a compelling combination of power-efficient performance and flexibility, making them the foundation of choice for many AI workloads.
We have been refining our security architecture for over three decades, protecting billions of devices from common attacks. AI running on the CPU now benefits from these architectural security features, including the latest enhancements in Armv9.
Pervasive Global Platform Removes Fragmentation for Developers
Arm-based CPUs are at the heart of AI's global proliferation. Our pervasiveness enables developers to integrate machine learning (ML) into applications at pace as models evolve.
Our investment in extensive software libraries and tools, alongside integration with all major operating systems and AI frameworks, helps ensure developers can optimize, develop, and deploy AI on Arm CPUs with ease. And we continue to invest in our software ecosystem as AI workloads advance.
A Unique Path for Customized, Heterogeneous AI
For workloads that benefit from additional acceleration, Arm CPUs work seamlessly with GPUs and NPUs from Arm and our partners, offering a path to customization that brings the future of AI to life.
As our partners explore the next generation of AI compute requirements, Arm's broad range of IP and compute subsystems (CSS) gives them a path to craft custom systems with the flexibility needed to meet unique AI workload requirements.
Two Decades of AI Architecture Innovation
Arm focuses on fast-paced architectural innovation that prepares our vast ecosystem for ever-changing compute requirements and the future of AI. Over two decades, Arm has consistently and proactively evolved the AI capabilities of our CPUs with features such as Neon, Helium, the Scalable Vector Extension (SVE), and the Scalable Matrix Extension (SME). The latest Armv9 architecture features drive increased compute performance alongside reduced power consumption for AI workloads.
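As a small illustration (an assumption-laden sketch for Linux on AArch64, not an official Arm tool), the snippet below reads /proc/cpuinfo to report which of these vector and matrix extensions the kernel exposes, using the Linux hwcaps flag names such as asimd (Neon), sve, sve2, and sme.

    # Report the Arm vector/matrix extensions visible to userspace on Linux/AArch64.
    from pathlib import Path

    ARM_AI_FEATURES = {"asimd", "sve", "sve2", "sme", "i8mm", "bf16"}

    def detect_arm_features(cpuinfo_path="/proc/cpuinfo"):
        """Return the AI-relevant Arm feature flags reported by the kernel."""
        found = set()
        for line in Path(cpuinfo_path).read_text().splitlines():
            if line.lower().startswith("features"):
                flags = set(line.split(":", 1)[1].split())
                found |= ARM_AI_FEATURES & flags
        return found

    print("Arm vector/matrix extensions available:", sorted(detect_arm_features()))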
Accelerating Workloads with the Leading AI Frameworks
Arm's partnerships with the leading AI frameworks and operating systems help ensure fast and easy deployment for scaling AI workloads across Arm CPUs. We support key partners with techniques for optimizing new models using quantization, and with open-source software for AI acceleration such as Arm Kleidi, which is used by frameworks and independent software vendors. Targeting optimizations at the AI framework level drives acceleration on Arm CPUs most broadly, across billions of AI inference deployments at the edge, on mobile, and in the cloud. Without any extra optimization effort, application developers can expect the best performance for their AI workloads by default on Arm CPUs, thanks to our work across gaming, computer vision, and language models.
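To show what framework-level quantization looks like in practice, here is a minimal sketch using PyTorch's post-training dynamic quantization. The toy model is an assumption for illustration, and whether the resulting int8 kernels dispatch to Kleidi-optimized routines depends on the PyTorch build.

    # Post-training dynamic quantization: int8 weights, activations quantized at runtime.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(256, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    ).eval()

    # Convert Linear weights to int8; no calibration data set is needed
    # because activations are quantized dynamically at inference time.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 256)
    print(quantized(x).shape)  # torch.Size([1, 10])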