Leveraging Arm CPUs for Outstanding AI Inference
The AI landscape is evolving rapidly, with new and diverse AI applications emerging every day and the large-scale deployment of AI on the horizon. Arm CPUs are the foundation for AI everywhere and at the center of the most pervasive AI compute platform in the world. AI solutions in every segment, from cloud to the edge, thrive on our blend of high performance, efficiency, security, and scalability. Arm CPUs are ideally placed to match evolving AI workload demands, from the large language models (LLMs) that power generative AI to smaller, domain-specific AI models. Arm CPUs enable popular toolchains in every sector and integrate with all major operating systems and AI frameworks to ensure frictionless development experiences and seamless acceleration.
Guide to Understanding CPU Inference
This comprehensive guide provides a deep dive into processing AI workloads on CPUs and the use cases for which the CPU may be the most practical choice. Explore the industries already benefiting from CPU inference and learn from real-world examples.
Which AI Workloads Run Best on the CPU?
Traditional ML and Deep Learning
Applications that combine machine learning (ML) and deep learning, such as cinematic photography, media and audio processing, automotive workloads (including digital cockpit and ADAS L2+), real-time voice assistants, real-time analytics and recommendations (for social media and e-commerce), and natural language processing (including sentiment analysis).
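To make this concrete, here is a minimal sketch of a traditional ML inference task of the kind that runs comfortably on a CPU, using scikit-learn. The model choice, feature count, and random data are illustrative placeholders, not taken from any specific deployment.

    # Train a small classifier, then serve single-sample predictions on the CPU.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(seed=0)
    X_train = rng.normal(size=(1000, 16))    # 1,000 samples, 16 features (toy data)
    y_train = rng.integers(0, 2, size=1000)  # binary labels

    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X_train, y_train)

    # Inference: one low-latency prediction, no accelerator required.
    sample = rng.normal(size=(1, 16))
    print(clf.predict(sample))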
Generative AI
Generative AI use cases, including those that use small language models (SLMs) and LLMs: for example, conversation summarization, virtual assistants, customer UX (including agents and chatbots), content translation and summarization, and content generation.
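As a hedged sketch of CPU-only generative AI inference, the snippet below uses the Hugging Face transformers pipeline API. The specific model name is an assumption for illustration; any small language model with a compatible checkpoint would work.

    # Run a small language model entirely on the CPU for a summarization-style prompt.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="Qwen/Qwen2.5-0.5B-Instruct",  # assumed example SLM; substitute your own
        device="cpu",                        # force CPU inference
    )

    prompt = ("Summarize this conversation: Alice asked about shipping times; "
              "Bob confirmed delivery within three days.")
    result = generator(prompt, max_new_tokens=64)
    print(result[0]["generated_text"])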
Benefits of the Arm CPU for AI
Flexible, Power-Efficient Performance Without Security Compromise
Power efficiency is the foundation for all our products, from microcontrollers to hyperscale data centers. Arm CPUs offer a compelling combination of power-efficient performance and flexibility, making them the foundation of choice for many AI workloads.
We have been refining our security architecture for over three decades, protecting billions of devices from common attacks. AI running on the CPU now benefits from these architectural security features, including the latest enhancements in Armv9.
Pervasive Global Platform Removes Fragmentation for Developers
Arm-based CPUs are at the heart of AI's global proliferation. Our pervasiveness enables developers to integrate machine learning (ML) into applications at pace as models evolve.
Our investment in extensive software libraries and tools, alongside integration with all major operating systems and AI frameworks, helps ensure developers can optimize, develop, and deploy AI on Arm CPUs with ease. And we continue to invest in our software ecosystem as AI workloads advance.
A Unique Path for Customized, Heterogeneous AI
For workloads that benefit from additional acceleration, Arm CPUs work seamlessly with GPUs and NPUs from Arm and our partners, offering a path to customization that brings the future of AI to life.
As our partners explore the next generation of AI compute requirements, Arm's broad range of IP and compute subsystems (CSS) gives them a path to craft custom systems with the flexibility needed to meet unique AI workload requirements.
Two Decades of AI Architecture Innovation
Arm focuses on fast-paced architectural innovation that prepares our vast ecosystem for ever-changing compute requirements and the future of AI. Over two decades, Arm has consistently and proactively evolved the AI capabilities of our CPUs with features such as Neon, Helium, the Scalable Vector Extension (SVE), and the Scalable Matrix Extension (SME). The latest Armv9 architecture features drive increased compute performance alongside reduced power consumption for AI workloads.
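As a small illustration (an assumption-laden sketch for Linux on AArch64, not an official Arm tool), the snippet below reads /proc/cpuinfo to report which of these vector and matrix extensions the kernel exposes, using the Linux hwcaps flag names such as asimd (Neon), sve, sve2, and sme.

    # Report the Arm vector/matrix extensions visible to userspace on Linux/AArch64.
    from pathlib import Path

    ARM_AI_FEATURES = {"asimd", "sve", "sve2", "sme", "i8mm", "bf16"}

    def detect_arm_features(cpuinfo_path="/proc/cpuinfo"):
        """Return the AI-relevant Arm feature flags reported by the kernel."""
        found = set()
        for line in Path(cpuinfo_path).read_text().splitlines():
            if line.lower().startswith("features"):
                flags = set(line.split(":", 1)[1].split())
                found |= ARM_AI_FEATURES & flags
        return found

    print("Arm vector/matrix extensions available:", sorted(detect_arm_features()))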
Accelerating Workloads with the Leading AI Frameworks
Arm's partnerships with the leading AI frameworks and operating systems help ensure fast and easy deployment for scaling AI workloads across Arm CPUs. We support key partners with techniques for optimizing new models using quantization, and with open-source software for AI acceleration such as Arm Kleidi, which is used by frameworks and independent software vendors. Targeting optimizations at the AI framework level drives acceleration on Arm CPUs most broadly, across billions of AI inference deployments at the edge, on mobile, and in the cloud. Without any extra optimization effort, application developers can expect the best performance for their AI workloads by default on Arm CPUs, thanks to our work across gaming, computer vision, and language models.
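To show what framework-level quantization looks like in practice, here is a minimal sketch using PyTorch's post-training dynamic quantization. The toy model is an assumption for illustration, and whether the resulting int8 kernels dispatch to Kleidi-optimized routines depends on the PyTorch build.

    # Post-training dynamic quantization: int8 weights, activations quantized at runtime.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(256, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    ).eval()

    # Convert Linear weights to int8; no calibration data set is needed
    # because activations are quantized dynamically at inference time.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 256)
    print(quantized(x).shape)  # torch.Size([1, 10])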