Seamless AI Acceleration for Developers Everywhere
To scale the AI opportunity, developers need access to the fastest methods of AI deployment, together with optimal performance that best suits their specific workload. Arm is dedicated to maximizing AI performance across the entirety of the Arm platform, helping to ensure seamless acceleration for every developer, every model, and every workload.
Unprecedented AI on CPU Performance with Arm Kleidi
At the heart of all Arm platforms is the Arm CPU. Its ubiquity offers a flexible and energy-efficient target for many AI inference workloads, including deep learning and generative AI. Arm Kleidi, inspired by the Greek word for 'key', focuses on ensuring these workloads get the most out of the underlying Arm Cortex-A or Arm Neoverse CPU.
Collaborating with Key Partners Unlocks AI Acceleration Everywhere
The mission of Arm Kleidi is to collaborate with leading AI frameworks, cloud service providers, and the ML ISV community to deliver full-stack, out-of-the-box inference performance improvements for billions of workloads, with no extra developer work or expertise required.
PyTorch
Arm works closely with the PyTorch community, helping to ensure models running on PyTorch just work on Arm—driving seamless acceleration for even the most demanding AI workloads.
BERT-Large
Arm has been working to improve PyTorch inference performance on Arm CPUs, including optimizing its two primary execution modes, Eager Mode and Graph Mode.
Integrating Kleidi improves Llama model inference by up to 18 times and Gemma 2 2B by 15 times, and boosts natural language processing (NLP) models, including a 2.2 times uplift on BERT-Large.
Llama 3.1 8B
Using Arm Neoverse V2-based Graviton4 processors, we can achieve an estimated 12 times uplift in token generation rate for a chatbot demo with KleidiAI optimizations applied to PyTorch.
This demo shows how easy it is to build AI applications using LLMs, making use of existing Arm-based compute capacity.
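As a rough illustration of what such a demo involves, the sketch below measures token-generation rate with PyTorch and the Hugging Face transformers library. The model ID and prompt are illustrative assumptions; recent PyTorch builds on Arm pick up the KleidiAI-accelerated kernels automatically, with no code changes required.

```python
# Minimal sketch: measure LLM token-generation rate with PyTorch on an
# Arm CPU. Recent PyTorch builds on Arm dispatch to KleidiAI-optimized
# kernels automatically; no code changes are needed.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("What is Arm Neoverse?", return_tensors="pt")
start = time.time()
outputs = model.generate(**inputs, max_new_tokens=128)
elapsed = time.time() - start

new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens / elapsed:.1f} tokens per second")
```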
RoBERTa
AWS collaborated with Arm to optimize the PyTorch torch.compile feature for Neoverse V1-based Graviton3 processors with Arm Compute Library (ACL) kernels using oneDNN.
This optimization results in up to 2 times inference performance improvement for the most popular NLP models on Hugging Face.
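A minimal sketch of how a developer opts into this path, assuming the Hugging Face transformers library; the model choice is illustrative. On Graviton3, the compiled graph is lowered through oneDNN to ACL kernels, and AWS's Graviton guidance additionally documents a bfloat16 fast-math mode enabled via an environment variable.

```python
# Sketch: Graph Mode inference via torch.compile on an Arm Neoverse
# (Graviton3) instance. On Arm, oneDNN dispatches to Arm Compute
# Library (ACL) kernels under the hood.
# Optional, per AWS Graviton guidance: export DNNL_DEFAULT_FPMATH_MODE=BF16
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "roberta-base"  # illustrative model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

compiled = torch.compile(model)  # Graph Mode; replaces Eager dispatch

batch = tokenizer(["Kleidi accelerates this model."], return_tensors="pt")
with torch.inference_mode():
    logits = compiled(**batch).logits
print(logits)
```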
FunASR Paraformer-Large
FunASR is an advanced open-source automatic speech recognition (ASR) toolkit developed by Alibaba DAMO Academy.
By integrating ACL with PyTorch via oneDNN, we have seen a 2.3 times performance improvement when running the Paraformer model on Neoverse N2-based AliCloud Yitian710 processors.
ExecuTorch
Together, Arm and ExecuTorch, a lightweight ML framework, enable efficient on-device inference capabilities at the edge.
Stable Audio Open
Stability AI and Arm have partnered to accelerate on-device generative AI, unlocking real-time audio generation capabilities without the need for an internet connection.
Through model distillation and Arm KleidiAI, Stable Audio Open now delivers text-to-audio generation on Arm-based smartphones 30 times faster than previously, letting users create high-quality sounds at the edge in seconds.
Llama 3.2 1B
Thanks to the collaborative efforts of Arm and Meta, AI developers can now run quantized Llama 3.2 models up to 20% faster than before on Arm CPUs.
By integrating KleidiAI with ExecuTorch and developing optimized quantization schemes, we have achieved speeds of over 350 tokens per second on the prefill stage for generative AI workloads on mobile.
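For orientation, the sketch below shows the generic ExecuTorch lowering flow with the XNNPACK backend, which is where KleidiAI micro-kernels are picked up on Arm. The toy model is a stand-in for a real network; the production Llama builds use dedicated export scripts in the ExecuTorch repository.

```python
# Sketch of the generic ExecuTorch export flow: export -> edge dialect
# -> delegate to XNNPACK (where KleidiAI micro-kernels are used on Arm)
# -> serialize a .pte program for the on-device runtime.
import torch
from torch.export import export
from executorch.exir import to_edge
from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
    XnnpackPartitioner,
)

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(64, 64)

    def forward(self, x):
        return torch.relu(self.linear(x))

example_inputs = (torch.randn(1, 64),)
edge = to_edge(export(TinyModel().eval(), example_inputs))
edge = edge.to_backend(XnnpackPartitioner())  # delegate supported ops
et_program = edge.to_executorch()

with open("tiny_model.pte", "wb") as f:  # consumed by the ExecuTorch runtime
    f.write(et_program.buffer)
```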
Llama.cpp
To demonstrate the capability of Arm-based CPUs for LLM inference, Arm and our partners are optimizing the int4 and int8 kernels implemented in llama.cpp to leverage newer Arm architecture instructions, such as the Int8 matrix multiplication (i8mm) extension.
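As a simple way to try these kernels, one can load a 4-bit quantized GGUF model through the llama-cpp-python bindings; the model file below is an assumed local download.

```python
# Sketch: running a Q4_0 (int4) quantized GGUF model with the
# llama-cpp-python bindings. On Arm CPUs, llama.cpp selects optimized
# int4/int8 kernels for the hardware at runtime.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3-8b-instruct.Q4_0.gguf",  # assumed local file
    n_ctx=2048,   # context window
    n_threads=8,  # match the number of performance cores
)

out = llm("Q: Why run LLMs on CPUs? A:", max_tokens=64)
print(out["choices"][0]["text"])
```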
Custom SLM
AWS and Arm have fine-tuned the TinyLlama 1.1B SLM to create a car-manual chatbot, enabling drivers to interact directly with their vehicle. Using KleidiAI, SLM inference is 10 times faster than previously on Arm Cortex-A76 CPUs, achieving response times of 3 seconds.
TinyLlama 1.1B
Using llama.cpp with KleidiAI, VicOne doubled prefill performance and improved encode by 60%. Our partnership enables fast in-vehicle cybersecurity threat detection by reducing cloud dependency, lowering costs, and keeping data secure onboard.
TinyStories
TinyStories is a dataset containing words a typical 3-year-old might understand. It can be used to train and evaluate small models below 10M parameters. When running a TinyStories-trained model on the Arm Cortex-A320 CPU, a performance uplift of over 70% has been achieved.
Llama 3.3 70B
In partnership with Meta and leveraging KleidiAI with 4-bit quantization, the smaller Llama 3.3 70B model achieved performance similar to the much larger Llama 3.1 405B, sustaining a consistent 50 tokens per second when deployed on Arm Neoverse-powered Google Axion processors.
Phi 3 3.8B
Due to our optimizations, the time-to-first-token (TTFT) for Microsoft’s Phi 3 LLM is accelerated by around 190% when running a chatbot demo on the Arm Cortex-X925 CPU, which is used in premium smartphones.
Llama 3 8B
Running a text generation demo on Graviton3 processors with our optimizations achieves a 2.5 times performance uplift in TTFT and over 35 tokens per second in the text generation phase, more than sufficient for real-time use cases.
Other Leading Frameworks
To maximize AI performance across the entire Arm compute platform, we are dedicated to optimizing inference workloads across all major AI and ML frameworks.
MNN
MNN is an open source deep learning framework developed by Alibaba. Our partnership helps improve performance and efficiency for on-device multimodal use cases.
As demonstrated with the multilingual instruction-tuned Qwen2-VL 2B model, integrating Kleidi with MNN accelerates prefill performance by 57% and decode by 28%.
OpenCV
With increasing demand for advanced, energy-efficient computer vision (CV) at the edge, KleidiCV helps ensure optimized performance for CV applications on Arm CPUs.
With KleidiCV now integrated into OpenCV 4.11, developers benefit from up to 4 times faster processing for key image processing tasks such as blur, filter, and rotation. This acceleration boosts performance for image segmentation, object detection, and recognition use cases.
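No API changes are required to benefit: the standard OpenCV calls below run on the KleidiCV-accelerated paths when using OpenCV 4.11 on Arm. The input filename and filter kernel are illustrative.

```python
# Sketch: everyday OpenCV operations that fall on KleidiCV-accelerated
# paths in OpenCV 4.11 on Arm CPUs. Input filename is illustrative.
import cv2
import numpy as np

img = cv2.imread("input.jpg")

blurred = cv2.GaussianBlur(img, (5, 5), 0)            # blur
kernel = np.ones((3, 3), np.float32) / 9.0            # simple box kernel
filtered = cv2.filter2D(img, -1, kernel)              # generic 2D filter
rotated = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)    # rotation
```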
MediaPipe
Arm’s partnership with Google AI Edge on MediaPipe and XNNPACK is accelerating AI workloads on current and future Arm CPUs. This enables developers to deliver outstanding AI performance for mobile, web, edge and IoT, using numerous LLMs, like Gemma and Falcon.
Thanks to Kleidi integration with MediaPipe via XNNPACK, a 30% acceleration in TTFT has been achieved when running a chatbot demo on the Gemma 1 2B LLM on Arm-based premium smartphones.
Angel
Tencent’s Angel ML framework supports Hunyuan LLM, available in sizes from 1B to over 300B parameters. It enables AI capabilities across a wide range of devices, including smartphones and Windows on Arm PCs.
Our partnership was announced at the 2024 Tencent Global Digital Ecosystem Summit and is having a positive impact on real-world workloads by providing users with even more powerful and efficient on-device AI services across Tencent’s many applications.
Key Developer Technologies for Accelerating CPU Performance
Arm Kleidi includes the latest developer enablement technologies designed to advance AI model capability, accuracy, and speed.
The KleidiAI and KleidiCV libraries provide lightweight, optimized kernels that make it easy for machine learning (ML) and computer vision (CV) frameworks to achieve optimal performance and leverage the latest features for enhancing AI and CV in Arm CPU-based designs.
The Arm Compute Library (ACL) is a fully comprehensive and flexible library that enables independent software vendors to source ML functions optimized for Cortex-A and Neoverse CPUs. The library is OS agnostic and is portable to Android, Linux, and bare-metal systems.

Simplifying AI Deployment
Arm is committed to maximizing the ease and speed of AI deployment for developers. Kleidi is just one of the ways we are making AI optimizations accessible to millions.

Unleashing CPU Performance at Scale
Kleidi enables easy optimization across the full range of Arm Neoverse and Arm Cortex-A CPUs. These technologies leverage advanced features of the Arm architecture, such as the Arm Scalable Vector Extension (SVE) and the Arm Scalable Matrix Extension (SME), to accelerate AI performance.
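As an aside, on Arm Linux systems a quick, unofficial way to check whether these architecture features are exposed is to read the kernel-reported CPU feature flags; the sketch below parses /proc/cpuinfo.

```python
# Sketch: detect SVE/SME support on an Arm Linux system by reading the
# kernel-reported CPU feature flags (first core's flags are used).
def arm_features() -> set[str]:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("Features"):
                return set(line.split(":", 1)[1].split())
    return set()

feats = arm_features()
for feature in ("sve", "sve2", "sme", "i8mm"):
    print(f"{feature}: {'yes' if feature in feats else 'no'}")
```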