AI Libraries for Accelerating Any Framework on Arm
Arm Kleidi Libraries are a key component of Arm Kleidi and provide a lightweight suite of highly performant, open-source Arm routines. They make it easy for any machine learning (ML) or computer vision (CV) framework to use current and future architecture features to accelerate AI and CV on Arm Cortex-A and Arm Neoverse CPU-based designs. The libraries are designed for easy adoption into C and C++ ML and AI frameworks and can significantly accelerate the models that run on them.
Features and Benefits
Arm Kleidi provides a flexible assortment of kernels for enhancing AI in frameworks. It offers broad scope for advancing AI on Arm, from enabling greater capability or accuracy to accelerating workloads or reducing memory overhead.
The new Arm KleidiAI and Arm KleidiCV performance libraries are lightweight and concise. They have no library dependencies, ship without a binary release, and avoid duplicating the memory allocation or multithreading implementation that the framework already provides. This makes them quick and efficient to adopt and integrate into existing framework codebases.
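The division of labor described here can be pictured with a minimal sketch. It is not KleidiAI or KleidiCV code: `matmul_rows` and `framework_matmul` are hypothetical names, and plain Python stands in for an optimized C or C++ kernel. The point is only that the kernel works on a caller-supplied buffer and row range, while the framework keeps ownership of allocation and threading.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(lhs, rhs, out, row_start, row_end):
    """Hypothetical micro-kernel: fills out[row_start:row_end] in place.

    It allocates no memory and starts no threads; the caller owns both.
    """
    inner = len(rhs)
    cols = len(rhs[0])
    for i in range(row_start, row_end):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += lhs[i][k] * rhs[k][j]
            out[i][j] = acc


def framework_matmul(lhs, rhs, num_threads=2):
    """Stand-in for the framework: owns the output buffer and the thread pool."""
    rows = len(lhs)
    out = [[0.0] * len(rhs[0]) for _ in range(rows)]  # framework-side allocation
    chunk = max(1, (rows + num_threads - 1) // num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(matmul_rows, lhs, rhs, out, start, min(start + chunk, rows))
            for start in range(0, rows, chunk)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return out
```

Because the kernel only sees a buffer and a row range, the framework can keep its existing allocator and scheduler and call the kernel from whichever worker it chooses.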
As Kleidi helps optimize AI at the framework level, each optimization can benefit hundreds of workloads across billions of Arm-based devices. Application developers simply run models on Kleidi-optimized frameworks to achieve top performance by default.
Kleidi helps maximize the ease and speed with which the most demanding AI inference workloads can be deployed on Arm. The KleidiAI library helps bring best-in-class performance to the exploding market of generative AI and large language models (LLMs), deployed from cloud datacenters to constrained devices at the edge.
Kleidi enables easy optimization from cloud to edge across the full range of Arm Neoverse and Arm Cortex-A CPUs. The performance libraries leverage specific technologies for enhancing AI functions in the Arm architecture, such as Arm Neon, the Arm Scalable Vector Extension (SVE), and the Arm Scalable Matrix Extension (SME).
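To see which of these extensions a given machine exposes, a small check like the following can help. It is a sketch that assumes a Linux system on Arm, where the kernel lists CPU features in `/proc/cpuinfo` and reports Neon under the name `asimd`. The function names are illustrative and are not part of any Kleidi library.

```python
from pathlib import Path

# Linux arm64 feature names; "asimd" is how the kernel reports Neon.
FLAG_NAMES = {"Neon": "asimd", "SVE": "sve", "SME": "sme"}


def parse_features(cpuinfo_text):
    """Map each extension name to True if its flag appears on a Features line."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "features":
            flags.update(value.split())
    return {name: flag in flags for name, flag in FLAG_NAMES.items()}


def detect_features(path="/proc/cpuinfo"):
    """Read the local cpuinfo file; report all False if it cannot be read."""
    try:
        text = Path(path).read_text()
    except OSError:
        text = ""
    return parse_features(text)


if __name__ == "__main__":
    for name, present in detect_features().items():
        print(f"{name}: {'yes' if present else 'no'}")
```

On systems that are not Arm-based Linux, the file is either missing or has no `Features` line, so every entry comes back `False`.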
The Arm KleidiAI library is directly integrated into key AI frameworks, including MediaPipe (via XNNPACK), llama.cpp, PyTorch (via ATen), and Hunyuan.
Once KleidiAI is integrated, developers automatically benefit from the performance enhancements in Kleidi-optimized frameworks, with no direct overhead on their side.
Get Started with Arm Kleidi Libraries
Access the available software within our growing suite.
Performance Library for All AI Frameworks
Performance Library for Computer Vision Frameworks
AI Libraries for Advancing Inference Everywhere on Arm CPUs
Generative AI
KleidiAI enables optimal performance for some of the world’s most advanced language models on Arm Cortex-A CPUs.
The KleidiAI library has already accelerated Llama, Meta’s advanced open-source LLM, and Phi, Microsoft’s highly capable small language model (SLM), by up to 190% through framework optimizations.
Computer Vision
Alongside emerging AI use cases, Arm Kleidi also benefits traditional computer vision use cases. One example is OpenCV, the world’s largest computer vision library, which contains over 2,500 algorithms and supports hundreds of thousands of developers.
After running a variety of image processing operations based on KleidiCV integrations, OpenCV identified a typical performance uplift of 75%.
AI in Gaming
Unity Sentis empowers game developers to create innovative, AI-driven gameplay experiences on all Unity Engine-supported devices.
Collaboration with Arm has helped enable quantization, which can reduce model size by up to 73% for Unity developers building with Sentis. KleidiAI simplifies implementation and optimization on Arm architectures.
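As a rough illustration of how quantization shrinks a model, the arithmetic below compares bits per weight before and after. The text above does not say which quantization scheme or baseline lies behind the 73% figure, so the bit widths, block size, and scale width used here are assumptions chosen only to show the calculation.

```python
def size_reduction(bits_before, bits_after, block_size=None, scale_bits=0):
    """Fractional size reduction when each weight shrinks to bits_after bits.

    If block_size is given, every block of that many weights also stores
    one scale factor of scale_bits bits, which adds per-weight overhead.
    """
    overhead = scale_bits / block_size if block_size else 0.0
    effective_bits = bits_after + overhead
    return 1.0 - effective_bits / bits_before


# Assumed example: 32-bit floats quantized to 8-bit integers, no scale overhead.
print(f"{size_reduction(32, 8):.1%}")

# Assumed example: the same, plus one 16-bit scale per block of 32 weights.
print(f"{size_reduction(32, 8, block_size=32, scale_bits=16):.1%}")
```

Per-block scale factors are why practical savings often land slightly below the headline ratio of the two bit widths.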
Talk with an Expert
If you have any questions about Kleidi libraries, talk to an Arm expert.
Arm Kleidi Library Resources
Webinar
- Empowering Developers: Tools and Resources for Running Generative AI on Arm CPUs
- Democratizing AI: Powering the Future with Arm’s Global Compute Ecosystem
- Accelerating LLM family of models on Arm Neoverse based Graviton AWS processors with KleidiAI
Community Blogs
- KleidiAI and ExecuTorch for Arm-based Mobile Devices
- Faster PyTorch Inference using Kleidi on Arm Neoverse
- Demoing LLM Inference with PyTorch on Arm-based AWS Graviton4 CPUs
- Arm KleidiAI: Helping AI frameworks elevate their performance on Arm CPUs
- Arm KleidiCV: Unleashing the power of Arm CPUs for image processing
Newsroom Blogs
- Democratizing AI at the Edge on Arm with ExecuTorch Beta Release
- Extending Developer Enablement Resources to New Frameworks, Accelerating LLMs From Cloud to Edge
- KleidiAI Integration Brings AI Performance Uplifts to MediaPipe
- Accelerating AI Developer Innovation Everywhere with New Arm Kleidi