Cortex-R4 Processor


The ARM® Cortex®-R4 processor is the first deeply embedded real-time processor based on the ARMv7-R architecture. It is for use in high-volume, deeply embedded System-on-Chip applications, for example, hard disk drive controllers, wireless baseband processors, consumer products and electronic control units for automotive systems.

The Cortex-R4 processor delivers substantially higher performance, real-time responsiveness, reliability and dependability with high error resistance, and offers more features than other processors in its class. This processor offers excellent energy efficiency and cost effectiveness for ASIC, ASSP and MCU embedded applications. The Cortex-R4 processor is very flexible and can be configured at synthesis time to optimize its feature set for a precise match with application requirements.

The Cortex-R series processors contribute to safe system design by operating deterministically: execution is not blocked for an unpredictable number of cycles by the external memory system or by other bus masters, which enables high availability.



The Cortex-R4 processor is designed for implementation on advanced silicon processes, with an emphasis on improved energy efficiency, real-time responsiveness, advanced features and ease of system design. The processor provides a highly flexible and efficient two-cycle local memory interface, enabling SoC designers to minimize system cost and energy consumption.

Summary of Cortex-R4 Key Features

  • Fast – high performance 
    • Power-efficient, 8-stage dual issue pipeline with instruction pre-fetch and branch prediction
    • ARMv7-R architecture - Thumb-2 / ARM instructions
    • Hardware divide, SIMD, DSP, SP/DP FPU option
    • Harvard I + D caches, 64-bit AMBA AXI-3
  • Deterministic – fast interrupt response
    • Vectored Interrupt Controller port
    • Low Latency Interrupt Mode (LLIM) to accelerate interrupt entry whenever possible without waiting for the current instruction or memory access to complete
    • Tightly-Coupled Memory system which provides a second level-1 memory besides the cache for storing critical code and data such as interrupt service routines which can then be immediately executed without waiting for cache evictions and fetches from main memory 
  • Reliable – error handling built into core
    • Memory Protection Unit
    • ECC and Parity protection on L1 memories
    • Dual core lock-step configuration
  • Cost-effective and low cost of ownership
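
As an illustration of the ECC protection listed above, the following is a minimal sketch of a single-error-correcting, double-error-detecting (SECDED) code. It is a toy extended Hamming code over 4 data bits, chosen for readability; it is not the code the Cortex-R4 actually uses, which this page does not describe. The function names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (a list of bits).

    Toy extended Hamming(8,4) code, for illustration only.
    """
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the least significant bit
    code = [0] * 8                              # index 0 holds the overall parity bit
    # Hamming positions 1..7: check bits at 1, 2, 4; data bits at 3, 5, 6, 7.
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2                 # overall parity over positions 1..7
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                     # XOR of the positions of all set bits
    overall = sum(code) % 2                     # 0 when the total parity is consistent
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flipped bits: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit (index 0) was hit.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flipped bits with a non-zero syndrome: two-bit error.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                                # inject a single-bit soft error
    print(decode(word))
```

A parity-only scheme would correspond to keeping just the overall parity bit: it can flag an odd number of flipped bits but cannot locate or repair them.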

ARM Cortex-R4 Processor



Micro-architecture: Eight-stage pipeline with instruction pre-fetch, branch prediction and selected dual-issue execution. Parallel execution paths for load-store, MAC, shift-ALU, divide and floating point. 1.66 Dhrystone MIPS/MHz. Hardware divider. Binary compatibility with classic ARM9 and ARM11 embedded processor families.
Instruction Set: ARMv7-R architecture with Thumb-2 and Thumb. DSP extensions. Optional floating-point unit.
Cache controllers: Harvard memory architecture with optional integrated Instruction and Data cache controllers. Cache sizes configurable from 4 to 64 KB. Cache lines are either write-back or write-through.
Tightly-Coupled Memories: Optional Tightly-Coupled Memory interfaces. Use TCMs for highly deterministic or low-latency applications that may not respond well to caching, e.g. instruction code for interrupt service routines and data that requires intense processing. One or two logical TCMs, A and B, can be used for any mix of code and data. TCM size can be up to 8 MB. TCM B has two physical ports, B0 and B1, for interleaving incoming DMA data streams.
Interrupt interface: Standard interrupt (IRQ) and non-maskable fast interrupt (FIQ) inputs are provided, together with a VIC interrupt controller vector port. The GIC interrupt controller can also be used if more complex priority-based interrupt handling is required. The processor includes low-latency interrupt technology that allows long multi-cycle instructions to be interrupted and restarted. Lengthy memory accesses are also deferred in certain circumstances. Worst-case interrupt response can be as low as 20 cycles using the FIQ alone.
Memory Protection Unit: Optional MPU configures attributes for either eight or twelve regions, each with resolution down to 32 bytes. Regions can overlap, and the highest numbered region has highest priority.
Floating Point Unit: Optional Floating Point Unit (FPU) implements the ARM Vector Floating Point architecture VFPv3 with 16 double-precision registers, compliant with IEEE 754. The FPU is optimized for single-precision performance and fully supports double precision. Operations include add, subtract, multiply, divide, multiply and accumulate, square root, conversions between fixed and floating-point, and floating-point constant instructions.
ECC: Optional single-bit error correction and two-bit error detection for cache and/or TCM memories with ECC bits. Single-bit soft errors are automatically corrected by the processor.
Parity: Optional support for parity bit error detection in caches and/or TCMs.
Master AXI bus: 64-bit AMBA AXI bus master for Level-2 memory and peripheral access.
Slave AXI bus: Optional 64-bit AMBA AXI bus slave port allows DMA masters to access the dual-port TCM B interface for high speed streaming of data in and out of the processor.
Debug: A Debug Access Port is provided. Its functionality can be extended with DK-R4.
Trace: An interface suitable for connection to a CoreSight Embedded Trace Macrocell (ETM) is present.
Dual core: A dual processor configuration implements a redundant Cortex-R4 CPU in lock step with offset clocks and comparison logic for fault-tolerant and fault-detecting dependable systems.
Configuration: Synthesizable Verilog RTL with facility to configure options for synthesis.
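
The overlap rule in the Memory Protection Unit entry above (the highest numbered matching region wins) can be sketched as a small software model. This is an illustrative model only, not the hardware implementation: the `Region` class, the `lookup` function and the attribute strings are hypothetical names, and the alignment rules that real hardware imposes on region base and size are not modeled.

```python
from dataclasses import dataclass

MIN_REGION_BYTES = 32          # smallest region resolution stated above
ALLOWED_REGION_COUNTS = (8, 12)  # the two MPU configurations stated above


@dataclass
class Region:
    number: int   # region index; a higher number takes priority on overlap
    base: int     # first byte address covered by the region
    size: int     # region length in bytes
    attrs: str    # hypothetical label standing in for the real attribute bits

    def __post_init__(self):
        if self.size < MIN_REGION_BYTES:
            raise ValueError("region must cover at least 32 bytes")

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


def lookup(regions, addr, region_count=8):
    """Return the Region that governs addr, or None if no region matches."""
    if region_count not in ALLOWED_REGION_COUNTS:
        raise ValueError("MPU is configured with either 8 or 12 regions")
    hits = [r for r in regions if r.number < region_count and r.contains(addr)]
    # Overlap rule: the highest numbered matching region has the highest priority.
    return max(hits, key=lambda r: r.number) if hits else None


if __name__ == "__main__":
    regions = [
        Region(0, 0x0000_0000, 0x1000_0000, "normal, read-only"),
        Region(3, 0x0000_1000, 0x100, "device, read-write"),
    ]
    for addr in (0x1040, 0x2000, 0x2000_0000):
        hit = lookup(regions, addr)
        print(hex(addr), hit.attrs if hit else "no region")
```

A common use of the overlap rule is the one shown in the usage lines: a large low-numbered region sets default attributes, and a small higher-numbered region carves out an exception inside it.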

Cortex-R4 Performance Power and Area

Processor area, frequency and power consumption are highly dependent on process, libraries and optimizations. The table below estimates a typical single processor implementation of the Cortex-R4 processor on mainstream low power process technology (40nm LP) with high-density, standard-performance cell libraries and 32KB instruction cache and 32KB data cache.

Cortex-R4 Single Processor 40nm LP
Maximum clock frequency: Above 800 MHz
Performance: 1.68 / 2.03 / 2.45 DMIPS/MHz*; 3.47 CoreMark/MHz
Total area (Including Core+RAM+Routing): From 0.45 mm²
Efficiency: From 37 DMIPS/mW

* The first result abides by all of the 'ground rules' laid out in the Dhrystone documentation; the second permits inlining of functions (not just the permitted C string libraries); the third additionally permits simultaneous multifile compilation. All are with the original (K&R) v2.1 of Dhrystone.
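
The per-MHz figures in the table convert to absolute throughput by multiplying by the clock frequency. The sketch below works through that arithmetic using 800 MHz as a round lower bound. The power estimate assumes, purely for illustration, that the 37 DMIPS/mW efficiency figure applies to the same implementation and operating point as the frequency figure, which the table does not state; the function names are hypothetical.

```python
FREQ_MHZ = 800                        # lower bound from the table ("above 800 MHz")
DMIPS_PER_MHZ = (1.68, 2.03, 2.45)    # the three Dhrystone results, in table order
COREMARK_PER_MHZ = 3.47
EFFICIENCY_DMIPS_PER_MW = 37          # lower bound from the table ("from 37")


def throughput(freq_mhz, score_per_mhz):
    """Absolute benchmark score at a given clock frequency."""
    return freq_mhz * score_per_mhz


def power_mw(total_dmips, dmips_per_mw):
    """Power implied by a throughput and an efficiency figure.

    Assumption: both figures describe the same implementation, which
    the table does not confirm.
    """
    return total_dmips / dmips_per_mw


if __name__ == "__main__":
    for score in DMIPS_PER_MHZ:
        print(f"{score} DMIPS/MHz -> {throughput(FREQ_MHZ, score):.0f} DMIPS")
    print(f"CoreMark: {throughput(FREQ_MHZ, COREMARK_PER_MHZ):.0f}")
    base = throughput(FREQ_MHZ, DMIPS_PER_MHZ[0])
    print(f"Implied power: {power_mw(base, EFFICIENCY_DMIPS_PER_MW):.1f} mW")
```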

Use ARM System IP, Development Tools and Physical IP to implement complete Cortex-R4 systems.

CoreLink™ and CoreSight™ System IP

NIC-400: Configurable hierarchical low-latency interconnect for AMBA 3 AXI, AHB-Lite and APB components. Configurations can range from a single bridge component, such as an AHB to AXI protocol bridge, to a large infrastructure of 128 masters and 64 slaves in combinations of different AMBA protocols.
QOS-400: Added to NIC-400 to minimize average latency and guarantee worst-case latency and bandwidth of critical interfaces such as DDR memory.
DMC-34x: Dynamic memory controllers providing highly efficient interfaces to DRAM by leveraging AXI interconnect features to optimize memory request scheduling and using built-in Quality of Service controls to manage the initiator's latency and bandwidth requirements. Memory types supported include SDR, DDR, LPDDR (Mobile DDR), eDRAM, DDR2 and LPDDR2 (Mobile DDR2).
SMC-35x: Static memory controllers interface AXI interconnects to a range of static and non-volatile memories with highly configurable parameters. Memory types supported include SRAM, NAND Flash and NOR Flash.
L2C-310: Level-2 cache controller designed to boost performance while reducing overall traffic to system memory and therefore SoC energy consumption. Reducing demands on off-chip memory bandwidth frees up resources for other masters.
DMA-330: A highly flexible micro-programmable Direct Memory Access controller for high-end, high-performance, energy-efficient AXI-based processing systems.
PL192: An AMBA AHB advanced Vectored Interrupt Controller (VIC) supporting up to 32 vectored interrupts with programmable priority level and masking.
GIC-400: An AMBA AHB and AXI scalable, configurable, low gate count Interrupt Controller that stores vector addresses in memory. Options include multi-processor and TrustZone support.
ETM-R4: The Embedded Trace Macrocell provides real-time instruction and data trace and can be configured to capture information before and after a specified sequence of events with the processor running at full speed.
DK-R4: A complete Debug Kit including ETM-R4 and a fully featured Debug Access Port (DAP) to complement the DAP-Lite shipped with every Cortex-R4. DK components include DAP, cross trigger, ETM, AMBA bus trace, serial wire debug, trace funnel, trace buffer, trace port interface and serial wire viewer.

Development Tools for Cortex-R4

ARM DS-5 Development Studio, as well as a wide range of third party tools, operating systems and EDA flows fully support all Cortex-R processors. ARM DS-5 software development tools are unique in their ability to provide solutions that take full advantage of the complete ARM technology portfolio. Tools specific to Cortex-R4 are:

ARM DS-5: ARM Compiler 5.0 with Thumb-2 optimized for Cortex-R4.
Fast Models: With ARM Fast Models, software development can begin prior to silicon availability. These extensively validated programmer’s view models provide access to ARM-based systems suitable for early software development.
Versatile Express: The Versatile™ Express family of development platforms provides the right environment for prototyping the next generation of system-on-chip designs.
Soft Macrocell Model: A Soft Macrocell Model (SMM) is an FPGA implementation of an ARM processor, built with ARM development boards.

Physical IP

ARM provides optimized Physical IP platforms for best-in-class implementations of Cortex-R4 on leading semiconductor process technologies.

Standard cell logic libraries: Available in a variety of different architectures, ARM Standard Cell Libraries support a wide performance range for all types of SoC designs. Designers can choose between different libraries and optimize their designs for speed, power and/or area.
Memory compilers and registers: A broad array of silicon-proven SRAM, Register File and ROM memory compilers for all types of SoC designs ranging from performance critical to cost sensitive and low power applications.
Interface IP: A broad portfolio of silicon-proven Interface IP designed to meet varying system architectures and standards. General Purpose I/O, Specialty I/O, High Speed DDR and Serial Interfaces, optimized to deliver high data throughput performance with low pin counts.

ARM Connected Community

Cortex-R4 related blogs, discussions, technical content

White Papers 

Cortex-R4 processor (348KB)



