
ARM Cortex-M3


The ARM Cortex™-M3 32-bit processor has been specifically developed to provide a high-performance, low-cost platform for a broad range of applications, including microcontrollers, automotive body systems, industrial control systems and wireless networking. The Cortex-M3 processor provides outstanding computational performance and exceptional system response to interrupts while meeting low-cost requirements through a small core footprint, industry-leading code density that enables smaller memories, reduced pin count and low power consumption.

The central core of the Cortex-M3 processor, based on a 3-stage pipeline and a Harvard bus architecture, incorporates advanced features including single-cycle multiply and hardware divide to deliver an efficiency of 1.25 DMIPS/MHz. The Cortex-M3 processor also implements the new Thumb®-2 instruction set architecture which, when combined with features such as unaligned data storage and atomic bit manipulation, delivers 32-bit performance at a cost equivalent to modern 8- and 16-bit devices.


Applications

The Cortex-M3 processor offers an excellent balance of architectural features, high performance and low costs, making it a very attractive choice for a broad range of applications, including: 

  • Microcontrollers
    • 32-bit performance at 8-bit costs
  • Wireless networking (inc Bluetooth, ZigBee and others)
    • Low power operation and integrated sleep modes supporting complex stacks
  • Automotive and industrial control systems
    • Secure, reliable and deterministic operation
  • White goods
    • High performance maths for complex motor algorithm support
  • Electronic toys
    • Low cost implementations for next generation intelligent toys
  • Medical instrumentation
    • High reliability core and tools enabling IEC 61508 and FDA approval

Features

  • ARMv7-M architecture
    • Optimized for microcontroller and low-cost applications
  • Thumb-2 instruction set
    • Enhanced levels of performance, energy efficiency, and code density
    • Mixed-mode capability removes the need to interwork between modes
    • ARM levels of performance with Thumb level code density
  • Hierarchical structure with tightly integrated peripherals
    • CM3Core 
      • Harvard bus architecture – separate instruction and data buses
      • Highly efficient 3-stage pipeline with branch speculation
      • Nested Vectored Interrupt Controller (NVIC)
      • Gate efficient stack-based register model
      • Configurable from 1-240 physical interrupts; up to 256 levels of priority
      • Non-Maskable Interrupt (NMI) enables critical interrupt capabilities
      • Low latency through tail chaining, late arrival service & stack pop pre-emption
      • Nesting (stacking) of interrupts
      • Dynamic interrupt reprioritization
    • Memory Protection Unit (MPU)
      • Optional component for separation of processing tasks and data protection
      • Up to 8 regions of protection; each of which can be divided into 8 sub-regions
      • Region sizes from 32 bytes up to the entire 4 gigabytes of addressable memory
    • Embedded Trace Macrocell (ETM)
      • Optional component for real-time instruction trace
    • Data Watchpoint and Trace unit (DWT)
      • Implements hardware watchpoints and provides instruction execution statistics
    • Flash Patch and Breakpoint unit (FPB)
      • Implements 6 program breakpoints and 2 literal data fetch breakpoints
    • Debug Port ( SW-DP or SWJ-DP )
      • Configurable debug access through Serial Wire or JTAG interface
  • Single cycle multiply and hardware divide instructions
    • 32-bit multiplication in a single cycle
    • Signed and unsigned divide operations take between 2 and 12 cycles
  • Preconfigured memory map
    • Up to 4 gigabytes of addressable memory space
    • Predefined addresses for code, memory, external devices, peripherals
    • Dedicated space for vendor specific addressability
  • Atomic bit manipulation with bit banding
    • Direct access to single bits of data
    • Two 1MB bit-band regions, one for memory and one for peripherals, each mapped to a 32MB alias region
    • Atomic operation, cannot be interrupted by other bus activities
  • Unaligned data storage and access
    • Continuous storage of data requiring different byte lengths
    • Data access in a single core access cycle
  • Integrated sleep modes
    • Sleep Now mode for immediate transfer to low power state
    • Sleep on Exit mode for entry into low power state after the servicing of an interrupt
    • Ability to extend power savings to other system components
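
The bit-banding feature listed above maps every bit of a 1MB region to its own 32-bit word in a 32MB alias region. The following is a minimal host-side sketch of that address arithmetic, not code that runs on the processor. The base addresses (0x20000000 and 0x40000000 for the bit-band regions, 0x22000000 and 0x42000000 for their aliases) are assumptions taken from the ARMv7-M predefined memory map and are not stated on this page; the function name `bitband_alias` is hypothetical.

```python
# Bit-band regions and their alias regions (assumed from the ARMv7-M
# predefined memory map; these addresses are not given on this page).
BITBAND_REGIONS = {
    "sram": (0x20000000, 0x22000000),
    "peripheral": (0x40000000, 0x42000000),
}
REGION_BYTES = 1 << 20  # each bit-band region is 1 MB


def bitband_alias(byte_addr: int, bit: int) -> int:
    """Return the alias word address that maps to one bit of byte_addr."""
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in the range 0..7")
    for base, alias_base in BITBAND_REGIONS.values():
        if base <= byte_addr < base + REGION_BYTES:
            # Each bit gets its own 32-bit word in the alias region:
            # 32 alias bytes per source byte, 4 alias bytes per bit.
            return alias_base + (byte_addr - base) * 32 + bit * 4
    raise ValueError("address is outside both bit-band regions")


if __name__ == "__main__":
    # Alias word for bit 3 of the byte at 0x20000001.
    print(hex(bitband_alias(0x20000001, 3)))
```

Writing 0 or 1 to the returned alias address is what clears or sets the single target bit; the 32x expansion is why a 1MB region needs a 32MB alias.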


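
The MPU feature list gives region sizes from 32 bytes to 4 gigabytes, with each region divisible into 8 sub-regions. As a rough illustration, the sketch below computes the size encoding and sub-region size for a region. It assumes the ARMv7-M convention that a region is a power of two in size, encoded as a SIZE field where the region spans 2^(SIZE+1) bytes, and that sub-regions are only available for regions of 256 bytes or more. Those details come from the architecture rather than from this page, and both function names are hypothetical.

```python
MIN_REGION_BYTES = 32          # smallest region quoted in the feature list
MAX_REGION_BYTES = 1 << 32     # the entire 4 GB address space
MIN_SUBREGION_PARENT = 256     # assumption: ARMv7-M sub-region minimum


def mpu_size_field(region_bytes: int) -> int:
    """Return SIZE such that the region spans 2**(SIZE + 1) bytes."""
    if not MIN_REGION_BYTES <= region_bytes <= MAX_REGION_BYTES:
        raise ValueError("region must be between 32 bytes and 4 GB")
    if region_bytes & (region_bytes - 1):
        raise ValueError("region size must be a power of two")
    # For a power of two 2**n, bit_length() is n + 1, so SIZE = n - 1.
    return region_bytes.bit_length() - 2


def subregion_bytes(region_bytes: int) -> int:
    """Return the size of each of the 8 equal sub-regions."""
    mpu_size_field(region_bytes)  # reuse the range and power-of-two checks
    if region_bytes < MIN_SUBREGION_PARENT:
        raise ValueError("sub-regions need a region of at least 256 bytes")
    return region_bytes // 8


if __name__ == "__main__":
    for size in (32, 1024, 1 << 32):
        print(size, mpu_size_field(size))
```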

Benefits

  • High performance
    • 1.25 DMIPS/MHz on the Dhrystone 2.1 Benchmark
    • 70% more efficient per MHz vs. the ARM7TDMI-S processor executing Thumb instructions
    • 35% more efficient per MHz vs. the ARM7TDMI-S executing ARM instructions
    • Highly deterministic, low latency interrupt handling
    • Excellent data manipulation capabilities via Thumb-2 Bit Field Instructions
  • Low manufacturing costs
    • Low gate count implementations
      • 33K gates Central Core (CM3Core)
      • 60K gates or lower for complete standard implementation
      • Additional gate count reductions available through synthesis
      • All numbers for TSMC 0.18µm G process, 50MHz target frequency
    • Smaller memory requirements
      • Up to 45% smaller code size vs. the ARM7TDMI-S executing ARM instructions
      • Up to 10% smaller code size vs. the ARM7TDMI-S executing Thumb instructions
    • Reduced pin count for lower packaging costs
      • Serial Wire Debug implements debug with just 2 pins
      • Serial Wire Viewer implements single pin trace profiling
  • Enhanced energy efficiency
    • Clock gating, integrated sleep modes reduce power at no loss of performance
    • Power as low as 0.085 mW/MHz on the TSMC 0.13G process
  • Faster time to market with ease of use – system design
    • Fully synthesisable design
    • NVIC configurable to 1-240 physical interrupts with up to 256 levels of priority
    • Optional ETM can add trace capabilities
    • Optional MPU can add memory protection
    • Integrated debug/trace facilitate quicker debug
  • Faster time to market with ease of use – software development
    • Supported by the ARM RealView Microcontroller Development Kit
      Combines the advantages of industry standard RealView compilation tools and sophisticated debugging support via the industry’s leading Keil μVision® microcontroller development environment.
    • Simplified stack-based programmer’s model; simple vector-based interrupt scheme
    • Thumb-2 removes need for interworking required by ARM/Thumb instructions
    • Native bitfield manipulation, hardware division and If/Then instructions
    • Thumb-2 is backwards compatible with existing ARM and Thumb solutions
    • Thumb-2 is compatible with other members of the Cortex family
    • The processor implements stack manipulation in hardware, so assembler wrappers for interrupt service routines are not necessary
    • NVIC integrates a SysTick timer that can provide an ideal heartbeat for a real-time OS
  • Excellent 32-bit migration choice for 8/16 bit architecture based designs
    • Simplified stack-based programmer’s model is compatible with traditional ARM architecture and retains the programming simplicity of legacy 8- and 16-bit architectures
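
The benefits list mentions the NVIC's integrated system tick timer as a heartbeat for a real-time OS. The sketch below shows how a reload value for such a timer might be derived from the core clock and the desired tick rate. It assumes a 24-bit down-counter that fires every reload + 1 cycles, which is how the ARMv7-M architecture describes SysTick but is not stated on this page; the function name and constants are hypothetical.

```python
SYSTICK_MAX_RELOAD = 0xFFFFFF  # assumption: 24-bit reload register


def systick_reload(core_clock_hz: int, tick_hz: int) -> int:
    """Return the reload value for a periodic tick at tick_hz."""
    if core_clock_hz <= 0 or tick_hz <= 0:
        raise ValueError("clock and tick rate must be positive")
    cycles_per_tick = core_clock_hz // tick_hz
    # The counter counts from reload down to 0, so a period of N cycles
    # needs a reload value of N - 1.
    reload = cycles_per_tick - 1
    if not 1 <= reload <= SYSTICK_MAX_RELOAD:
        raise ValueError("tick period does not fit the 24-bit counter")
    return reload


if __name__ == "__main__":
    # A 1 kHz RTOS tick on a 50 MHz core clock.
    print(systick_reload(50_000_000, 1_000))
```

On a 100MHz core, a 1Hz tick would need almost 100 million cycles and overflows the assumed 24-bit range, so slow heartbeats are usually built by counting faster ticks in software.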

Software Standard
The Cortex-M3 processor is fully compatible with the recently launched Cortex Microcontroller Software Interface Standard (CMSIS), the vendor-independent hardware abstraction layer for the Cortex-M processor series.

The CMSIS is available for free download from www.onARM.com, a website providing a comprehensive resource for embedded developers. CMSIS documentation and maintenance of the software layer will be provided by ARM.

Comparison of Cortex-M3 processor with ARM7TDMI® processor

The Cortex-M3 processor offers enhanced features and performance and an easy migration path, making it a logical upgrade for ARM7TDMI® processor-based designs that must meet the challenges of next-generation technologies. The central core offers higher efficiency, a simpler programming model and excellent deterministic interrupt behaviour, whilst the integrated peripherals offer enhanced performance at lower cost and power consumption.

Features                  ARM7TDMI                  ARM Cortex-M3
Architecture              ARMv4T (von Neumann)      ARMv7-M (Harvard)
ISA Support               Thumb / ARM               Thumb / Thumb-2
Pipeline                  3-stage                   3-stage + branch speculation
Interrupts                FIQ / IRQ                 NMI + 1 to 240 physical interrupts
Interrupt Latency         24 - 42 cycles            12 cycles
Inter-Interrupt Latency   24 cycles                 6 cycles
Sleep Modes               None                      Integrated
Memory Protection         None                      8 region MPU
Dhrystone                 0.95 DMIPS/MHz (ARM)      1.25 DMIPS/MHz
                          0.74 DMIPS/MHz (Thumb)
Power Consumption         0.28 mW/MHz               0.19 mW/MHz
Area                      0.62 mm² (core only)      0.86 mm² (core + peripherals)*

Performance characteristics quoted for a 100MHz target implementation on the TSMC 0.18G process
* Does not include optional system peripherals (MPU & ETM) or integration level components
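
To put the cycle counts from the comparison table into wall-clock terms, the sketch below converts them to nanoseconds. It assumes both processors are clocked at the 100MHz target quoted for the table; the dictionary layout and the helper name `cycles_to_ns` are hypothetical.

```python
CLOCK_MHZ = 100  # target frequency quoted for the comparison table

# Cycle counts copied from the comparison table (worst case for ARM7TDMI).
LATENCY_CYCLES = {
    "ARM7TDMI": {"interrupt": 42, "inter_interrupt": 24},
    "Cortex-M3": {"interrupt": 12, "inter_interrupt": 6},
}


def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock."""
    if clock_mhz <= 0:
        raise ValueError("clock must be positive")
    # One cycle at f MHz lasts 1000 / f nanoseconds.
    return cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    for core, counts in LATENCY_CYCLES.items():
        for kind, cycles in counts.items():
            print(core, kind, cycles_to_ns(cycles, CLOCK_MHZ), "ns")
```

Under that assumption, the Cortex-M3 figure of 12 cycles corresponds to 120 ns, against 420 ns for the ARM7TDMI worst case of 42 cycles.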

Running ARM7 software on the Cortex-M3 Processor

Although the Cortex-M3 processor has many architectural innovations, plus superior performance, power consumption and code size compared with the ARM7 family of processors, substantial amounts of code will work on the Cortex-M3 processor without modification. If the application uses an RTOS that is available for the Cortex-M3 processor, or is written almost exclusively in C, the migration can be completed with minimal recompilation effort. If the application is written in assembly and uses only Thumb instructions, the code will work seamlessly on the Cortex-M3 processor. If ARM instructions have been used in the assembly, the Unified Assembler Framework can efficiently convert them into equivalent Thumb-2 instructions. For more details, refer to the “Running ARM7TDMI software on Cortex-M3” white paper.

Performance Characteristics

Process                    0.18 µm               0.13 µm               90 nm
Optimization               Speed      Area       Speed      Area       Speed      Area
Standard Cells             Metro      Metro      SAGE-X     Metro      Advantage  Advantage
Frequency* (MHz)           100        50         135        50         191        50
CM3Core Area (mm²)         0.43       0.35       0.43       0.21       0.21       0.13
CM3Core Power† (mW/MHz)    0.31       0.21       0.14       0.07       0.07       0.04
Area (mm²)                 0.78       0.64       0.74       0.38       0.37       0.25
Power (mW/MHz)             0.37       0.25       0.165      0.084      0.083      0.047

Core area, frequency range and power consumption are dependent on process, libraries and optimizations. The numbers quoted above are illustrative of synthesized cores using general purpose TSMC process technologies and ARM Artisan standard cell libraries and RAMs. Area numbers include the CM3Core, the Nested Vectored Interrupt Controller (NVIC) and Bus Matrix but not the optional components including the Memory Protection Unit, Embedded Trace Macrocell, Breakpoint Unit, Data Watchpoint Unit and Trace Port Interface Unit.

The speed optimized implementations refer to the library choices and synthesis flow decisions and tradeoffs made in order to achieve the target frequency performance. The area optimized implementations refer to the library choices and synthesis flow decisions and tradeoffs made in order to achieve a target area density.

* Worst-case conditions: 0.18µm process, 1.62V, 125°C, slow silicon; 0.13µm process, 1.08V, 125°C, slow silicon
† Typical-case conditions: 0.18µm process, 1.8V, 25°C, typical silicon; 0.13µm process, 1.2V, 25°C, typical silicon
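
The per-MHz figures in the performance table can be turned into rough absolute numbers by multiplying by the target frequency. The sketch below does this using the Frequency and Power (mW/MHz) rows together with the 1.25 DMIPS/MHz figure quoted elsewhere on this page. Treat the results as an estimate only: the table quotes frequency under worst-case conditions and power under typical-case conditions, and the dictionary keys and function name here are hypothetical.

```python
DMIPS_PER_MHZ = 1.25  # Dhrystone 2.1 figure quoted on this page

# (process, optimization) -> (frequency in MHz, power in mW/MHz),
# copied from the Frequency and Power rows of the performance table.
IMPLEMENTATIONS = {
    ("0.18um", "speed"): (100, 0.37),
    ("0.18um", "area"): (50, 0.25),
    ("0.13um", "speed"): (135, 0.165),
    ("0.13um", "area"): (50, 0.084),
    ("90nm", "speed"): (191, 0.083),
    ("90nm", "area"): (50, 0.047),
}


def estimate(process: str, optimization: str) -> dict:
    """Estimate throughput and power at the table's target frequency."""
    mhz, mw_per_mhz = IMPLEMENTATIONS[(process, optimization)]
    return {
        "dmips": DMIPS_PER_MHZ * mhz,
        "power_mw": mw_per_mhz * mhz,
    }


if __name__ == "__main__":
    for key in IMPLEMENTATIONS:
        result = estimate(*key)
        print(key, round(result["dmips"], 2), "DMIPS,",
              round(result["power_mw"], 3), "mW")
```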
