Cortex-A9 NEON™ Media Processing Engine

Introduction

The Cortex-A9 NEON MPE extends the Cortex-A9 functionality to provide support for the ARMv7 Advanced SIMD and Vector Floating-Point v3 (VFPv3) instruction sets. The Cortex-A9 NEON MPE supports all addressing modes and data-processing operations described in the ARM Architecture Reference Manual.

The Cortex-A9 NEON MPE features are:
- SIMD and scalar single-precision floating-point computation
- scalar double-precision floating-point computation
- SIMD and scalar half-precision floating-point conversion
- 8, 16, 32, and 64-bit signed and unsigned integer SIMD computation
- 8 or 16-bit polynomial computation for single-bit coefficients
- structured data load capabilities
- dual issue with Cortex-A9 processor ARM or Thumb instructions
- independent pipelines for VFPv3 and Advanced SIMD instructions
- large, shared register file, addressable as:
  - thirty-two 32-bit S (single) registers
  - thirty-two 64-bit D (double) registers
  - sixteen 128-bit Q (quad) registers.
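
The three register views listed above overlap rather than occupy separate storage. As a rough illustration only, the following portable C union models that overlap on the host machine; the type name `neon_regfile_model` and both helper functions are hypothetical, and the sketch assumes the usual ARMv7 mapping in which Dn holds S(2n) and S(2n+1) for n < 16, and Qn holds D(2n) and D(2n+1).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side model of the shared register file:
 * 256 bytes of storage seen through three different views. */
typedef union {
    uint32_t s[32];      /* S0-S31: cover only the low 128 bytes (D0-D15) */
    uint64_t d[32];      /* D0-D31: cover all 256 bytes */
    uint8_t  q[16][16];  /* Q0-Q15: each Qn spans D(2n) and D(2n+1) */
} neon_regfile_model;

/* Returns 1 when S(2n) and S(2n+1) occupy exactly the bytes of Dn. */
static int s_pair_overlays_d(const neon_regfile_model *r, unsigned n)
{
    assert(n < 16); /* only D0-D15 have S aliases */
    return memcmp(&r->s[2 * n], &r->d[n], sizeof r->d[n]) == 0;
}

/* Returns 1 when D(2n) and D(2n+1) occupy exactly the bytes of Qn. */
static int d_pair_overlays_q(const neon_regfile_model *r, unsigned n)
{
    assert(n < 16);
    return memcmp(&r->d[2 * n], r->q[n], sizeof r->q[n]) == 0;
}
```

One practical consequence is that writing a D or Q register also changes the S or D registers that alias it, which matters when allocating registers by hand.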
The Cortex-A9 NEON MPE provides high-performance SIMD vector operations for:
- unsigned and signed integers
- single bit coefficient polynomials
- single-precision floating-point values.
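
To make "single bit coefficient polynomials" concrete: each bit of an operand is one coefficient, and addition of coefficients is XOR, so multiplication has no carries. The sketch below is a scalar C model of one 8-bit lane producing the full 16-bit product (my understanding is that this corresponds to what a widening polynomial multiply such as `VMULL.P8` computes per lane); the function name `poly_mul8` is made up for illustration.

```c
#include <stdint.h>

/* Carry-less multiply of two 8-bit polynomials with single-bit
 * coefficients. Partial products are combined with XOR, not ADD. */
static uint16_t poly_mul8(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (b & (1u << bit)) {
            /* Add (XOR) a shifted copy of a for every set bit of b. */
            result ^= (uint16_t)((uint16_t)a << bit);
        }
    }
    return result;
}
```

For example, `poly_mul8(0x03, 0x03)` is (x + 1)(x + 1) = x^2 + 1, that is `0x05`, where an ordinary integer multiply would give 9.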
The operations include:
- addition and subtraction
- multiplication with optional accumulation
- maximum or minimum value driven lane selection operations
- inverse square-root approximation
- comprehensive data-structure load instructions, including register-bank-resident table lookup.
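
The inverse square-root approximation is normally used as a starting point that software then refines. The following scalar C sketch shows only the refinement idea (a Newton-Raphson step), assuming the hardware estimate is supplied by the caller; `rsqrt_step` and `rsqrt_refined` are hypothetical names, and the step formula `x * (3 - d*x*x) / 2` is the one I understand the Advanced SIMD step instruction (`VRSQRTS`) to be built around.

```c
/* One Newton-Raphson step towards 1/sqrt(d), starting from estimate x. */
static float rsqrt_step(float d, float x)
{
    return x * (3.0f - d * x * x) * 0.5f;
}

/* Applies the step a fixed number of times. The initial estimate
 * stands in for the low-precision value the hardware would return. */
static float rsqrt_refined(float d, float estimate, int steps)
{
    float x = estimate;
    for (int i = 0; i < steps; ++i) {
        x = rsqrt_step(d, x);
    }
    return x;
}
```

Each step roughly doubles the number of correct bits, so a coarse estimate plus one or two steps is often enough for graphics-style workloads; how many steps are needed depends on the accuracy the application requires.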

VFPv3

The Cortex-A9 NEON MPE hardware supports single and double-precision add, subtract, multiply, divide, multiply and accumulate, and square root operations as described in the ARM VFPv3 architecture. It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats and ARM integer word formats, with special operations to perform conversions in round-towards-zero mode for high-level language support.
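
The round-towards-zero conversions exist because languages such as C define float-to-integer conversion as truncation, regardless of the current rounding mode. A minimal C sketch of that language rule follows; the function name is hypothetical, and the range check is there because an out-of-range cast is undefined behaviour in C (it says nothing about what the hardware does in that case).

```c
#include <assert.h>
#include <stdint.h>

/* Converts a float to a 32-bit integer the way a C cast does:
 * the fractional part is discarded, so the result moves towards zero. */
static int32_t to_int_truncating(float x)
{
    /* Reject NaN and values that do not fit in int32_t. */
    assert(x >= -2147483648.0f && x < 2147483648.0f);
    return (int32_t)x;
}
```

So 2.7 becomes 2 and -2.7 becomes -2, not -3. A compiler targeting VFPv3 can implement this cast with a single conversion instruction instead of changing the rounding mode around it.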

ARMv7 deprecates the use of VFP vector mode. The Cortex-A9 NEON MPE hardware does not support VFP vector operations. In this manual, the term vector refers to Advanced SIMD integer, polynomial and single-precision vector operations. The Cortex-A9 NEON MPE provides high-speed VFP operation without support code. However, if an application requires VFP vector operation, then it must use support code. See the ARM Architecture Reference Manual for information on VFP vector operation support.
Note: the support code mentioned here means adapting the boot code (assembly) to the VFP-specific architecture so that it handles the VFP-specific instructions and exceptions. For details, see VFP Support Code.

Supported formats

Table 2-1 shows the formats supported for each of the Advanced SIMD and VFPv3 instruction sets implemented by the Cortex-A9 NEON MPE. All signed integers are two's complement representations.

Writing optimal VFP and Advanced SIMD code

The following guidelines can provide significant performance increases for VFP and Advanced SIMD code:
Where possible avoid:
- unnecessary accesses to the VFP control registers
- transferring values between the Cortex-A9 core registers and the VFP or Advanced SIMD register file (see the ARM Architecture Reference Manual for the definition of core registers)
- register dependencies between neighboring instructions
- mixing Advanced SIMD only instructions with VFP only instructions.
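
The advice on register dependencies can be illustrated at source level. In the sketch below, written in plain C with hypothetical function names, the first loop feeds every addition with the result of the previous one, while the second keeps four independent running sums so that consecutive operations do not wait on each other. Whether this helps in practice depends on the compiler and on the instruction timings, so treat it as an illustration of the idea, not a measured optimization.

```c
#include <assert.h>
#include <stddef.h>

/* Every iteration depends on the previous value of acc. */
static float sum_dependent(const float *v, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += v[i];
    }
    return acc;
}

/* Four accumulators form four independent dependency chains.
 * This sketch assumes n is a multiple of 4 to stay short. */
static float sum_independent(const float *v, size_t n)
{
    assert(n % 4 == 0);
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    for (size_t i = 0; i < n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    return (a0 + a1) + (a2 + a3);
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits for general inputs; they agree exactly when all intermediate sums are representable.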
Be aware that:
- with the exception of simultaneous loads and stores, the processor can execute VFP and Advanced SIMD instructions in parallel with ARM or Thumb instructions
- using Advanced SIMD value selection operations is more efficient than using the equivalent VFP compare with conditional execution.
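
The last point, value selection instead of compare plus conditional execution, can be sketched in portable C. The loop below builds an all-ones or all-zeros mask per lane from a comparison and then picks bits from one input or the other, which is the pattern I would expect a compare such as `VCGT` followed by a bitwise select such as `VBSL` to implement; the function name `select_max_u32` and the choice of 32-bit unsigned lanes are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Lane-wise maximum without branches: compare, then bitwise select. */
static void select_max_u32(const uint32_t *a, const uint32_t *b,
                           uint32_t *out, size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i) {
        /* 0xFFFFFFFF where a[i] > b[i], otherwise 0. */
        uint32_t mask = (uint32_t)0 - (uint32_t)(a[i] > b[i]);
        /* Take a[i] where the mask is set, b[i] elsewhere. */
        out[i] = (a[i] & mask) | (b[i] & ~mask);
    }
}
```

The same mask can select between any two values, not only the compared ones, which is what makes the pattern a general replacement for per-element conditionals.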

Instruction timing tables

The timing tables are extensive and are not reproduced here; see the Cortex™-A9 NEON™ Media Processing Engine Technical Reference Manual.

Original article: https://www.cnblogs.com/batianhu/p/11059126.html