This course explains the NEON SIMD instructions and the new C operators so that participants can vectorize, code, and optimize DSP-like algorithms. Participants gain a detailed understanding of the instructions and of the CPU mechanisms that affect performance: caches, the MMU, and branch accelerators. Numerous labs, written in C, help participants become familiar with this rich instruction set. Differences between the AArch32 and AArch64 NEON implementations are highlighted.
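As an illustration of the kind of vectorization covered, here is a minimal sketch of a dot product written with NEON C intrinsics from `arm_neon.h`. It is not taken from the course labs: the function name `dot_f32` is hypothetical, and the scalar fallback is an assumption added so that the code also builds on targets without NEON.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: dot product of two float arrays of length n. */
static float dot_f32(const float *a, const float *b, size_t n)
{
    size_t i = 0;
    float sum = 0.0f;

#if HAVE_NEON
    /* Four partial sums, one per 32-bit lane of a 128-bit register. */
    float32x4_t acc = vdupq_n_f32(0.0f);

    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);   /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);   /* load 4 floats from b */
        acc = vmlaq_f32(acc, va, vb);        /* acc += va * vb, lane-wise */
    }

    /* Horizontal reduction using intrinsics available on both
       AArch32 and AArch64. */
    float32x2_t half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    half = vpadd_f32(half, half);
    sum = vget_lane_f32(half, 0);
#endif

    /* Scalar tail (or the whole loop when NEON is unavailable). */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

A call such as `dot_f32(x, y, len)` processes four elements per iteration on NEON targets and falls back to the scalar loop for the remaining elements. On AArch64 the horizontal reduction could instead use `vaddvq_f32`, which is not available on AArch32; the sketch sticks to the common subset.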
Duration & Attendance
- 2 days
- Min/max number of participants: 3-15
Engineers and technicians who are involved in algorithm vectorization, NEON coding and performance optimization.
|Day 1|Day 2|
|---|---|
|INTRODUCTION TO NEON / VFPv3 (2 hours)|CODING EXAMPLES (4 hours)|
|INSTRUCTION SET (5 hours)|PERFORMANCE OPTIMIZATION (3 hours)|
The detailed course program is available upon request. For on-site training, we can provide a customized program specifically tailored for your audience, needs, and schedule. Contact us to discuss this option.
Complementary Products & Services
PLDA has also developed an optimized FFT that uses NEON SIMD instructions.