Loop Unrolling And SIMD Operation

Technical approach

Your task is to combine loop unrolling and SIMD operations with the FAXPY operation.

Some notes:
  1. Do this lab at the C level. The C-level solution is sufficient for full credit. Once you finish, if you want to challenge yourself, do the lab at the x86 level.
  2. Note the EUs/execution engine of the Broadwell processor, which should limit the number of times you can effectively unroll the loop. This lab should use SSE/SSE2 operations, which further restrict the pipelines used by our FAXPY operation.
  3. Use only SSE/SSE2 SIMD operations; do not use AVX.
  4. There is no need to do run-time checking for SSE/SSE2, which are considered monolithic. In the run-time/compile-time checking lab we needed something to fall back on if AVX wasn't supported by the processor. Again, there is no need for that here.
  5. Do not forget the -O0 flag for gcc. Optimization levels -O1 through -O3 may unroll the loop for you or insert other optimizations, and we need to prevent unintentional optimizations.
  6. An older version of this lab manual had you do IAXPY, but doing this with SSE/SSE2 isn't straightforward. Please do FAXPY instead.
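As a rough illustration of the approach described above, here is a minimal C sketch of FAXPY (assumed here to mean y[i] = a * x[i] + y[i] on single-precision floats) using SSE intrinsics with the loop unrolled by two. The function names, the unroll factor of two, and the use of unaligned loads are assumptions made for this example, not requirements stated in the lab; the best unroll factor depends on the execution ports of the target processor.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, _mm_add_ps, ... */

/* Scalar reference version: y[i] = a * x[i] + y[i].
 * (Hypothetical helper name, used here only for comparison.) */
void faxpy_scalar(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* SSE version, unrolled by two: each iteration handles 2 vectors of
 * 4 floats, i.e. 8 elements. The unroll factor is an assumption. */
void faxpy_sse_unrolled(size_t n, float a, const float *x, float *y)
{
    const __m128 va = _mm_set1_ps(a);  /* broadcast a into all 4 lanes */
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        /* Unaligned loads, so x and y need no special alignment. */
        __m128 x0 = _mm_loadu_ps(x + i);
        __m128 x1 = _mm_loadu_ps(x + i + 4);
        __m128 y0 = _mm_loadu_ps(y + i);
        __m128 y1 = _mm_loadu_ps(y + i + 4);

        /* SSE has no fused multiply-add, so multiply then add. */
        y0 = _mm_add_ps(_mm_mul_ps(va, x0), y0);
        y1 = _mm_add_ps(_mm_mul_ps(va, x1), y1);

        _mm_storeu_ps(y + i, y0);
        _mm_storeu_ps(y + i + 4, y1);
    }

    /* Scalar cleanup for the remaining n % 8 elements. */
    for (; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

On x86-64, gcc enables SSE and SSE2 by default, so something like `gcc -O0 faxpy.c` should be enough (the file name is hypothetical). The two independent multiply/add chains in the loop body are what give the processor a chance to overlap work across its execution units; unrolling further only helps while there are free pipelines to fill.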
