•Expected Performance
–For dyad ai = bi*ci or ai = bi+ci -- needs 2 loads and 1 store per result, i.e. 6 memory references per cycle to feed 2 FPUs -- only 4 are available:
•(2*77)*(4/6) = 102.7 MFLOPS
–For linked triad
•ai = bi + s*ci (2 loads, 1 store)
•(4*77)*(4/6) = 205.3 MFLOPS
–For vector triad
•ai = bi + ci*di (3 loads, 1 store)
•(4*77)*(4/8) = 154 MFLOPS
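
The three estimates above follow one pattern: peak rate scaled by the fraction of required memory references the machine can supply. The sketch below reproduces that arithmetic. It assumes that 77 is the clock rate in MHz, that there are 2 FPUs and 4 memory references available per cycle, and that the dyad issues 2 flops per cycle (one operation per FPU) while both triads issue 4 (one multiply-add per FPU). The function name and parameter names are hypothetical, chosen only for illustration.

```python
def expected_mflops(flops_per_cycle, loads, stores,
                    clock_mhz=77.0, fpus=2, mem_refs_per_cycle=4):
    """Bandwidth-limited estimate in MFLOPS (all defaults are assumptions)."""
    # Peak rate if memory were never the bottleneck.
    peak = flops_per_cycle * clock_mhz
    # Memory references needed per cycle to keep every FPU busy.
    needed = fpus * (loads + stores)
    # Scale the peak by the fraction of references that can be served,
    # capped at 1.0 when memory bandwidth is sufficient.
    return peak * min(1.0, mem_refs_per_cycle / needed)


dyad = expected_mflops(flops_per_cycle=2, loads=2, stores=1)
linked_triad = expected_mflops(flops_per_cycle=4, loads=2, stores=1)
vector_triad = expected_mflops(flops_per_cycle=4, loads=3, stores=1)

for name, value in [("dyad", dyad),
                    ("linked triad", linked_triad),
                    ("vector triad", vector_triad)]:
    print(f"{name}: {value:.1f} MFLOPS")
```

The linked triad comes out ahead of the vector triad because the scalar s stays in a register, so it needs one fewer load per result.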