• Number of slow memory references on unblocked matrix multiply
• m = n^3 read each column of B n times
• + n^2 read row i of A once for each i
• + 2*n^2 read and write each element of C once
• = n^3 + 3*n^2
• So q = f/m = (2*n^3)/(n^3 + 3*n^2)
• ~= 2 for large n, no improvement over matrix-vector multiply
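
The count above can be checked with a small sketch that tallies traffic instead of doing arithmetic. It assumes the usual i-j-k loop order, with row i of A held in fast memory across the j loop and column j of B re-read for every (i, j) pair. The function name `unblocked_matmul_traffic` is hypothetical, chosen only for this illustration.

```python
def unblocked_matmul_traffic(n):
    """Return (m, f): slow-memory words moved and flops for naive n x n matmul.

    Assumed model (not from the slide text): i-j-k loop order, row i of A
    stays in fast memory for the whole j loop, nothing else is reused.
    """
    m = 0  # slow memory references
    f = 0  # floating point operations
    for i in range(n):
        m += n          # read row i of A once
        for j in range(n):
            m += 1      # read C(i,j)
            m += n      # read column j of B
            f += 2 * n  # n multiplies and n adds for the inner product
            m += 1      # write C(i,j)
    return m, f


for n in (10, 100, 1000):
    m, f = unblocked_matmul_traffic(n)
    # The tally should match the closed forms m = n^3 + 3*n^2 and f = 2*n^3.
    assert m == n**3 + 3 * n**2
    assert f == 2 * n**3
    print(n, m, f, f / m)
```

The printed ratio f/m stays below 2 and approaches it as n grows, which is the point of the last bullet: the n^3 term from re-reading B dominates m.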