High Performance Computing 1
How efficient is the BLAS?
level     routine   load/store    float ops   refs/ops
level 1   SAXPY     3N            2N          3/2
level 2   SGEMV     MN+N+2M       2MN         1/2
level 3   SGEMM     2MN+MK+KN     2MNK        2/N
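
The counts above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and it simply evaluates the load/store and float-op formulas as given. It assumes square operands (M = N = K) when comparing against the 1/2 and 2/N entries, which are leading-order values for large N.

```python
from fractions import Fraction


def saxpy_counts(n):
    """y <- alpha*x + y: load x and y, store y; one multiply and one add per element."""
    return 3 * n, 2 * n


def sgemv_counts(m, n):
    """y <- alpha*A*x + beta*y, with A of size M x N (counts taken from the table)."""
    return m * n + n + 2 * m, 2 * m * n


def sgemm_counts(m, n, k):
    """C <- alpha*A*B + beta*C, with A (M x K), B (K x N), C (M x N)."""
    return 2 * m * n + m * k + k * n, 2 * m * n * k


def refs_per_op(counts):
    """Exact ratio of memory references to floating-point operations."""
    refs, flops = counts
    return Fraction(refs, flops)


if __name__ == "__main__":
    # Square case M = N = K, an assumption made here for comparison with the table.
    for n in (10, 100, 1000):
        l1 = float(refs_per_op(saxpy_counts(n)))
        l2 = float(refs_per_op(sgemv_counts(n, n)))
        l3 = float(refs_per_op(sgemm_counts(n, n, n)))
        print(f"N={n:5d}  SAXPY={l1:.4f}  SGEMV={l2:.4f}  SGEMM={l3:.4f}")
```

The level 1 ratio stays at 3/2 for every N and the level 2 ratio approaches 1/2. Only the level 3 ratio shrinks as N grows, which is why SGEMM can reuse data held in cache or registers far more than the other two.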