High Performance Computing 1
Static Load Balancing
•All information is available in advance
•Common cases:
–dense matrix algorithms, e.g., LU factorization
•done using blocked/cyclic layout
•blocked for locality, cyclic for load balancing
–computations on a regular mesh, e.g., FFT
•done using cyclic+transpose+blocked layout for 1D
–sparse-matrix-vector multiplication
•use graph partitioning, where the graph does not change over time
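The "blocked for locality, cyclic for load balancing" trade-off for dense LU can be illustrated with a minimal sketch. It assumes a 1D column distribution over p processes, and the helper names (blocked_owner, cyclic_owner, block_cyclic_owner, active_counts) as well as the sizes n, p, b, k are hypothetical choices made for this illustration, not taken from any library. At elimination step k only columns k..n-1 are still active, so the sketch counts how many active columns each process owns under each layout.

```python
def blocked_owner(i, n, p):
    """Owner of index i when n indices are split into p contiguous chunks."""
    chunk = -(-n // p)  # ceil(n / p)
    return i // chunk


def cyclic_owner(i, p):
    """Owner of index i when indices are dealt out round-robin."""
    return i % p


def block_cyclic_owner(i, b, p):
    """Owner of index i when blocks of size b are dealt out round-robin."""
    return (i // b) % p


def active_counts(owner, n, p, k):
    """Per-process count of columns j >= k, i.e. those still active at step k."""
    counts = [0] * p
    for j in range(k, n):
        counts[owner(j)] += 1
    return counts


# Illustrative sizes (assumptions): 16 columns, 4 processes, block size 2,
# examined halfway through the factorization.
n, p, b, k = 16, 4, 2, 8

blocked = active_counts(lambda j: blocked_owner(j, n, p), n, p, k)
cyclic = active_counts(lambda j: cyclic_owner(j, p), n, p, k)
block_cyclic = active_counts(lambda j: block_cyclic_owner(j, b, p), n, p, k)

print("blocked:     ", blocked)
print("cyclic:      ", cyclic)
print("block-cyclic:", block_cyclic)
```

With these sizes the blocked layout gives [0, 0, 4, 4], so two of the four processes have no work left, while the cyclic and block-cyclic layouts both give [2, 2, 2, 2]. The block-cyclic variant keeps b adjacent columns on one process, which is the usual way to retain some locality without giving up the balance.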