kernel-lab/tasks/03_tiled_matmul/cuda_skeleton.cu
2026-04-10 13:15:06 +00:00

// Workbook-local CUDA sketch for tiled matmul.
//
// TODO(student):
// 1. Choose a block tile size, for example 16x16 or 32x32.
// 2. Load one A tile and one B tile into shared memory.
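// 3. Synchronize with __syncthreads() so every thread in the block sees the
//    fully loaded tiles.
// 4. Accumulate partial products from the two tiles into a per-thread sum.
// 5. Synchronize again before loading the next tile, so no thread overwrites
//    shared memory that another thread is still reading.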
// 6. Store the final C element or tile.
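//
// One possible way to fill in the steps above, offered as a minimal sketch
// rather than the reference solution. It assumes row-major float matrices
// (A is M x K, B is K x N, C is M x N), a 16x16 tile, and one C element per
// thread. The names TILE, tiled_matmul, and run_tiled_matmul are hypothetical
// and not part of the workbook. CUDA error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Step 1: tile size (assumed here; 32 is the other size the TODO suggests).
constexpr int TILE = 16;

// Computes C = A * B. Each block produces one TILE x TILE tile of C,
// and each thread produces one element of that tile.
__global__ void tiled_matmul(const float* A, const float* B, float* C,
                             int M, int N, int K) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;  // row of C owned by this thread
    int col = blockIdx.x * TILE + threadIdx.x;  // column of C owned by this thread
    float acc = 0.0f;

    int num_tiles = (K + TILE - 1) / TILE;
    for (int t = 0; t < num_tiles; ++t) {
        // Step 2: each thread loads one element of the A tile and one of the
        // B tile. Out-of-range positions are padded with zero.
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        As[threadIdx.y][threadIdx.x] =
            (row < M && a_col < K) ? A[row * K + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (b_row < K && col < N) ? B[b_row * N + col] : 0.0f;

        // Step 3: wait until both tiles are fully loaded.
        __syncthreads();

        // Step 4: accumulate this tile's partial products.
        for (int k = 0; k < TILE; ++k) {
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        }

        // Step 5: wait until every thread is done reading before the next load.
        __syncthreads();
    }

    // Step 6: store the final C element, guarding against partial edge tiles.
    if (row < M && col < N) {
        C[row * N + col] = acc;
    }
}

// Hypothetical host helper: copies inputs to the device, launches the kernel,
// and returns C as a host vector.
std::vector<float> run_tiled_matmul(const std::vector<float>& A,
                                    const std::vector<float>& B,
                                    int M, int N, int K) {
    assert(A.size() == static_cast<size_t>(M) * K);
    assert(B.size() == static_cast<size_t>(K) * N);

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, A.size() * sizeof(float));
    cudaMalloc(&dB, B.size() * sizeof(float));
    cudaMalloc(&dC, static_cast<size_t>(M) * N * sizeof(float));
    cudaMemcpy(dA, A.data(), A.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), B.size() * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (M + TILE - 1) / TILE);
    tiled_matmul<<<grid, block>>>(dA, dB, dC, M, N, K);

    // This blocking copy on the default stream runs after the kernel finishes.
    std::vector<float> C(static_cast<size_t>(M) * N);
    cudaMemcpy(C.data(), dC, C.size() * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return C;
}
```

// The zero padding in step 2 lets the same kernel handle sizes that are not
// multiples of TILE, at the cost of a few wasted multiply-adds on edge tiles.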