Sharma, R. (CSE) – Automatically Evolving GPU Libraries for Performance Portable AI Kernels
Virtual Event
GPUs are the workhorses of modern AI, developed by many vendors, including Apple, Qualcomm, Intel, AMD, and NVIDIA, and widely deployed. While these GPUs all offer high compute potential, programming them effectively is difficult: they differ in performance-critical features such as SIMT width, cache capacity, and memory bandwidth, and these differences demand different optimization strategies. Tunable kernels address […]