Sharma, R. (CSE) – Automatically Evolving GPU Libraries for Performance Portable AI Kernels

Virtual Event

GPUs are the workhorses of modern AI. They are widely deployed and are developed by many vendors, including Apple, Qualcomm, Intel, AMD, and NVIDIA. While these GPUs all offer high compute potential, programming them effectively is difficult: they differ in performance-critical features such as SIMT width, cache capacity, and memory bandwidth, so each demands different optimization strategies. Tunable kernels address […]

Last modified: Dec 11, 2025