GPU Kernel Engineer
CUDA Engineer

Role:
We’re looking for a GPU Kernel Engineer to work on high-performance compute infrastructure for advanced AI workloads. This role focuses on extracting maximum efficiency from modern GPU systems, improving communication and execution performance across distributed environments, and contributing to low-level optimization efforts within a fast-paced engineering team.

Responsibilities:
- Write and optimize custom CUDA kernels from scratch for performance-critical workloads
- Develop and optimize GPU-accelerated compute pipelines for performance and scalability
- Improve inter-device communication efficiency and distributed execution workflows
- Analyze and resolve bottlenecks related to memory usage, latency, and throughput
- Collaborate closely with systems, compiler, and infrastructure engineers on performance-critical components

Experience:
- 2+ years of experience working with CUDA and GPU performance optimization
- Strong understanding of GPU architecture, memory systems, and parallel execution models
- Experience with distributed or multi-GPU systems and synchronization concepts
- Familiarity with Triton or similar GPU programming frameworks is a plus