AI Inference Engineer
- Work on large-scale deployment of machine learning models for real-time inference
- Explore novel research and implement LLM inference optimizations
- Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX)
- Familiarity with common LLM architectures and inference optimization techniques (e.g., continuous batching, quantization)
- (Optional) Understanding of GPU architectures or experience with GPU kernel programming in CUDA