Full-Stack MLOps Systems Engineering Lead
- We are Nextira, now part of Accenture. We build cloud-based solutions and services with cutting-edge engineering skills, artificial intelligence (AI), machine learning (ML), and data analytics that enable clients to design, build, launch, and optimize high-performance computing environments.
- You are an experienced, highly motivated MLOps Engineering Lead looking to join our team in supporting our large-scale GPU-based AI training and research cluster hosted on cloud providers such as AWS, Azure, and GCP. The ideal candidate has expert knowledge of Linux at the kernel level and can configure and troubleshoot NVIDIA drivers and utilities, particularly on virtual machines running in the cloud.
- You will manage HPC (High Performance Computing) clusters, including schedulers such as Slurm, and compute nodes accelerated with NVIDIA GPUs. You'll help experienced ML engineers configure and manage their Conda environments, optimizing them for their specific AI training and research needs.
- As an MLOps Engineering Lead, you will design, deploy, and maintain cloud infrastructure with infrastructure-as-code (IaC) tools such as Terraform and AWS CDK.
- Minimum of 3 years of experience with at least three of the following: Python, Docker / Kubernetes, C