Senior AI Engineer
Senior AI Engineer – ML & Generative AI

Role Overview

We are seeking a hands-on Senior AI Engineer with a strong foundation in traditional Machine Learning and practical, real-world experience building and deploying LLM- and GenAI-driven systems. This role focuses on designing, engineering, and hardening production-grade AI solutions that are embedded into business workflows, not research prototypes.

You will work in small, high-impact delivery teams (2–3 engineers per initiative) and spend the majority of your time (~70–75%) building systems end to end, while also contributing to solution design, technical decision-making, and cross-functional collaboration.

Key Responsibilities

AI Solution Design & Problem Solving

- Partner with business and product stakeholders to translate real-world problems into practical AI solutions.
- Determine when to apply:
  - Traditional ML approaches (classification, regression, clustering, recommendation systems)
  - LLM / GenAI approaches, including agentic workflows
- Evaluate and communicate trade-offs across accuracy, cost, latency, scalability, and operational complexity.
- Design iterative AI workflows and propose alternative solution approaches where applicable.

Hands-on Engineering & Delivery (70–75%)

- Build and own end-to-end AI systems, including:
  - Data ingestion and processing pipelines
  - Feature engineering and prompt construction
  - ML and LLM integration and orchestration
  - API-based AI services for downstream consumption
- Deploy and harden production AI systems with:
  - Error handling and fallback mechanisms
  - Guardrails, safety controls, and exception handling
  - Observability (logging, metrics, tracing, dashboards)
- Ensure production readiness through:
  - Performance tuning and latency optimization
  - Cost management and optimization strategies
  - Scalability and reliability planning
- Implement AI system controls such as:
  - Input validation and prompt injection mitigation
  - Configurable policies and kill switches
- Transition PoCs into production-grade systems through refactoring, testing, and system hardening.

ML & Generative AI Expertise

- Apply strong fundamentals in traditional ML, including supervised and unsupervised learning techniques.
- Build and deploy GenAI solutions, with experience across at least one or two real-world LLM implementations.
- Work with modern LLMs (e.g., OpenAI, Claude, Gemini, Llama, or equivalent models).
- Design and implement RAG (Retrieval-Augmented Generation) architectures.
- Apply prompt engineering, evaluation techniques, and iterative optimization.
- Build and evolve tool-based and agentic workflows, including multi-agent systems.
- Use agent orchestration frameworks (e.g., LangChain, LangGraph, or equivalent custom systems).

Collaboration & Technical Leadership (25–30%)

- Act as a senior technical contributor within small delivery teams.
- Debug complex AI system behavior and production issues beyond prompt-level tuning.
- Contribute to architectural and design decisions alongside architects and platform teams.
- Collaborate closely with:
  - Product managers and business stakeholders
  - Platform, cloud, and infrastructure teams
- Uphold strong software engineering practices and delivery discipline.

Required Skills & Experience

Software & Systems Engineering

- 10-12 years of overall software engineering experience, including prior work as an ML Engineer or equivalent.
- Strong backend development skills (Python, Java, Node.js, or similar languages).
- Experience designing and building REST or gRPC-based services.
- Solid understanding of distributed system design.
- Containerization and orchestration experience (Docker, Kubernetes).

AI / ML

- Hands-on experience across traditional ML and modern GenAI systems.
- Proficiency with ML frameworks such as scikit-learn, PyTorch, TensorFlow, or equivalents.
- Experience building or deploying:
  - ML-driven production systems
  - LLM-based applications
- Ability to select ML vs. LLM-driven approaches based on business and operational constraints.

Cloud & DevOps

- Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP).
- Experience with CI/CD pipelines and deployment automation.
- Understanding of model, code, and configuration versioning best practices.

Observability & Production Readiness

- Experience implementing logging, monitoring, and tracing for production systems.
- Familiarity with system resilience patterns such as:
  - Rate limiting
  - Failover strategies
  - Kill-switch mechanisms

Problem Solving & Mindset

- Strong ability to solve ambiguous, real-world engineering problems.
- Comfortable working in fast-moving, iterative environments.
- Ownership mindset with a bias toward practical, scalable solutions.

Communication & Collaboration

- Experience working in cross-functional teams.
- Ability to clearly articulate technical and business trade-offs, including:
  - LLM vs traditional ML
  - Build vs buy decisions
  - Speed vs robustness

Good to Have

- Experience with enterprise AI platforms or internal AI frameworks.
- Prior production experience with:
  - Agentic architectures
  - Multi-agent systems
  - RAG-based systems at scale
- Exposure to AI governance, safety, and compliance considerations.
- Experience mentoring junior engineers or owning technical modules.
- Hands-on experience optimizing performance and cost for AI workloads.