Principal Machine Learning
Job Description:
Establish engineering standards, best practices, and evaluation frameworks for AI systems
Lead technical decision-making for model selection, system design, and deployment strategies
Act as the subject matter expert for agentic AI and modern LLM-based systems within the organization
Architect and deliver production-grade, multi-step AI agents capable of autonomous reasoning, tool orchestration, task decomposition, memory management, and human-in-the-loop escalation
Design and deliver AI systems on enterprise cloud platforms (e.g., AWS, Azure), including LLM services (Amazon Bedrock, Azure OpenAI)
Own the agent evaluation and observability stack, including benchmarking, tracing, regression testing, and performance monitoring
Optimize LLM inference costs and resource utilization for production workloads
Partner with business leaders to identify, prioritize, and shape AI-driven initiatives aligned with organizational goals
Translate complex business problems into scalable AI solutions with measurable impact
Drive roadmap planning and investment decisions related to AI and automation
Collaborate with IT, data engineering, and operations teams to integrate AI solutions into enterprise systems
Mentor and develop machine learning engineers and data scientists
Provide technical guidance and elevate team capabilities in modern AI practices
Ensure responsible and compliant use of AI systems, including managing risks related to model behavior, data usage, and regulatory considerations in a highly regulated industry
Lead evaluation and integration of external AI platforms and vendors, including assessment of cost, intellectual property, scalability, security, and long-term architectural impact
Requirements:
Master's degree (or higher) in Computer Science, Engineering, Statistics, or a related quantitative field
10+ years of hands-on experience in machine learning, AI, or related disciplines
2+ years of recent experience architecting and delivering LLM-based and agentic AI systems in production
Proven track record of delivering end-to-end AI solutions, from problem definition through production deployment
Strong programming skills in Python and experience with modern ML frameworks (e.g., PyTorch, TensorFlow)
Benefits:
Flexible work arrangements
Professional development opportunities