Lead AI Engineer with Retirement & Wealth Domain (Windsor)
Role: Lead AI Engineer with Retirement & Wealth Domain
Location: Boston, MA or Windsor, CT
Key Skill: AI, LLM, API, MLOps, Retirement & Wealth Domain
Experience: 10+ years
Mode of Hire: Full Time

Responsibilities:
• Architecture & Technical Design
• Hands-On Engineering
• MLOps & Production Reliability
• Technical Leadership

Experience
• 10+ years of progressive software engineering experience with sustained hands-on contributions (aligned with Citi C14/SVP benchmark for this level).
• 3+ years of dedicated experience building LLM-based systems and agentic architectures in production environments (not research or notebook work).
• Proven success architecting and delivering multiple enterprise-scale AI solutions into production; can speak to architecture decisions, failure modes encountered, and how systems were improved post-launch.
• Prior lead or staff-level role: set technical direction, owned critical systems end-to-end, influenced engineering practices across a team.
• Experience delivering AI systems in a regulated environment (financial services, healthcare, or similar) with compliance, audit trail, and governance requirements.

Programming & Core Engineering
• Rust (required, expert level): production systems development including memory safety, async programming with Tokio, error handling patterns, trait design, and testing; used for performance-critical AI service layers, data pipelines, and backend infrastructure.
• TypeScript / Node.js (required): production API services, async/await patterns, type-safe API contracts, and React-based front-end interfaces for advisor and participant-facing tools; full-stack TypeScript capability is expected, not optional.
• Solana / Solana programs (required): smart contract development using Anchor or native Solana program model; familiarity with Solana's account model, transaction structure, and program-derived addresses (PDAs) as they apply to on-chain financial data and tokenized retirement or investment products.
• Software engineering fundamentals: system design, CI/CD pipeline ownership, testing strategy (unit, integration, contract, eval), resiliency patterns, security practices for AI services, and operational stability.
• API development: RESTful and event-driven API design using TypeScript/Node.js or Rust (Axum, Actix, or equivalent); authentication, rate limiting, versioning, and API contracts for AI services consumed by downstream systems.
• Data engineering: complex SQL proficiency; data pipeline construction in Rust or TypeScript (dbt, Airflow, Prefect, or equivalent); working with structured financial data at scale; experience with Snowflake, Spark, or similar.
• Front-end capability: React with TypeScript to build production-quality interfaces for advisor and participant-facing AI tools; this is not a specialization, but full ownership of the UI layer is expected.
• Databases: vector databases (Pinecone, Weaviate, pgvector, OpenSearch); relational (PostgreSQL, SQL Server); document (MongoDB); caching (Redis).

LLM & Generative AI Engineering (Required)
• Production LLM integration: hands-on experience with OpenAI GPT-4o, Anthropic Claude, Google Gemini/Gemma, and/or AWS Bedrock in user-facing production applications (not just API experimentation).
• RAG system design and implementation: vector store selection and configuration, chunking and embedding strategies, hybrid search, re-ranking, and rigorous evaluation (RAGAS, custom eval frameworks, or equivalent).
• Prompt engineering at an engineering level: system prompt design for financial services safety constraints, few-shot construction, structured output extraction (JSON/XML), prompt version control, and regression testing.
• Agentic AI architecture: tool use and function calling; multi-step reasoning chains; agent orchestration frameworks (LangGraph, LangChain, Google ADK, AutoGen, CrewAI, or custom implementations); MCP (Model Context Protocol) server design and integration for financial data sources.
• LLM evaluation: building eval suites for correctness, hallucination, instruction-following, and task-specific quality; LLM-as-judge patterns; adversarial robustness testing for financial advice contexts.
• Output validation and safety layers: guardrails, output parsers, confidence scoring, fallback logic, and human-in-the-loop escalation patterns for production AI systems handling regulated financial outputs.
• ML frameworks: working knowledge of TensorFlow and PyTorch, sufficient to fine-tune, evaluate, and integrate transformer-based models; not required to build from scratch but must understand model mechanics to make architecture decisions.

Cloud, Infrastructure & MLOps
• Cloud platforms: production experience on AWS, Azure, or GCP: AI/ML services (SageMaker, Azure ML, Vertex AI), serverless compute, managed databases, and storage.
• Containerization and orchestration: Docker (required); Kubernetes working knowledge; experience deploying AI inference services in containerized environments with auto-scaling.
• MLOps: experiment tracking (MLflow, Weights & Biases, or equivalent); model versioning; deployment pipelines for AI systems; CI/CD for model updates with automated quality gates.
• Observability: logging, tracing, and metrics for AI services (Datadog, CloudWatch, OpenTelemetry, or equivalent); building dashboards and alerts for model quality, hallucination rates, and system health.