Software AI Engineer - US
Job Title: Software AI Engineer/Architect
Location: Santa Clara, CA (onsite preferred, but remote candidates can be considered)
Experience: 8-10 years
Job Type: Contract / FTE

This role requires a deep, end-to-end understanding of how Large Language Models are built, trained, optimized, deployed, and operated.

Candidates must demonstrate hands-on experience beyond consuming hosted LLM APIs, with a strong grasp of the underlying ML theory, system trade-offs, and production realities of AI/ML solutions.

Mandatory Competency Areas (Non-Negotiable)

Foundations of LLMs (How They Actually Work)
Candidates must demonstrate first-principles understanding, including:
- Transformer architectures (attention, embeddings, positional encoding)
- Tokenization strategies and their impact on cost and performance
- Training vs inference behavior
- Loss functions, pre-training objectives, and alignment techniques (SFT, RLHF)
- Limitations: hallucinations, bias, context collapse, long-range degradation

Model Development & Adaptation
Hands-on experience with:
- Pre-training vs fine-tuning trade-offs
- Parameter-efficient tuning (LoRA, QLoRA, adapters)
- Quantization and pruning techniques
- Model evaluation beyond accuracy (task fitness, safety, robustness)
- Data curation, labeling strategies, and contamination risks

Inference, Serving & Optimization
Strong understanding of:
- Inference pipelines and token generation mechanics
- KV caching, batching, streaming responses
- Throughput vs latency trade-offs
- Memory constraints and GPU utilization strategies
- Model parallelism (tensor, pipeline) and its failure modes

End-to-End AI/ML System Design
Ability to architect complete AI solutions, including:
- Data ingestion and preprocessing pipelines
- Training / fine-tuning workflows
- Model registry, versioning, and lineage
- Deployment strategies (canary, A/B, shadow traffic)
- Feedback loops for continuous improvement

Retrieval, Memory & Tool-Augmented Systems
In-depth experience with:
- Retrieval-Augmented Generation (RAG) design
- Embeddings lifecycle management
- Vector databases and hybrid retrieval
- Prompt/tool orchestration and agentic workflows
- Failure modes of RAG and mitigation strategies

MLOps, Observability & Reliability
Strong ownership mindset for production AI:
- Monitoring model quality drift and regressions
- Debugging hallucinations and retrieval failures
- Logging prompts, responses, and model metadata
- Cost tracking and optimization (token economics)
- Incident response for AI systems

Security, Ethics & Governance
Clear understanding of:
- Prompt injection and data leakage risks
- Training data privacy and IP protection
- Model abuse, misuse, and guardrails
- Regulatory and compliance considerations
- Responsible AI principles in production systems