Software Developer/Engineer

Core Experience
- Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments (25%)
- Strong proficiency in Python for LLM inference, prompt engineering, and integration (25%)
- Experience with CPU-based inference, model quantization, and performance tuning (25%)

Vector Databases & RAG
- Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector (25%)
- Proven implementation of Retrieval-Augmented Generation (RAG) pipelines (25%)
- Experience generating and managing embeddings and metadata filtering (25%)

Security & Governance
- Understanding of data privacy, air-gapped deployments, and enterprise security requirements (25%)
- Experience implementing access controls and audit logging (25%)

Nice to Have
- Experience with LangChain or LlamaIndex
- Exposure to Rust, Go, or C for high-performance services
- Familiarity with Docker and Kubernetes for on-prem deployments
- Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
- Prior work in regulated or enterprise environments

Deliverables
- Reference architecture and deployment guidance
- Working prototype (LLM + vector DB + RAG)
- Documentation and knowledge transfer to internal teams

For immediate consideration please click APPLY.