GenAI engineer

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Infojini, is seeking the following. Apply via Dice today!

Consultant Requirements – On-Prem LLM & Vector DB Implementation

Core Experience
- Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments
- Strong proficiency in Python for LLM inference, prompt engineering, and integration
- Experience with CPU-based inference, model quantization, and performance tuning

Vector Databases & RAG
- Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector
- Proven implementation of Retrieval-Augmented Generation (RAG) pipelines
- Experience generating and managing embeddings and metadata filtering

Security & Governance
- Understanding of data privacy, air-gapped deployments, and enterprise security requirements
- Experience implementing access controls and audit logging

Nice to Have
- Experience with LangChain or LlamaIndex
- Exposure to Rust, Go, or C++ for high-performance services
- Familiarity with Docker and Kubernetes for on-prem deployments
- Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
- Prior work in regulated or enterprise environments

Deliverables
- Reference architecture and deployment guidance
- Working prototype (LLM + vector DB + RAG)
- Documentation and knowledge transfer to internal teams