V&V Engineer - AI-Driven Testing & Validation

Plano, TX

Key Responsibilities:
- Lead end-to-end quality engineering for enterprise AI applications, including LLM-powered products, RAG pipelines, and agentic workflows
- Design and execute prompt validation strategies, evaluating LLM responses for accuracy, semantic relevance, hallucination risk, and safety compliance
- Build automated evaluation pipelines for AI model outputs using metrics such as BLEU, ROUGE, embedding-based similarity, precision, recall, and F1-score
- Validate agentic systems (tool use, multi-step reasoning, planner-executor workflows) for correctness, determinism, and failure mode handling
- Architect and maintain Python-based automation frameworks for AI/ML model evaluation, regression testing, and continuous model quality monitoring
- Integrate AI testing into CI/CD pipelines for automated evaluation of model updates, prompt changes, and dataset revisions
- Develop reusable test harnesses for prompt regression, golden-set evaluation, A/B comparison, and human-in-the-loop workflows
- Perform AI data validation using EDA, schema validation, and cross-validation techniques
- Conduct bias detection and fairness analysis to ensure responsible AI outcomes
- Drive model robustness testing including adversarial inputs, distribution shifts, and edge-case stress testing
- Establish regression testing standards for retraining and fine-tuning cycles
- Collaborate with engineering teams to validate AI solutions and define quality KPIs and acceptance criteria
- Mentor QA engineers on AI evaluation methodologies and automation practices
- Promote responsible AI practices including safety, transparency, and compliance

Required Qualifications:
- 10+ years of experience in Quality Engineering and Test Automation
- Strong experience validating AI/ML systems, including Generative AI and LLM-based applications
- Proficiency in Python and building automation frameworks
- Experience with prompt validation, agentic workflow testing, and AI model evaluation
- Knowledge of evaluation metrics such as BLEU, ROUGE, embedding similarity, precision, recall, and F1-score
- Experience with TensorFlow, PyTorch, LangChain, LangGraph, and LlamaIndex
- Strong understanding of data validation techniques including EDA, schema validation, cross-validation, and statistical analysis
- Experience integrating automated testing into CI/CD pipelines (e.g., GitHub Actions, Jenkins, GitLab CI, Azure DevOps)
- Familiarity with bias detection, fairness assessment, and AI safety evaluation

Preferred Qualifications:
- Experience with vector databases and RAG pipelines
- Familiarity with MLOps tools such as MLflow, Weights & Biases, or similar platforms
- Experience with LLM observability and evaluation tools (e.g., LangSmith, Ragas, DeepEval, TruLens)
- Knowledge of cloud AI platforms such as AWS, Azure, or GCP
- Understanding of AI governance frameworks and regulatory standards
- Bachelor's or Master's degree in Computer Science, Data Science, or related field

"All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran."