Senior Data Engineer

Sotalent
Phoenix, AZ
May 7th, 2026

Must be a US Citizen due to contractual requirements.

Role Overview

Our client is seeking an experienced Senior Data Engineer to design and deliver advanced data systems that support a wide range of artificial intelligence applications. This role focuses on building scalable data solutions that enable better insights, strengthen decision-making, and power modern AI capabilities across the organisation.

You will work closely with leadership in AI and collaborate with cross-functional teams, contributing to the development of systems that support traditional machine learning, generative AI, and emerging autonomous (agent-based) technologies.

The position follows a hybrid working model based in either Phoenix, Arizona or Charlotte, North Carolina.

Key Responsibilities

- Develop and manage data infrastructure supporting various AI use cases, including predictive models, generative systems, and autonomous workflows
- Build scalable pipelines to process diverse data formats such as structured datasets, text, images, audio, video, and system logs
- Create feature engineering workflows to prepare data for machine learning models
- Design pipelines tailored for large language models, including embedding generation, data segmentation, and contextual data preparation
- Support data workflows for intelligent systems that rely on real-time processing, event-driven architecture, and persistent data storage
- Implement solutions using modern cloud and data platforms to enable large-scale data processing and analytics
- Establish and maintain data quality standards, governance practices, and monitoring systems
- Work closely with data scientists, engineers, and product teams to deliver datasets for model training, testing, and deployment
- Continuously optimise pipelines for efficiency, reliability, and performance across both batch and real-time systems

MLOps & Platform Responsibilities

- Lead the development and operation of machine learning pipelines across the full lifecycle, from training to deployment
- Build reliable workflows for model evaluation, versioning, and production rollout
- Collaborate with technical teams to deploy AI models, generative systems, and retrieval-based solutions
- Design systems for distributed training, parameter tuning, and efficient model serving
- Implement monitoring and validation frameworks to track model performance, detect drift, and ensure compliance
- Manage automated deployment processes for models, data assets, and AI components using modern CI/CD practices
- Oversee experiment tracking, model lifecycle management, and environment promotion processes
- Ensure seamless integration between machine learning frameworks and cloud-based infrastructure
- Maintain high standards for system scalability, reliability, and observability
- Define best practices, reusable patterns, and standards for AI and data engineering workflows

Required Qualifications

- Bachelor’s degree in a technical discipline such as engineering, computer science, or a related field
- At least five years of experience in data engineering, large-scale data systems, or machine learning data pipelines
- Strong experience with distributed data processing frameworks (e.g., Apache Spark)
- Proficiency in Python and SQL, with experience handling large datasets
- Hands-on experience building cloud-based data pipelines (e.g., AWS or similar platforms)
- Experience designing data systems that support multiple AI workloads, including model training and inference
- Solid understanding of data modelling, integration, and production-grade pipeline development

Preferred Experience

- Exposure to AI systems at scale, including traditional ML, generative AI, and agent-based solutions
- Familiarity with vector databases and semantic search technologies
- Knowledge of preparing data for large language models, including text processing and context structuring
- Experience working with unstructured data processing techniques (e.g., NLP, OCR, computer vision)
- Experience with workflow orchestration and AI development platforms
- Understanding of MLOps tools and practices, including experiment tracking and deployment automation
- Awareness of emerging AI architectures such as memory-driven or tool-integrated systems
- Strong analytical thinking, attention to detail, and commitment to data quality
- Ability to work effectively in fast-moving, collaborative environments