L 5/DOD Site Reliability Engineer
Job Description
Make sure to apply with all the requested information, as laid out in the job overview below.

We are seeking a Site Reliability Engineer who will play a leading role in multi-engineer collaborations to deploy and secure applications and runtime environments in various cloud and hybrid environments according to zero-trust principles, utilizing Infrastructure as Code (IaC).

The Level 2 Site Reliability Engineer shall possess the following capabilities:
- Command of core cloud infrastructure deployment patterns for workload environments, including storage buckets, managed database services, cloud data processing services, and integration with application workloads using workload identity, targeted IAM role bindings, alignment of resource hierarchies according to zero-trust principles, and organizational policies for secure and compliant resource access
- Understanding of application maturation requirements for cloud-native deployments, and the ability to effectively articulate workload transformation or maturation requirements to stakeholders
- Expertise in Terraform state management, including remote state storage and locking
- Design and implement standardized runtime environments in Kubernetes-based ecosystems, leveraging GKE Enterprise, EKS, or AKS Fleet Management features for security and scalability
- Configuration and management of service mesh (Istio) for traffic management, security, and observability within Kubernetes runtime environments
- Ensure the security of the runtime environments by implementing appropriate security policies, network controls, and access management
- Integrate GitLab CI pipelines with CD tooling (e.g., Flux CD) for automated deployments and rollbacks within Kubernetes runtime deployments
- Deep understanding of VPC networking, security, and identity management services
- Optimization of runtime environments for performance, scalability, and cost-efficiency, collaborating with application teams to understand their needs, including capacity planning and resource management
- Expertise in the development of cost-effective SLAs, SLOs, and SLIs in collaboration with workload owners
- Demonstrate readability and accountability via code reviews (performing as a co-maintainer) within limited areas of the codebase to ensure all code is written in a clear, consistent, and idiomatic style in alignment with all governance and compliance requirements
- Working knowledge of advanced zero-trust concepts and their implementation in Azure, GCP, or AWS

Education/Experience:
Bachelor's degree and at least 5 years of relevant experience, including 2 years in reliability, distributed systems, or platform development; OR 8 years of relevant experience without a degree; OR an equivalent combination of education and experience.

Salary Range: $140,000-$160,000. This represents the typical salary range for this position, but is not guaranteed. Salary is based on experience, location, and contractual requirements, which could fall outside of the range listed.

Required Skills
5 years of experience in the domain items listed below:
- Cloud Architecture - Deep expertise in distributed systems design, scalability, and fault tolerance, including multi-region, multi-cloud deployments with high availability.
- Cloud Infrastructure Management - Mastery of Infrastructure as Code (Terraform, DSC, ARM) for complex environments. Building self-healing systems and advanced automation pipelines to eliminate toil. Expertise in CI/CD pipelines and GitOps workflows (e.g., Flux).
- Application Modernization - Advanced experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes, Docker); advanced proficiency in Kubernetes, Docker, Helm, and service mesh (Istio/Anthos).
- Monitoring Response Experience - Designing organization-wide monitoring strategies, leading incident response frameworks and ensuring rapid recovery across critical services, and driving adoption of SLOs, SLIs, and error budgets as business-aligned reliability metrics.
- Zero Trust - Comprehensive understanding of Zero Trust strategic architecture principles.

Desired Skills
Google Cloud Certified Professional Cloud Architect and/or Azure Solutions Architect Expert preferred.

About Tensley Consulting, Inc.
Tensley Consulting is a Service-Disabled Veteran-Owned Small Business focused on mission engineering in support of the United States Intelligence Community and the Department of Defense. Our team consists of System Engineers, Software Engineers, Test Engineers, and Signals Analysts performing work throughout the Continental United States (CONUS) and Outside the Continental United States (OCONUS).

Equal Opportunity, Diversity, and Inclusion
We aim to build a team that represents a variety of backgrounds, perspectives, and skills. We embrace inclusion and ensure equal employment opportunity without discrimination or harassment based on race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity or expression, age, disability, national origin, marital or domestic/civil partnership status, genetic information, citizenship status, military or veteran status, or any other personal characteristic.

Benefits Include
- 100% paid medical coverage with HSA and company contribution
- 100% paid vision, dental, short-term, and long-term premium
- 12% 401(k) contribution (not a match)
- Education and training budget
- 6 weeks and 3 days of PTO
- And much more!

Come grow with us!