Cloud Systems Engineer
Overview

This is a hybrid role - 2 days remote and 3 days in the Malvern, PA office.

We are seeking a highly skilled Site Reliability & Cloud Systems Engineer to design, build, and operate scalable, secure, and highly automated cloud platforms in AWS. This role combines hands-on reliability engineering with cloud architecture and automation expertise, with a strong emphasis on building immutable infrastructure and improving system resilience.

You will play a key role in evolving our AWS ecosystem into a “push-button” platform, reducing manual operations, embedding security into every layer, and ensuring production systems are observable, performant, and self-healing. This role is well-suited for a proactive engineer who excels at the intersection of infrastructure, automation, and system reliability, blending responsibilities across SRE, DevOps, and Cloud Engineering.

Who We Are

At CubeSmart, we’re intentional about culture. You can experience it everywhere from our mission statement of “genuine care” to our “It’s What’s Inside That Counts” tagline to calling each other “teammates” rather than employees. This spirit fosters a fun and collaborative environment that has resulted in our rapid growth and being recognized amongst the top in our industry.

CubeSmart’s award-winning team is made up of people who genuinely care. Teammates care about our customers and the life events and/or business needs they are facing. Teammates are passionate, responsible and understanding.
The CubeSmart team is made up of people who have a can-do attitude, are committed to their own success and the success of the company, and lead by example.

If this sounds like a team and culture that matches your personal values and motivations, we want to hear from you.

Responsibilities

Reliability, Performance & Operations
- Ensure uptime, reliability, and performance of AWS-hosted, Linux-based (Ubuntu) production systems and associated lower environments
- Build and optimize observability using tools like Datadog, CloudWatch, Prometheus/Grafana, and PagerDuty
- Work closely with the development teams to diagnose site issues, mitigate impact, and restore system reliability while communicating clearly with stakeholders
- Lead incident response, root cause analysis, and post-incident reviews
- Participate in on-call rotations and support 24/7 production environments

Cloud Architecture & Automation
- Architect and implement fully automated, ephemeral, and immutable AWS production and lower environments
- Design scalable, resilient distributed systems using AWS best practices
- Eliminate manual processes through Infrastructure as Code (Terraform, Ansible, Packer)
- Build and maintain CI/CD and GitOps workflows (Jenkins, GitHub Actions, GitLab CI, ArgoCD/Flux)
- Develop automation and tooling using Python and Bash to reduce operational toil

Infrastructure & Platform Engineering
- Deploy and manage AWS services including EKS, ECS, Fargate, Lambda, RDS (Aurora PostgreSQL), OpenSearch, Redis, and ElastiCache
- Design and manage networking components such as Transit Gateways, load balancers, and service meshes
- Implement caching, microservices, and distributed system design patterns

Security & Governance
- Architect and implement zero-trust security models using IAM, SCPs, and OIDC
- Embed security into CI/CD pipelines using SAST/DAST tools (e.g., Snyk)
- Ensure compliance through automated auditing, backup strategies, and governance controls

Collaboration, Leadership & Strategy
- Partner with development, security, and operations teams to build reliable, observable platforms
- Document systems, runbooks, and operational procedures
- Drive FinOps initiatives for cost optimization and forecasting
- Integrate infrastructure changes into ITIL-compliant workflows (e.g., Freshservice)
- Influence architectural decisions and promote engineering best practices across teams

Qualifications
- 6–10+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles
- Deep hands-on expertise with AWS services and cloud architecture
- Strong Linux systems engineering experience (Ubuntu preferred)
- Proven experience with Infrastructure as Code (Terraform, Ansible, etc.)
- Experience building and maintaining CI/CD pipelines
- Proficiency in scripting/programming (Python, Bash)
- Hands-on experience with monitoring and observability platforms
- Solid understanding of cloud security principles (IAM, KMS, Secrets Management, Ansible Vault, HashiCorp Vault)
- Bachelor’s degree or equivalent practical experience

Preferred Qualifications
- Experience with containerization and orchestration (Docker, Kubernetes, EKS/ECS)
- Familiarity with GitOps tools such as ArgoCD or Flux
- Experience with SAST/DAST tools and secure SDLC practices
- Knowledge of distributed systems, caching, and microservices architectures
- Experience with FinOps and cost optimization strategies
- Exposure to ITIL processes and service management platforms