Senior DevOps Engineer / Site Reliability Engineer (SRE)
Job Description
About the Role

We're hiring a Senior DevOps Engineer / Site Reliability Engineer (SRE) to help architect and scale the unified global operations platform behind a fast-growing cross-border e-commerce SaaS company.

This is a newly created, high-impact role supporting the company's North America engineering operations. You'll work closely with technical leadership, platform experts, and executive leadership to build resilient infrastructure, modern observability systems, intelligent automation, and scalable cloud-native operations.

The environment is highly collaborative, technically ambitious, and focused on long-term platform reliability and modernization.

This is a remote-first opportunity open to candidates located in the United States.

About the Company

The company is a leading B2B SaaS platform serving the cross-border e-commerce industry.

As the business continues expanding globally, the engineering organization is investing heavily in modern DevOps, platform engineering, and reliability infrastructure to support increasing scale, automation, and operational complexity.

The company operates in a fast-moving international environment with close collaboration across North America and overseas engineering teams.

What You'll Do

- Design, build, and maintain unified operations and platform management systems
- Develop infrastructure supporting resource management, monitoring, alerting, configuration management, and automated operations
- Build and operate observability platforms and CI/CD pipelines
- Develop self-healing systems and automated incident response capabilities
- Establish DevOps standards, tooling strategies, and engineering best practices
- Support engineering and product teams with platform-level technical expertise
- Lead infrastructure modernization and architecture improvement initiatives
- Reduce technical debt and improve operational reliability
- Promote SRE principles and reliability engineering practices across teams
- Conduct technical research and evaluate emerging cloud-native technologies
- Drive continuous improvement across DevOps and platform engineering workflows

Required Qualifications

- Currently based in California or North Carolina
- US Citizen or Green Card holder (no sponsorship available)
- Fluent in Mandarin Chinese for day-to-day collaboration with overseas engineering teams
- Bachelor's degree in Computer Science or related field
- 4–6 years of experience in DevOps, SRE, or Platform Engineering
- Strong experience with AWS, Azure, or GCP
- Deep understanding of cloud infrastructure including VPC, EC2, Kubernetes/EKS, RDS, and IAM
- Strong Linux systems and networking knowledge
- Experience with Docker, Kubernetes, load balancing, and service governance
- Experience with Infrastructure as Code tools such as Terraform, Ansible, and Helm
- Experience building CI/CD pipelines using Jenkins, Argo CD, CodeBuild, or similar tools
- Experience with observability and monitoring platforms including Prometheus, Grafana, ELK, and OpenTelemetry
- Proficiency in at least one scripting or programming language such as Python, Shell, or Go
- Strong troubleshooting, systems analysis, and problem-solving skills
- Strong cross-functional communication and collaboration abilities

Preferred Qualifications

- Master's degree in Computer Science or related field
- Experience supporting global or multi-cloud platforms
- Experience leading observability, self-healing, or platform modernization initiatives
- Experience with service mesh, chaos engineering, or capacity planning
- Go development experience
- Strong track record improving system reliability, automation, and operational efficiency
- Experience collaborating across international and cross-cultural engineering teams
- Self-driven mindset with strong technical leadership and knowledge-sharing abilities

Compensation & Benefits

Compensation

- Base Salary: $140,000 – $160,000 USD
- Exceptional candidates may receive compensation above the posted range

Benefits

- 401(k) with dollar-for-dollar match up to 4%
- Medical insurance
- 12 days PTO annually

Work Environment

- Remote-first work environment
- Home base available in Silicon Valley, CA or Raleigh, NC
- Standard Monday–Friday schedule
- No business travel required
- Immediate hiring need

Why This Role Stands Out

This is an opportunity to help modernize and scale the operational backbone of a global SaaS platform serving a rapidly growing international market. You'll work on cloud-native infrastructure, automation, observability, and reliability engineering initiatives with significant visibility and ownership across the organization.