Site Reliability Engineer (Space Communications)
About Northwood:
Northwood is on a mission to transform connectivity between earth and space and bring the benefits of space to the masses through innovations in space communications technologies. If you like building quickly and seeing your work deployed in locations around the globe with real impact, we want you at Northwood.

Role:
Northwood is looking for an Infrastructure Engineer to help build and maintain our observability infrastructure and ensure our global space communications network operates reliably. As we rapidly scale our operations and establish ground stations around the world, we need someone who can grow with us while building robust monitoring and logging systems and supporting our development teams with reliable CI/CD pipelines.

You'll be responsible for building and maintaining our observability and monitoring infrastructure, while working closely with engineering teams to improve system reliability and deployment processes. This role offers significant growth opportunities as we scale, and you'll collaborate with experienced engineers to establish monitoring best practices and incident response procedures. We're seeking someone with 2-4 years of experience who thrives in a fast-paced startup environment and is excited to take on diverse infrastructure challenges.

Responsibilities:
- Build and maintain observability stack with tools like Grafana, Prometheus, Loki, Vector, CloudWatch, VictoriaMetrics, etc. for metrics and log ingestion across environments
- Support and improve CI/CD pipelines using GitLab and ArgoCD, collaborating with development teams on deployment best practices
- Help build and maintain cloud infrastructure using Terraform on AWS, contributing to the scalability and reliability of our space communication systems
- Work with senior engineers to establish monitoring strategies, alerting, and incident response procedures
- Deploy and manage Kubernetes applications using Helm charts, with focus on reliability and developer experience
- Collaborate with engineering teams to implement performance monitoring and troubleshooting across microservices
- Support identity and access management integration with Okta and HashiCorp Vault
- Assist in managing NixOS-based infrastructure for reproducible system configurations
- Participate in incident response efforts and contribute to post-incident reviews and improvements

Basic Qualifications:
- 2-4 years of hands-on experience with infrastructure tools and monitoring systems in production environments
- Experience with containerization (Docker, Kubernetes) and basic container orchestration
- Familiarity with CI/CD tools (GitLab, Jenkins, or similar) and infrastructure as code concepts
- Experience with cloud platforms (AWS preferred) and basic infrastructure automation
- Programming skills in Python or similar language and experience with configuration management
- Startup mentality with ability to work in fast-paced, high-growth environments and take on diverse responsibilities
- Experience with logging and metrics collection for production systems
- Understanding of system reliability principles and interest in learning SRE practices

Preferred Qualifications:
- Some exposure to observability tools like Vector, Loki, Grafana, Prometheus, or similar monitoring systems
- Experience with Terraform or other infrastructure as code tools
- Familiarity with NixOS or other declarative system configuration approaches
- Basic knowledge of HashiCorp Vault, Okta, or similar identity/secrets management tools
- Interest in distributed systems and troubleshooting complex technical issues
- Previous startup experience or demonstrated ability to learn quickly and adapt
- Linux system administration experience
- AWS certification or demonstrated cloud platform knowledge

Additional Information:
To conform to U.S. Government space technology export regulations, including the International Traffic in Arms Regulations (ITAR) you must be a U.S. citizen, lawful permanent resident of the U.S., protected individual as defined by 8 U.S.C. 1324b(a)(3), or eligible to obtain the required authorizations from the U.S. Department of State.