Senior Cloud & Container Infrastructure Engineer
About Us

RCH Solutions is a rapidly growing global provider of computational science expertise within Life Sciences and Healthcare. At RCH, our team rallies around a culture crafted for learning and achieving. We’re relentless in our pursuit of innovation and demanding of ourselves to deliver a ground-breaking computing experience for our clients, so that they can deliver life-saving science to humanity.

Core Values

At RCH, our Core Values are more than just words; they represent the threads that weave together the fabric of our culture. Used as a guide when interviewing new team members, as a barometer when evaluating our performance as individuals and teams, and even when deciding which customers to work with, RCH’s Values embody the behaviors upon which we measure our success and create a framework for our growth as people and professionals.

Our Core Values:
- Embrace Excellence: We strive for best-in-class delivery of innovation and service
- Be Accountable: Integrity, ownership and accountability are non-negotiable
- Adventure Together: We are committed to fostering a culture that embraces continuous improvement
- Succeed as a Team: We believe harnessing the power of a team drives outcomes not achievable by individuals
- Boundaries and Balance: Work-life balance is a core facet of our culture

If you share our core values, we encourage you to continue reading this posting, as you may have found a great home for your career.

Job Description

RCH Solutions is seeking multiple Senior Cloud & Container Infrastructure Engineers to join our team of scientific computing experts. You will design, implement, automate, and operate scalable, secure, and highly reliable infrastructure that powers mission-critical applications and services. This is a hands-on senior individual contributor role with a strong emphasis on Google Kubernetes Engine (GKE), container-native architectures, infrastructure as code, observability, and security best practices in Google Cloud Platform (GCP).
You will serve as a GCP subject-matter expert within the team, mentor engineers, and drive platform improvements that enable developer velocity and business scale.

If you're passionate about building reliable, scalable, developer-friendly platforms on Google Cloud and solving hard container and infrastructure problems at scale, we'd love to hear from you.

Key Responsibilities:
- Design, deploy, and operate containerized workloads on GKE across enterprise-scale environments
- Manage GCP compute resources (Compute Engine, Cloud Run, GKE Autopilot) for high availability and cost efficiency
- Operate and scale Weaviate vector database clusters to support production AI and semantic search workloads
- Optimize indexing, query performance, and storage configurations as data volumes grow
- Collaborate with AI/ML teams to define schema strategies and ingestion pipelines
- Build and maintain monitoring dashboards and alerting pipelines using Grafana
- Integrate LLM observability tooling (LangFuse / LangSmith) to track model performance, latency, and usage across AI services
- Drive incident response, root cause analysis, and continuous reliability improvements
- Implement infrastructure-as-code (Terraform / Deployment Manager) for reproducible, auditable deployments and CI/CD integration
- Define and enforce multitenant GKE architecture: cluster security, namespace/tenant isolation, RBAC, network policies, maintenance, and scaling
- Mentor engineers and drive platform adoption and best practices
- Automate end-to-end provisioning, deployment pipelines, and day-2 operations using CI/CD tools (Cloud Build, GitHub Actions, ArgoCD, etc.)
- Design and implement observability stacks using Google Cloud Operations Suite (formerly Stackdriver), Prometheus/Grafana, Cloud Logging, Cloud Monitoring, and distributed tracing (Cloud Trace)
- Troubleshoot complex production issues spanning compute, networking, storage, and Kubernetes layers

Essential Qualifications:
- 6+ years of hands-on experience building and operating production cloud infrastructure
- 4+ years of deep, production experience with GCP, particularly in a senior or lead capacity
- 3+ years of strong expertise with Kubernetes in production (preferably GKE), including cluster design, upgrades, troubleshooting, and scaling
- Expert-level proficiency with Terraform for GCP infrastructure provisioning
- Strong experience with container technologies: Docker, container registries (Artifact Registry), container security scanning
- Solid understanding of GCP core services: Compute Engine, Cloud Run, Cloud SQL / AlloyDB, Cloud Storage, BigQuery, Pub/Sub, Cloud Functions, VPC, Cloud Load Balancing, Cloud Interconnect
- Experience implementing secure IAM strategies, organization policies, and security controls in GCP
- Proficiency in Linux systems administration, networking fundamentals, and scripting (Bash, Python, Go preferred)
- Experience with modern CI/CD and GitOps practices in cloud environments
- Experience supporting or using HPC environments leveraging SLURM
- Containerization/orchestration (Docker, Kubernetes/GKE)
- Strong understanding of data governance, cataloging, and lineage tools; basic familiarity with regulated environments (GxP, HIPAA)
- Experience assessing existing code and workflows and identifying bottlenecks and optimization opportunities
- Experience in software requirements gathering, documentation, design, and development

Preferred Qualifications:
- Google Cloud Professional certifications (e.g., Professional Cloud Architect, Professional Cloud DevOps Engineer, Professional Kubernetes Engineer)
- Experience with Anthos, Config Management, Policy Controller, or multi-cluster management
- Familiarity with service mesh (Istio/Envoy), ingress controllers (GKE Gateway API / Ingress), and microservices observability

Additional Information:

Great talent should benefit from a great work environment.
If you join our team, you’ll have access to:
- A competitive salary and bonus package based on experience
- Comprehensive health and wellness benefits, including Medical, Dental, and Vision Insurance
- Company-provided Life and Long-Term Disability Insurance
- Company-sponsored 401(k) Plan
- Company-provided continuing education benefit
- Team-focused culture and unlimited opportunity for advancement

This is a remote position, and the candidate is expected to be able to work on an East Coast (US) time schedule. The role is open only to applicants who do not need sponsorship now or in the future. No third parties, please.