
Sr/Staff Site Reliability Engineer, Consumer Apps

About Attain

Built for consumers and companies alike

Klover's engineering team powers one of the fastest-growing fintech platforms in the U.S., supporting over one million active users each month. Our systems process and move more than $1.5 billion annually, enabling real-time access to financial tools, rewards, and services that help people improve their day-to-day lives.

As part of this team, you'll help design, build, and scale the systems that underpin Klover's core products and platform. You'll work on high-impact, production-grade systems that prioritize reliability, security, and performance, and that integrate with a broad ecosystem of internal and external services. The work you do will directly shape how users interact with Klover's products, access their money, and experience transparent, low-fee financial services.

Klover engineers collaborate closely with colleagues across backend, frontend, data science, and product teams to deliver scalable, high-quality solutions for a rapidly growing user base. You'll have the opportunity to work with modern technologies and architectures while helping define and evolve the next generation of inclusive, data-powered financial products, building systems and interfaces that emphasize reliability, privacy, and performance at scale.

About the Role

As a Senior/Staff Site Reliability Engineer, you will play a critical role in building out and maintaining the infrastructure that powers all of our systems, as well as the supporting tools that ensure those systems run smoothly. You will work closely with nearly every engineering team at Attain to help ensure that our systems operate at peak efficiency and to prepare us to handle the scale of our future growth.

Attain Office Hybrid Schedule (where applicable):

* Redwood City, CA: Mondays (in-office for stand-ups, all-hands) and a choice of three days between Tuesday and Friday
* Chicago, IL & New York, NY: 4 days in-office; 1 day remote

What a typical week might look like

* Write Terraform modules for deploying infrastructure resources via our GitLab pipelines
* Develop Helm charts for deploying services and jobs in our Kubernetes cluster
* Define metrics, network policies, and routing rules for our Istio service mesh
* Monitor and maintain our GCP BigQuery and Spanner databases
* Pipe metrics to our Google-managed Prometheus instance and build out Grafana dashboards and alerts to increase visibility into our systems
* Experiment with GCP offerings, third-party vendors, and open-source tools to further automate and secure day-to-day operations
* Leverage the latest and greatest LLMs in developing infrastructure and tooling
* Pair with engineering leads to instrument and monitor critical functionality
* Add automation to both existing and new systems to reduce our reliance on manual processes
* Participate in architecture design and capacity planning discussions to ensure that our systems are scalable, maintainable, reliable, and secure
* Build, maintain, and improve our CI/CD pipeline

You'll be a great fit for the role if

* You are comfortable wearing many hats
* You have a willingness to learn and teach in a fast-paced, collaborative environment
* You have a strong desire to automate things
* You readily provide constructive feedback, and also proactively seek feedback to improve yourself
* You like to get your hands dirty and tinker with/stress test new technologies

Preferred Qualifications

* 6+ years of experience building and maintaining large-scale cloud-native infrastructure (AWS and/or GCP)
* Experience working with the containerization technologies Docker, Kubernetes, and Istio or a similar service mesh technology
* Experience with SQL database technologies such as MySQL, Google BigQuery, and Google Spanner
* Experience with stream technologies such as Kafka and Amazon Kinesis
* Experience with pub/sub technologies such as AWS SNS and Google Pub/Sub
* Experience with serverless computing technologies such as AWS Lambda and Google Cloud Functions/Google Cloud Run
* Experience with infrastructure-as-code tools such as Terraform
* Experience with observability tools such as Datadog, Prometheus, and Grafana
* Strong computer science and software engineering fundamentals
* Experience with SOC 2 compliance processes and requirements

We are excited to hear from you.

At Attain, we are passionate about finding people to continuously help us grow our organization. We encourage you to apply, even if your experience doesn't match every detail on the job description. If we don't see something that immediately fits, we will keep your resume on file for future opportunities.