Software Engineer, Infrastructure Platform
About Fluidstack

We exist to make humanity more free. For most of human history, you farmed or you starved. Technology gave people more time for the things they wanted to do, instead of things they had to do. Powerful AI will be the biggest lever for human choice we've ever built - but only if models are aligned with what humanity actually wants. There are groups building AI who don't share these goals. Whoever deploys frontier compute infrastructure fastest will decide whether AI expands human freedom or shrinks it.

We're singularly focused on delivering 10 to 100s of GWs of compute faster than anyone else, rethinking every layer of the stack. We acquire power, design and build data centers, and operate them - with teams spanning hardware and software. Speed and scale are our key differentiators. Come be a part of building civilization-scale infrastructure for AI.

We hire people who care deeply about this problem space. If that is you, please apply!

About The Role

Fluidstack, a leading cloud provider, is looking for a Software Engineer, Infrastructure Platform to build the foundational platforms that enable our global infrastructure and data center operations. You'll develop comprehensive internal tooling across multiple domains (CMDB, asset management, DCIM, monitoring and observability, security, and operational automation) that streamline how we deploy, manage, and operate infrastructure at scale.
Working cross-functionally with engineering, operations, data center teams, and product, you'll deliver scalable, reliable, user-friendly solutions that directly impact our ability to grow and deliver world-class infrastructure services.

Focus

Infrastructure Platform Development
- Design and build our next-generation CMDB system as the authoritative source of truth for infrastructure assets, network topology, and configuration data
- Create DCIM platforms for rack operations, server/GPU deployment, OS installation, quality assurance, and white-screen operations
- Develop end-to-end asset lifecycle management systems covering receiving, racking, inventory, break-fix, and decommissioning workflows
- Build monitoring and observability platforms integrating telemetry from BMS, EPMS, and IT devices with intelligent alarming and incident management
- Create self-service portals and automation for new region bootstrap, day-2 operations, and fleet-scale management

Operational Excellence & Automation
- Eliminate manual toil through workflow automation and self-service tooling that empower operations and engineering teams
- Build workflow orchestration systems for complex multi-step processes spanning incident, problem, and change management
- Develop digital twin visualizations and operational dashboards surfacing actionable insights; partner with data teams on analytics
- Create integration layers connecting internal platforms with external vendors and third-party systems

Cross-Functional Partnership
- Collaborate with data center operations, system engineering, network engineering, and security teams to understand requirements and deliver high-impact solutions
- Work with product and business stakeholders to prioritize features, define roadmaps, and balance competing needs
- Align with support and operations teams to ensure platforms scale with organizational growth

Technical Leadership
- Evaluate build vs. buy decisions for platform components, weighing in-house development against commercial SaaS and open-source solutions for scalability, cost, and flexibility
- Champion modern development practices including CI/CD, infrastructure-as-code, automated testing, and observability-first design
- Participate in architecture reviews and design discussions, contributing to technical direction and standards
- Foster technical excellence through code reviews, documentation, and knowledge sharing

Scalability & Reliability
- Design high-performance, fault-tolerant systems capable of handling thousands of QPS as our infrastructure footprint expands
- Build comprehensive monitoring, logging, and debugging capabilities with robust error handling
- Implement data migration strategies and manage upstream/downstream dependencies carefully during platform evolution
- Own projects end-to-end from concept through deployment, ensuring production readiness and operational excellence

About You
- 3+ years of professional software development experience building production systems
- Strong programming skills in Python, Go, or similar languages with understanding of system design patterns
- Experience designing and implementing RESTful APIs, data models, and distributed systems
- Proficiency with relational and NoSQL databases (PostgreSQL, Redis, etc.)
- Hands-on experience with containerization (Docker) and infrastructure-as-code tools (Terraform, Ansible)
- Understanding of CI/CD pipelines and modern development workflows
- Solid grasp of networking fundamentals (TCP/IP, DNS, HTTP) and Linux/Unix environments
- Strong problem-solving abilities with attention to scalability, reliability, and operational concerns
- Excellent communication skills, with the ability to convey technical concepts to both technical and non-technical stakeholders
- Experience with CMDB systems (NetBox, Device42) or asset management platforms
- Background in infrastructure automation, DevOps, or platform engineering
- Familiarity with workflow orchestration frameworks (Temporal, Airflow, Camunda)
- Knowledge of monitoring and observability stacks (Prometheus, Grafana, OpenTelemetry)
- Experience with time-series databases and data visualization
- Understanding of ITSM frameworks (ITIL) and service management practices
- Experience in data center operations, facilities management, or physical infrastructure
- Contributions to open-source infrastructure projects
- Bachelor's degree in Computer Science or equivalent practical experience

Salary & Benefits
- Competitive total compensation package (salary + equity).
- Retirement or pension plan, in line with local norms.
- Health, dental, and vision insurance.
- Generous PTO policy, in line with local norms.

The base salary range for this position is $200,000 - $250,000 per year, depending on experience, skills, qualifications, and location. This range represents our good faith estimate of the compensation for this role at the time of posting. Total compensation may also include equity in the form of stock options.

We are committed to pay equity and transparency.

Fluidstack is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Fluidstack will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

You will receive a confirmation email once your application has successfully been accepted. If there is an error with your submission and you did not receive a confirmation email, please email careers@fluidstack.io with your resume/CV, the role you've applied for, and the date you submitted your application. Someone from our recruiting team will be in touch.