Software Engineer, ML Inference

acceler8 talent | Menlo Park, CA | May 11th, 2026

Occupations:

Software Developers; Computer Systems Engineers/Architects; Computer and Information Research Scientists; Computer Occupations, All Other; Data Scientists

Industries:

Computer Systems Design and Related Services; Software Publishers; Fuel Dealers; Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services; Business Support Services

San Francisco (On-Site)
$250,000–$320,000 base + equity

Why this role

Early-stage infrastructure company building a next-generation AI cloud, rethinking how frontier models run across heterogeneous compute environments.

This team is focused on the hardest part of the stack: making large-scale model inference fast, reliable, and production-ready.

You’ll own the systems that actually execute models in production, working across runtime, serving infrastructure, memory management, and hardware optimisation.

What you’ll do

- Build and scale end-to-end inference systems from request → runtime → response
- Optimise latency, throughput, concurrency, and reliability under real production workloads
- Design batching, scheduling, and queuing systems for high-performance serving
- Improve KV cache management and memory efficiency at scale
- Debug performance bottlenecks across model, runtime, and hardware layers
- Work closely with systems, infrastructure, and ML teams to push inference performance forward

What makes this interesting

- Deep work on LLM inference internals including prefill, decode, and attention optimisation
- Solving real-world trade-offs between tail latency and throughput
- Optimising workloads across GPUs and next-generation accelerators
- Hands-on work with vLLM, TensorRT-LLM, and custom inference runtimes
- Opportunity to shape core infrastructure at an early-stage company

What they’re looking for

- Experience building ML inference or model serving systems
- Strong systems engineering or backend infrastructure fundamentals
- Experience working on performance, scaling, memory, or distributed systems challenges
- Strong Python and/or C++ skills
- Familiarity with modern inference frameworks and runtimes is a plus
