
Hardware–Machine Learning Systems Engineer

We are working with a leading global trading firm to deploy machine learning directly onto custom hardware, and we're seeking engineers and researchers to help build this capability from the ground up. This initiative offers a rare opportunity to design end‑to‑end ML systems, spanning model representation, compiler infrastructure, and hardware execution, that operate in one of the most performance‑critical computing environments in the world.

The team owns the full technology stack, from silicon through compilers and runtime systems. This end‑to‑end ownership enables deep optimization and rapid iteration: when performance bottlenecks arise, they can be addressed directly at the compiler, system, or hardware level. If you are motivated by pushing the limits of latency, throughput, and efficiency in real‑world ML systems, this role offers both technical depth and immediate impact.

Candidates from systems, compilers, hardware, or applied ML backgrounds are encouraged to apply. Prior experience in trading or finance is helpful but not required.

Core Responsibilities

- Design, implement, and optimize ML compiler pipelines that lower high‑level models into efficient, hardware‑aware execution for custom accelerators, FPGAs, or ASICs
- Treat hardware constraints (latency budgets, memory bandwidth, resource utilization, and numerical precision) as first‑class considerations throughout the compilation process
- Collaborate closely with ML researchers, systems engineers, and hardware architects to co‑design models, compiler IRs, and hardware interfaces
- Develop compiler passes for quantization, operator fusion, scheduling, memory layout, and parallelization
- Translate model and workload requirements into actionable insights that inform both compiler architecture and future hardware design
- Prototype, benchmark, and deploy ML inference pipelines from proof‑of‑concept through production environments
- Track and evaluate emerging research in ML compilers, machine learning systems, and quantization, identifying techniques that lead to measurable system‑level improvements

Skills and Experience

- Strong foundation in compiler or systems engineering, with an emphasis on performance optimization and hardware targets
- Experience mapping ML workloads onto constrained or latency‑sensitive hardware environments
- Familiarity with ML compiler infrastructure such as MLIR, TVM, XLA, or similar frameworks
- Experience with quantization, fixed‑point arithmetic, or reduced‑precision inference
- Proficiency in Python, C++, or similar languages for compiler development, tooling, testing, and benchmarking
- Solid understanding of machine learning fundamentals, including neural network architectures and inference optimization
- Strong communication skills and the ability to collaborate across research, systems, and hardware teams
- 5+ years of professional experience designing, implementing, and optimizing ML compiler pipelines

Nice to Have

- Experience integrating ML compilers with hardware backends such as FPGAs, custom accelerators, or ASICs
- Exposure to hardware design or ML‑to‑hardware flows (e.g., HLS tools, RTL, hls4ml, FINN, Vitis AI)
- Background in latency‑critical or resource‑constrained systems such as high‑performance trading systems, real‑time signal processing, scientific instrumentation, or HPC
- Familiarity with functional verification or simulation methodologies (e.g., SystemVerilog, UVM, Cocotb)
- Advanced degree (MS or PhD) in Computer Science, Electrical Engineering, Physics, or a related field, or equivalent industry or research experience
Hardware–Machine Learning Systems Engineer at Selby...