Software Engineer, Sensor Data Integration
The Role

At Mach9, Sensor Data Integration Engineers build the algorithms and pipelines that transform large-scale geospatial datasets into structured, accessible formats to power our survey product, Digital Surveyor. You'll work with high-volume data sources - LiDAR-collected point clouds, on-road imagery, overhead aerial ortho photos - and own the systems that ingest, standardize, and store them for training and product use. Every piece of data that our customers upload will pass through your systems first.

This role is ideal for an engineer who loves puzzle-hunting: reverse-engineering sparsely documented formats, wrangling coordinate systems and transforms, and hunting down strange camera projection issues. You'll sit at the divide between our customers and our product, at the front of everything we do: our models are only as good as the data feeding them, and you'll be the one making messy real-world sensor data trustworthy at scale.

Where you'll make an impact

- Own the ingestion pipelines that convert point clouds and imagery from hardware vendors into Mach9's standard internal format
- Reverse-engineer new vendor formats and updates, often working with sparse or missing documentation, to expand what data Mach9 can take in
- Build agentic systems to automatically triage failures and reformat data
- Build automated checks and regression testing to guarantee the consistency of our data
- Optimize the performance of our processing and storage across massive geospatial datasets in the cloud
- Work directly with customers and partners to unblock critical customer projects

What you bring

- Strong software development and debugging skills
- Experience building production software in Python
- Comfort operating with ambiguity. You'll need to be able to dig into undocumented or messy data formats and reverse-engineer them.
- Strong communication skills, with the ability to work across our ML, product, and customer success teams
- A foundation in parallel computing or distributed systems
- A bachelor's degree in Computer Science, Engineering, or equivalent experience

Bonus experience

- Experience building agentic systems and setting up agent harnesses: orchestrating LLM-driven workflows for triage, debugging, or automated code patching
- Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf)
- Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3)
- Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch)
- Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms)
- Experience building data versioning, lineage, or artifact-tracking systems
- Experience operating data pipelines that feed ML training and inference
- Familiarity with C++

About Mach9

Mach9 is transforming civil infrastructure design with AI-powered geospatial tools. Our platform accelerates the creation of engineering deliverables from raw data, cutting manual drafting time by 96×. Trusted by global leaders in engineering and construction, we're backed by Y Combinator, Quiet Capital, and top founders and executives from Cruise, Autodesk, Adobe, and DoorDash.

We believe the needs of a startup benefit from an in-person culture. The team works out of our office in SoMa, with the flexibility to work from home when needed.