Senior Data Engineer - Boutique Hedge Fund - Greenfield Build
Build the data foundation of a firm that moves when it sees opportunity.

This is a global investment firm with a long track record, permanent capital, and a mandate unconstrained by the conventions that limit most institutional investors. The organization is lean by design and maintains a high-touch, tight-knit, familial culture on both the investment and technology teams. Data infrastructure here is not a cost center. It is how the investment function sees the world. The pipelines you build will feed trading signals, risk systems, and research workflows that are used daily by the people making the firm's most important decisions. The work is consequential, the technical environment is modern, and the mandate to improve it is real.

The role

You will join a data engineering team responsible for the full lifecycle of the firm's data assets, from ingestion and transformation through governance, quality, and discoverability. The scope is broad by design: this team owns the infrastructure that the quant, trading, and risk functions depend on, and the expectation is that you contribute across it.

This is not a role for someone who wants to maintain a stable system at the margins.
The firm is actively building out its data platform, and the engineer who takes this role will have genuine ownership over meaningful parts of that build.

What you will own

Pipelines and platform
- Design, develop, and optimize data pipelines serving trading, alpha generation, research, risk, and accounting, using Snowflake, DBT, Amazon EMR, and Apache Iceberg for scalable batch and streaming workflows
- Build and maintain real-time event-driven data flows using Kafka to support low-latency analytics, trading signals, and risk systems
- Optimize large-scale data processing across EMR and Snowflake for performance, efficiency, and cost at scale
- Support and troubleshoot pipelines, Kafka streams, APIs, and database performance across the full stack

Golden source data
- Build and maintain critical firm-wide datasets including security master, account master, and price master, all well-governed, versioned, and built on open table formats
- Develop and expand shared Python libraries for data APIs, logging, pipeline utilities, and core functionality used across the engineering team

Governance and observability
- Build out and maintain the firm's data governance stack: data quality frameworks, end-to-end lineage tracking, workflow orchestration, and a unified data catalog
- Instrument pipelines with monitoring, alerting, and automated validation checks to ensure high data quality and observability in production

Collaboration
- Work closely with quant researchers, traders, risk managers, and non-technical stakeholders to understand data requirements and translate them into sound technical designs
- Communicate complex solutions clearly across a team where good ideas travel fast and technical credibility is earned

Technical environment

Python, SQL, Snowflake, DBT, Apache Kafka, Amazon EMR, Apache Iceberg, Docker / Kubernetes, FastAPI, Spark / Dask, Great Expectations, OpenLineage, DataHub

Who you are
- 6+ years of software development experience with at least 2 years focused on data engineering
- Strong Python and SQL skills applied to data processing and automation at scale
- Deep ETL/ELT experience across both batch and streaming architectures
- Solid grasp of data modeling, query performance tuning, and relational database internals
- Experience building and deploying containerized applications in cloud environments
- Financial market data literacy across equities, fixed income, futures, and options is a meaningful advantage
- Exposure to data observability and governance tooling
- Bachelor's degree in computer science or a related field, or equivalent demonstrated experience