VP, Platform Architecture, Scalability & AI Automation

IT Leader - Platform Architecture, Scalability & AI Automation

Role Overview

We are seeking a senior Technology Leader to own the architecture, scalability, and intelligent automation of our core FinTech platforms. This role is responsible for designing large-scale, cloud-native, distributed systems while leveraging AI-driven automation to improve platform efficiency, reliability, and operational scale.

The ideal candidate combines deep expertise in platform architecture and distributed systems with a strong point of view on using AI to automate infrastructure operations, optimize performance, and enable predictive, self-healing platforms. This is a highly technical leadership role with material influence over how the platform scales as the business grows.

Key Responsibilities

Platform Architecture & Technical Strategy
- Own the end-to-end platform architecture supporting core FinTech products and transaction flows
- Define architectural standards for scalability, performance, resiliency, and system composability
- Lead evolution from tightly coupled or monolithic systems toward distributed, service-oriented platforms
- Establish clear system boundaries, ownership models, and architectural governance
- Define and execute a multi-year platform roadmap aligned with growth, transaction scale, and product velocity

Scalability & Distributed Systems
- Design platforms capable of handling high transaction volumes, burst traffic, and sustained throughput
- Guide horizontal scaling strategies across compute, storage, data, and messaging layers
- Lead architectural decisions around sharding, partitioning, caching, asynchronous processing, and concurrency
- Continuously improve latency, throughput, and resource efficiency across the platform
- Enable multi-region and multi-environment scalability where required

Cloud & Infrastructure Architecture
- Architect cloud platforms (AWS, Azure, or GCP) optimized for scale, availability, and operational efficiency
- Define reference architectures for containerized workloads, microservices, and distributed runtimes
- Lead Kubernetes and container platform adoption and standardization
- Mature Infrastructure as Code (Terraform, CloudFormation, etc.) for consistent, scalable environments
- Own capacity modeling, growth forecasting, and infrastructure lifecycle planning

AI-Driven Automation & Intelligent Platforms
- Apply AI and machine learning techniques to automate platform operations and decision-making
- Use AI for:
  - Capacity forecasting and demand prediction
  - Anomaly detection in platform performance and system behavior
  - Automated root-cause analysis and incident correlation
  - Predictive scaling and infrastructure optimization
- Drive adoption of self-healing platform patterns where systems can respond automatically to failure or degradation
- Enable data pipelines, feature stores, and runtime environments required to support AI-enabled platform services
- Partner with data and engineering teams to productionize AI capabilities within core platform workflows

Platform Engineering & Developer Enablement
- Build shared platform capabilities that abstract complexity and enable product teams to scale independently
- Provide self-service infrastructure, golden paths, and opinionated platform tooling
- Standardize CI/CD, runtime environments, observability, and deployment patterns
- Reduce friction and cognitive load for application teams through strong platform design
- Measure and improve developer experience as a platform outcome

Reliability, Performance & Intelligent Operations
- Lead SRE practices focused on scalability, automation, and operational maturity
- Define and track SLIs/SLOs centered on throughput, latency, availability, and platform health
- Establish advanced observability (metrics, tracing, logging) as inputs to AI-driven insights
- Lead analysis of scaling failures, performance bottlenecks, and systemic inefficiencies
- Drive continuous improvement toward predictable, automated, and resilient operations

Required Qualifications
- 10+ years of experience designing and operating large-scale distributed systems
- 5+ years in senior technical leadership roles (Director, Principal, VP, or equivalent)
- Deep expertise in platform architecture, cloud-native design, and system scalability
- Strong hands-on experience with AWS, Azure, or GCP
- Proven experience with microservices, event-driven architectures, and distributed data systems
- Solid background in Infrastructure as Code and automation-first platform design
- Experience applying AI/ML concepts to operational or platform use cases

Preferred Qualifications
- Experience with high-volume transaction processing or real-time systems
- Strong Kubernetes and container platform experience
- Experience with event streaming platforms (Kafka or equivalent)
- Background modernizing legacy platforms at scale
- Experience with AI-assisted operations, AIOps, or intelligent monitoring platforms

Key Competencies
- Systems-level architectural thinking with a strong scalability mindset
- Ability to blend platform engineering and AI automation into practical solutions
- Technical credibility with senior engineers, architects, and leadership
- Pragmatic decision-maker who balances ideal architecture with real-world constraints
- Strong communicator who can translate technical strategy into business impact

30-60-90 Day Success Plan

First 30 Days - Understand & Assess
- Develop deep understanding of current platform architecture and scaling limits
- Review system topology, transaction paths, and performance characteristics
- Identify opportunities for automation, AI-driven optimization, and architectural simplification
- Build strong relationships across engineering, data, and product leadership

Days 31-60 - Architect & Automate
- Define target-state platform architecture with explicit scalability patterns
- Prioritize architectural improvements with the highest scale and automation leverage
- Introduce AI-enabled insights into observability, capacity, or incident analysis
- Establish platform standards, reference architectures, and design principles

Days 61-90 - Scale & Industrialize
- Deliver measurable improvements in throughput, latency, and platform stability
- Advance automation toward self-service, self-scaling, and self-healing capabilities
- Roll out platform-level AI automation for operations and performance optimization
- Finalize a multi-year platform and AI-automation roadmap
- Establish a culture of building intelligent systems designed to scale by default