Reliability Engineer, AI & Data Platforms (AiDP)
Austin, TX
March 30th, 2026
Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms handling petabytes of data at scale. We are looking for a passionate and independent Software Engineer specializing in reliability engineering for data platforms, with a strong understanding of data and ML systems. If you thrive in a fast-paced environment, love crafting solutions that don't yet exist, and possess excellent communication skills to collaborate across diverse teams, we invite you to contribute to Apple's high standards in an exciting and dynamic setting.
As part of our team, you will be responsible for developing and operating our big data platform using open source or other solutions to aid critical applications, such as analytics, reporting, and AI/ML apps. This includes optimizing performance and cost, automating operations, and identifying and resolving production errors and issues to ensure the best data platform experience.
Expertise in designing, building, and operating critical, large-scale distributed systems with a focus on low latency, fault tolerance, and high availability. Experience contributing to open source projects is a plus. Experience with multiple public cloud infrastructures, managing multi-tenant Kubernetes clusters at scale, and debugging Kubernetes/Spark issues. Experience with workflow and data pipeline orchestration tools (e.g., Airflow, DBT). Understanding of data modeling and data warehousing concepts. Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs). A learning attitude to continuously improve yourself, the team, and the organization. Solid understanding of software engineering best practices, including the full development lifecycle, secure coding, and experience building reusable frameworks or libraries.
3+ years of professional software engineering experience with large-scale big data platforms, including strong programming skills in Java, Scala, Python, or Go.
Proven expertise in designing, building, and operating large-scale distributed data processing systems with a strong focus on Apache Spark. Hands-on experience with table formats and data lake technologies such as Apache Iceberg, ensuring scalability, reliability, and optimized query performance. Skilled at coding for distributed systems and developing resilient data pipelines. Strong background in incident management, including troubleshooting, root cause analysis, and performance optimization in complex production environments. Proficient with Unix/Linux systems and command-line tools for debugging and operational support.