Databricks Engineer
Build What Matters As a Databricks Engineer At RevStar

RevStar is a Databricks Partner launching a cloud-agnostic practice focused on Data, Machine Learning (ML), and Artificial Intelligence (AI) services. Our mission is to help businesses modernize their data platforms, optimize analytics workflows, and implement scalable AI-driven solutions using Databricks.

We are passionate about what we build and how we build it. From architecture and design to coding and delivery, we approach each project with an agile mindset, continuously analyzing goals and business needs to ensure optimal outcomes.

At RevStar, we foster a collaborative, remote-first culture where teams freely share ideas, innovate together, and grow both individually and collectively. By joining us, you'll have the opportunity to work with cutting-edge technologies across diverse industries, delivering value-driven products for clients who prioritize quality and performance. We believe in the pursuit of better, not just in cloud-native app development, but in creating meaningful experiences and outcomes that matter.

We are seeking a highly skilled Databricks Engineer to join our team. This role will be hands-on, working closely with architects, data scientists, and customers to build, optimize, and deploy high-performance data and AI solutions.

As a Databricks Engineer, you will be responsible for building and optimizing data pipelines, implementing data processing frameworks, and enabling AI/ML solutions within Databricks. You will work across data ingestion, transformation, and orchestration while ensuring scalability, performance, and security.

This is a technical, hands-on role, requiring expertise in Apache Spark, Delta Lake, and MLOps, as well as experience working with large-scale data architectures.
You will collaborate with architects and business stakeholders to ensure solutions align with customer needs and best practices.

Above all, the ideal candidate embodies RevStar's core values:
• Self-Mastery: We hold a high bar for how we think, communicate, and improve
• Ownership: We own outcomes, not just effort
• Shared Destiny: We rise or fall together

Key Responsibilities

Data Engineering & Pipeline Development
• Develop and optimize data pipelines using Apache Spark and Delta Lake within Databricks
• Implement ETL/ELT workflows, ensuring efficient data ingestion, transformation, and storage
• Design Lakehouse architecture-based solutions that scale across structured and unstructured data sources
• Integrate Databricks with cloud storage solutions (Azure Data Lake, AWS S3, Google Cloud Storage) for seamless data management

Performance Optimization & Automation
• Optimize Spark jobs for scalability, cost efficiency, and low latency
• Implement monitoring and alerting solutions to track job performance and detect failures
• Develop automated data validation, testing, and quality assurance processes

Management AI/ML Integration & MLOps Support
• Support ML model training and deployment within Databricks, integrating with MLflow for experiment tracking and model versioning
• Collaborate with data scientists and ML engineers to enable scalable AI solutions
• Implement feature engineering pipelines and integrate models into production environments

Security, Governance & Best Practices
• Ensure data security, access control, and compliance with industry standards (GDPR, HIPAA, SOC 2, etc.)
• Follow Databricks best practices for data lineage, governance, and metadata management
• Document processes, configurations, and best practices for internal and client use

Requirements

Must-Have:
• 3+ years of hands-on experience in data engineering, with a focus on big data processing and cloud-native architectures
• 2+ years of hands-on experience with Databricks, including Apache Spark, Delta Lake, and MLflow
• Databricks Certifications (Mandatory): Databricks Certified Data Engineer Associate (or higher)
• Proficiency in Python, SQL, and Spark-based frameworks
• Experience in developing and optimizing large-scale ETL/ELT pipelines
• Strong understanding of Lakehouse architecture and cloud-agnostic data solutions
• Familiarity with CI/CD pipelines and Infrastructure-as-Code (IaC) for Databricks (e.g., Terraform, Databricks CLI)
• Knowledge of data governance, security, and compliance best practices
• Experience working in Agile development environments, following DevOps/MLOps best practices

Nice-to-Have:
• Additional Databricks Certifications (e.g., Databricks Certified Machine Learning Associate)
• Experience with real-time streaming solutions (e.g., Kafka, Kinesis, Event Hub)
• Familiarity with cloud storage and orchestration tools (e.g., Apache Airflow, Prefect)
• Background in AI/ML integration within Databricks, assisting in feature engineering and model deployment
• Experience working in client-facing roles or consulting environments

Benefits
• Paid Time Off - Take the time you need to recharge and stay productive
• Remote-First Working Environment - Collaborate from anywhere while staying connected with our global team
• Comprehensive Health Coverage - Medical, Dental, Vision
• 401(k) Retirement Plan - Plan for your future with access to a company-sponsored 401(k) program
• Annual Learning & Development Stipend - Invest in your skills with conferences, certifications, or courses
• Peer Mentorship & Coaching - Learn from experienced engineers, product managers, and architects to accelerate your growth
• Professional Growth Opportunities - Exposure to cutting-edge AWS GenAI, data, and cloud technologies across diverse industries
• Company Outings & Volunteer Opportunities - Build relationships and give back to the community
• Collaborative, Innovative Culture - Work alongside top talent in a fast-paced, supportive environment that values curiosity and initiative