Data Engineer with PySpark
Title: Data Engineer (PySpark)
Location: Rocky Hill, CT (Onsite)

Job Summary
We are seeking a highly skilled Data Engineer with strong experience in PySpark, Databricks, and Apache Spark to design, build, and optimize scalable data pipelines. The ideal candidate will have a solid background in big data processing, cloud platforms, and distributed systems, with a focus on delivering high-quality, reliable data solutions.

Key Responsibilities
- Design, develop, and maintain scalable data pipelines using PySpark and Apache Spark
- Build and optimize ETL/ELT workflows on Databricks
- Collaborate with data scientists, analysts, and stakeholders to understand data requirements
- Ensure data quality, integrity, and governance across data platforms
- Optimize performance of Spark jobs and large-scale data processing systems
- Work with structured and unstructured data from multiple sources
- Implement data storage solutions using cloud platforms (AWS, Azure, or GCP)
- Monitor, troubleshoot, and resolve data pipeline issues
- Maintain documentation for data architecture and workflows

Required Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field
- Strong hands-on experience with PySpark and Apache Spark
- Proven experience working with Databricks
- Proficiency in Python and SQL
- Experience with distributed data processing and big data technologies
- Familiarity with cloud platforms (AWS, Azure, or GCP)
- Experience with data warehousing solutions (e.g., Snowflake, Redshift, BigQuery)
- Strong problem-solving and analytical skills

Infowave Systems is an equal opportunity employer that is committed to diversity and inclusion in the workplace.