Analytics and Reporting - Senior Analytics Engineer
Data Engineer

Position location can be anywhere near a Client location within NJ. Any site can be used, with a 50% on-site requirement.

Top Skills:
- Advanced SQL skills (5+ years)
- 2+ years of experience working with dbt
- 5+ years working with relational databases
- MS in Computer Science, Chemical Engineering, Biostatistics or similar with 6 years of industry experience, or PhD in Computer Science, Chemical Engineering, Biostatistics or similar with 3 years of industry experience
- Intermediate Python skills
- Intermediate visualization (Tableau, dashboarding) experience

Responsibilities:
- Performs data engineering, preprocessing, exploratory data analysis, and model development by interacting with a variety of databases
- Responsible for ingestion, integration, and delivery of data across multiple platforms
- Works to maintain and uphold data integrity and clean data principles
- Responsible for leading team code review and improving team programming practices
- Responsible for independently coordinating and managing analytics projects across several departments and with cross-functional stakeholders
- Ability to work on a global team and communicate across several time zones
- Communicates with team members regularly to provide updates and collaborate on deliverables
- Accountable for leading, documenting, and managing analytics URS and UAT through execution for GPO
- Designs and delivers digital solutions that streamline access to analytics and data
- Works with domain SMEs to derive insight and value to improve manufacturing-related data transformations and improvement initiatives
- Displays a high level of teamwork and collaboration both within and across functions
- Utilizes supervised or unsupervised methods, learning from vast amounts of unlabeled data to drive insight
- Experience working with unstructured text
- Ensures life cycle management of code is maintained through version control and associated repositories
- Develops high-quality analytical and statistical models, insights, patterns, and visualizations that can be used to improve decision making in manufacturing operations
- Responsible for documentation of all technical work both within and outside of formal document management systems
- Independently develops code and analytical models to automate data transformation and analysis

Requirements:
- MS in Computer Science, Chemical Engineering, Biostatistics or similar with 3-6 years of industry experience, or PhD in Computer Science, Chemical Engineering, Biostatistics or similar with 3 years of industry experience
- Dashboard development experience (Tableau, Spotfire, Dash)
- Proficient in writing and developing analytical and machine learning models using Python modules including pandas, NumPy, scikit-learn, and TensorFlow
- Experience developing and implementing MLOps pipelines
- Experience building analytical and statistical models to answer key business questions
- Experience using git via the command line
- Strong understanding of core statistical concepts to solve real-world problems
- Intermediate to advanced proficiency in SQL (3+ years of post-academia experience as an independent contributor designing and delivering data solutions)
- Experience interacting with various data warehouses and large-scale, complex datasets using ETL and BI tools and platforms
- Self-motivated to identify and propose Client methodologies that will drive increased efficiency
- Demonstrates expert knowledge in machine learning and rule-based systems as applied to computational linguistics and natural language processing, as well as development and execution of annotation tasks with teams of experts
- Proficiency in mathematics with the skill to translate complex mathematical algorithms into usable computational methods
- Experience with data mining and analysis techniques across disparate data sources
- Experience working in Linux/UNIX environments
- Experience interacting with PostgreSQL, Oracle, Cloudera Impala, Okera or similar databases
- Experience with JupyterLab, Anaconda, and RStudio
- Intermediate proficiency with Python
- Experience developing visualizations using a variety of methods (plotly, matplotlib, seaborn)
- Experience working within Domino Data Lab projects
- Technical knowledge of performance tuning and query optimization across large data sets
- Experience with data cataloguing and enablement through APIs
- Experience with a variety of programming and markup languages (C++, Java, HTML/CSS)
- Exposure to bioprocess engineering/cell therapy data
- Knowledge of GxP requirements (preferably related to data and code management)
- Experience with program/project management; Scrum experience highly desired

Preferred:
- Familiar with NET/SAP
- Knowledge of deep learning methods for NLP (quantitative area of study, Computer Science, preferred)
- Strong background and demonstrable experience in Natural Language Processing and Computational Linguistics is required
- Experience working with the pharmaceutical industry
- Experience working with ERP systems