JOBSEARCHER

Lead Analytics Platform Engineer (Enterprise Infrastructure) - Onsite
USC and GC
Hartford, CT

JD:

Position Summary:
The Technical Engineer Lead serves as the senior technical consultant and guide for the Advanced Analytics Engineering team within Enterprise Technology Solutions (ETS) at Travelers Insurance. This role is responsible for the management, upgrade, and support of the enterprise analytical platform serving 500+ analytics users across all Travelers lines of business. The position combines deep technical expertise in multi-platform analytics infrastructure with team leadership, business stakeholder engagement, and strategic technology planning. The role also encompasses organizational responsibilities including Disaster Recovery (DR) coordination, production pipeline signoff governance, and AI tool advocacy to modernize team workflows.

Core Platform Responsibilities:

1. Miniforge Python "Condaplus" - Linux, macOS, Windows, and AWS
o Manage and maintain the Condaplus Python distribution across all enterprise platforms. This includes version upgrades and rollouts (desktop, Linux server, Windows server, macOS), coordinating security vulnerability remediation across devices, managing Conda virtual environments for application deployments, integrating with Nexus IQ firewall for automated security scanning of packages, and supporting AWS-based analytics environments including EC2 AMIs. Condaplus serves as the foundation for Python-based analytics, data science workflows, and internal tool development across the organization.

2. SAS - Linux and Windows Across 9 Physical Servers
o Oversee the SAS analytics platform including SAS 9.4 on RHEL 8 Linux servers and SAS Enterprise Guide on Windows.
Responsibilities include planning and executing major SAS version upgrades (e.g., SAS 9.4M8, SAS Enterprise Guide 8.5), managing SAS/ACCESS client connectivity to enterprise databases (Teradata, DB2, Oracle) via ODBC configurations, coordinating user migration communications and timelines, troubleshooting SAS server performance and connectivity issues, and managing SAS web services and IIS integration for business-facing applications. This includes maintaining SAS environments across multiple lines of business including Claims, BI, and PI.

3. R Tools - Windows and 5 Physical Servers
o Manage the R analytics ecosystem including RStudio Server (transitioned from Commercial to Open Source, saving $66K+ annually), R version upgrades across platforms (Linux, Windows Desktop, AWS AMIs), Rtools for Windows package compilation, CRAN repository management and Nexus proxy configuration, and GitLab Copilot integration with RStudio Server for version control workflows. Ensure R environments are consistent and accessible for the analytics user community.

Software Products & Technologies Managed:
The following is a comprehensive inventory of the software products, platforms, and technologies that this role is responsible for managing, supporting, and maintaining across the enterprise.

Analytics Platforms
o SAS 9.4 (M8) - Enterprise analytics platform on RHEL 8 Linux (Claims, BI, PI environments)
o SAS Enterprise Guide 8.5 - Windows desktop analytics client for 500+ users
o SAS/ACCESS Clients - Database connectivity modules for Teradata, DB2, Oracle, and other data sources
o SAS Web Services / SAS Mid-Tier - Web-based SAS application delivery via IIS integration
o SAS Stored Processes - Server-side SAS programs invoked via web or application interfaces
o RStudio Server (Open Source) - R development environment on Linux servers
o R (multiple versions: 3.6.3, 4.4.3) - Statistical computing across Linux, Windows Desktop, and AWS AMIs
o Rtools - Windows-based R package compilation toolchain
o Condaplus (Miniforge Python) - Custom Python distribution on Linux, Windows, macOS, and AWS
o IBM SPSS - Statistical analysis software for business analytics users
o PyCharm Professional - Python IDE for analytics development
o KNIME Analytics Platform (Desktop) - Visual data science and analytics workflow tool

Infrastructure & Operating Systems
o RHEL 8 (Red Hat Enterprise Linux) - Primary Linux platform across 14+ servers
o Ubuntu 22.04 - AWS EC2 instances and GPU AMI environments
o Windows Server - SAS and R platform hosting, IIS web services
o GPFS (General Parallel File System) - High-performance shared storage (/gpfs2/PI_SharedData)
o IIS (Internet Information Services) - SAS web application hosting on Windows servers

AWS Cloud Services
o AWS EC2 - Elastic compute instances for analytics workloads and ML environments
o AWS S3 - Object storage for data lakes with security governance
o AWS Athena - Serverless SQL query service
o AWS AMI Management - Custom analytics AMIs (Cloudex) with R, Condaplus, encrypted EBS
o AWS Tagging Compliance - ICS-V2 Cloud Tagging Standard enforcement (AppId, SystemNumber, Owner, CapabilityId)
o Amazon Bedrock - Generative AI service (enabled for UConn project)
o Terraform Enterprise (TFE) - Infrastructure as Code for AWS resource provisioning
o Predictive Modeling Competition Infrastructure - Annual competition environment: 33 EC2 instances, dedicated GitLab server, S3 security governance (31 teams, 100+ participants)

Authentication & Security
o Quest Authentication Services (VAS) - Active Directory integration for Linux/SAS via PAM
o SSSD (System Security Services Daemon) - Open-source AD integration (Ubuntu/RHEL)
o Kerberos - Authentication protocol for AD-integrated Linux environments
o PAM (Pluggable Authentication Modules) - Linux authentication framework for SAS, SSH, and application access
o Nexus IQ / Nexus Repository - Package security scanning and repository proxy for Python, R, and Conda packages
o IIQ (IdentityIQ) REST API - Automated EC2 access provisioning and AD group registration
o SSL/TLS Certificate Management - Server certificate lifecycle management

Monitoring, Alerting & Operations
o RAMMON (Resource and Memory Monitoring) - Custom web-based server monitoring dashboard for 14+ servers
o Connect Direct - File transfer monitoring for RHEL 8 production job protection
o OOM (Out of Memory) Protection Scripts - Automated memory management and process governance
o Custom Email Alerting System - Proactive incident notification via smtpxfer.prodlb.travp.net
o Server Load Analysis Tools - CPU/RAM heatmap analysis for capacity planning
o Filesystem Health Check Scripts - Automated monitoring of GPFS and local filesystems

Internal Web Applications (Flask/Python)
o AA Engineering Portal - Central SSO-authenticated hub for all team applications
o Team Accomplishments Tracker - Monthly/quarterly achievement tracking with email reports and Excel import/export
o Documentation Inventory - Searchable documentation reference system (migrated from SharePoint Excel)
o Server Monitoring Dashboard (RAMMON) - Real-time RAM, CPU, and process monitoring across all servers

Development & Version Control
o Git / GitHub Enterprise - Version control and repository management for team code and configurations
o GitHub Copilot - AI-assisted code development integrated with RStudio Server and development workflows
o Nginx - Reverse proxy and web server for Flask application routing and SSL termination
o Gunicorn - Python WSGI HTTP server for production Flask deployments
o systemd - Linux service management for all deployed applications

Project Management & Collaboration
o ServiceNow - Incident management, user stories, and Scrumban workflow adoption
o SharePoint - Documentation hosting and team collaboration
o Agile/Scrumban - Sprint planning using ServiceNow Strategic Planning Workspace

Key Responsibilities:

Technical Leadership & Team Guidance
o Serve as the senior technical consultant to the Advanced Analytics Engineering team, providing guidance on architecture decisions, troubleshooting complex infrastructure issues, and setting technical direction for platform evolution. Consult with and mentor team members on best practices for server administration, deployment automation, and cross-platform integration. Responsible for the overall technical output and delivery of the engineering team.

Infrastructure Monitoring & Reliability
o Design and maintain comprehensive server monitoring and alerting solutions across 14+ Linux servers supporting analytics workloads. This includes RAM and CPU usage monitoring dashboards (RAMMON), automated memory management and OOM protection scripts, filesystem health checks, proactive email alerting (which reduced incidents/outages by 90%), and server load analysis for capacity planning conversations with lines of business.

Disaster Recovery Coordination
o Serve as the DR Coordinator for the Advanced Analytics organization. Ensure business continuity through proactive monitoring, documented recovery procedures, and regular DR testing. Coordinate with infrastructure teams and business stakeholders to maintain recovery readiness across all analytics platforms.

Production Pipeline Governance
o Serve as the pipeline signoff individual for the organization, maintaining rigorous testing and validation protocols for all production deployments. Ensure zero-impact rollouts of database drivers (e.g., Teradata Vantage), platform upgrades (e.g., SAS 9.4M8, SAS EG 8.5), and security patches (e.g., Condaplus 3.0 across 93 devices) through structured testing, communication plans, and staged deployment strategies.