Lead Linux DevOps Systems Engineer
***We are unable to provide sponsorship for this permanent, full-time role***
***Position is bonus eligible***
A prestigious financial institution is currently seeking a Senior Linux DevOps Systems Engineer with strong DevOps and Python automation experience. The candidate will be responsible for overseeing the design, implementation, and optimization of enterprise-wide Linux server infrastructure, with a focus on automation and containerization platforms across on-premises and cloud environments. This role provides technical leadership and strategic direction for Linux systems architecture, Ansible Automation Platform, Red Hat Satellite, OpenShift, and AWS cloud infrastructure while mentoring team members and ensuring high availability, security, and performance across all Linux systems. The position serves as the primary technical authority for complex Linux server challenges and drives innovation in infrastructure automation, cloud-native development, hybrid cloud integration, and enterprise disaster recovery solutions.
Responsibilities:
Lead the design, deployment, and maintenance of enterprise Linux server environments (RHEL, CentOS, Ubuntu, SUSE, Amazon Linux) with hands-on configuration and troubleshooting across on-premises and AWS cloud infrastructure
Plan, execute, and manage enterprise-wide Linux patching strategies including security patches, kernel updates, and critical vulnerability remediation across thousands of servers
Develop and maintain comprehensive disaster recovery (DR) plans for Linux infrastructure including RPO/RTO targets, failover procedures, and recovery testing schedules
Implement and enforce CIS (Center for Internet Security) benchmarks and security baselines across all Linux systems including automated compliance scanning, remediation, and reporting
Plan, execute, and manage RHEL operating system upgrades across enterprise environments including in-place upgrades (Leapp), migration strategies, and rollback procedures
Develop and implement infrastructure automation strategies using Ansible Automation Platform (AAP) including playbook development, workflow orchestration, and automation controller management
Manage and optimize Red Hat Satellite infrastructure for system provisioning, patch management, and content lifecycle management across the enterprise
Implement and manage automated patching workflows using Red Hat Satellite, Ansible, and AWS Systems Manager for both on-premises and cloud environments
Design, deploy, and manage AWS Linux EC2 instances including instance configuration, auto-scaling, and integration with AWS services
Create and maintain AMIs (Amazon Machine Images) and manage their lifecycle, including image hardening, patching, golden image development, and automated AMI pipeline creation
Implement AMI versioning strategies, testing procedures, and distribution processes across multiple AWS accounts and regions
Design and implement disaster recovery solutions including backup strategies, replication technologies, failover automation, and multi-region/multi-site architectures
Design and maintain NFS storage solutions and distributed file systems for enterprise applications
Architect, deploy, and manage OpenShift container platforms and Kubernetes environments in hybrid cloud configurations
Implement and support Red Hat Dev Spaces for cloud-native development workflows
Conduct regular DR drills and testing to validate backup and recovery procedures
Develop and maintain security hardening standards based on CIS benchmarks, STIG requirements, and organizational security policies
Manage incidents, requests, and change management processes using ITSM tools such as ServiceNow including ticket resolution, escalations, and SLA compliance
Maintain technical documentation, knowledge base articles, runbooks, and operational procedures in Confluence
Establish and enforce Linux server security standards, hardening procedures, and compliance protocols across on-premises and cloud environments
Oversee system performance monitoring, capacity planning, and optimization initiatives across all platforms
Provide escalation support for complex technical issues and lead incident response efforts
Collaborate with cross-functional teams including networking, storage, security, and application development
Drive continuous improvement initiatives and evaluate emerging Red Hat, AWS, and cloud-native technologies
Create and maintain comprehensive technical documentation, runbooks, and standard operating procedures
Participate in the on-call rotation and provide 24/7 support for critical systems as needed
Lead vendor management activities and coordinate with Red Hat and AWS support
Provide technical mentorship and guidance to Linux administrators and junior team members
Lead technical training sessions and knowledge transfer initiatives on Ansible, Satellite, OpenShift, AWS, patching, and DR procedures
Qualifications:
10+ years of progressive hands-on experience in Linux/Unix system administration
5+ years in a technical leadership or senior engineering role
Strong hands-on experience with Ansible Automation Platform (AAP) including automation controller, execution environments, and workflow development
Proven expertise in Red Hat Satellite for system lifecycle management and content management
Extensive experience planning and executing enterprise-scale Linux patching programs including change management, patch testing, and emergency patching procedures
Demonstrated experience designing and implementing disaster recovery solutions for Linux infrastructure including backup/restore, replication, and failover strategies
Demonstrated experience planning and executing RHEL OS upgrades across major versions (e.g., RHEL 7 to 8, RHEL 8 to 9) using Leapp and other upgrade methodologies
Extensive hands-on experience with AWS Linux EC2 instances, including Amazon Linux and RHEL on AWS
Demonstrated experience in AMI creation, customization, hardening, and lifecycle management
Proven track record of building automated AMI pipelines using tools such as Packer, Ansible, or EC2 Image Builder
Demonstrated experience with AWS cloud services and hybrid cloud architectures
Extensive hands-on experience with OpenShift container platform and Kubernetes orchestration
Demonstrated experience implementing and managing NFS and distributed storage solutions
Working knowledge of Red Hat Dev Spaces for development environment provisioning
Proven track record of designing and implementing large-scale automated Linux infrastructure in hybrid environments
Strong understanding of DevOps principles and CI/CD methodologies
Excellent problem-solving abilities and analytical thinking skills
Outstanding communication skills, with the ability to explain technical concepts to non-technical stakeholders
Strong project management capabilities and the ability to manage multiple priorities
Red Hat certifications (RHCE, RHCA) and/or AWS certifications (Solutions Architect, SysOps Administrator) highly preferred
Job Types: Full-time, Permanent
Benefits:
401(k)
Dental insurance
Health insurance
Paid time off
Vision insurance
Work Location: Hybrid remote in Chicago, IL 60605