NOC Engineer / NOC Analyst
Location: Redmond, WA (Onsite)
Duration: 24 Months
Start Date: May 18, 2026
Shift: 24x7 Rotational (Including weekends & on-call)
Shift Timing: 5:00 PM – 5:00 AM PT (M–Sun)

Role Summary

We are seeking a skilled NOC Engineer / NOC Analyst responsible for 24x7 monitoring, incident management, and operational support of a large-scale hybrid IT infrastructure. The role ensures high availability, performance, and reliability across production, disaster recovery, and non-production environments.

Key Responsibilities

Infrastructure Monitoring & Operations
- Monitor 1200+ servers (Windows/Linux), virtualization platforms (VMware, Nutanix), and web servers
- Manage large-scale storage systems (PB-level: Quantum, Isilon, NAS, SAN)
- Monitor network devices (switches, routers, firewalls, VPNs, WAPs, ISP circuits)
- Handle alerts, incidents, and service requests across infrastructure tools

Incident & Event Management
- Perform L1/L2 triage for alerts, incidents, and outages
- Ensure SLA-based resolution, escalation, and communication
- Correlate alerts across monitoring tools to identify root causes

Application & Service Monitoring
- Monitor 50+ applications across Prod, DR, UAT, and Dev environments
- Track application health, availability, and dependencies

Capacity & Performance Management
- Monitor compute, storage, and network utilization trends
- Identify bottlenecks and recommend improvements

Change & Release Support
- Support deployments, patching, and maintenance activities
- Validate system health before and after changes

Disaster Recovery & Resilience
- Support DR readiness and failover testing
- Participate in DR drills and validation

Reporting & Documentation
- Maintain dashboards, runbooks, and incident documentation
- Provide daily/weekly health reports and SLA metrics

Required Skills

Technical Skills
- Windows & Linux server administration (L1 / L1.5 troubleshooting)
- Virtualization: VMware & Nutanix
- Storage: SAN/NAS, Isilon, Quantum (or similar large-scale storage)
- Networking fundamentals: TCP/IP, DNS, VPN, Firewalls, Load Balancers (F5)
- Monitoring tools: New Relic, Splunk, Nagios, Zabbix, Dynatrace, SCOM
- ITSM tools: ServiceNow (preferred)
- Backup tools: Rubrik

Operational Skills
- Experience in 24x7 support environments
- Strong incident management & escalation handling
- Analytical troubleshooting skills
- Ability to handle high-pressure outage scenarios
- Strong communication & coordination skills
- Documentation and reporting expertise

Preferred Qualifications
- ITIL Foundation Certification
- Experience in enterprise or MSP environments
- Exposure to cloud platforms like AWS or Azure

Deliverables
- Process documentation and operational flows
- Knowledge transfer to client teams
- Participation in project deliverables and data maintenance
- Provide best practices and innovative solutions
- Technical support and configuration as required