Incident Engineer
Incident Engineer (Harmony / Kafka / CoreDataPipeline)

About the Job

As an Incident Engineer supporting one of our largest clients, you will work closely with Incident Managers, global peer Incident Engineers, and client-side senior development engineers and team leads to mitigate production incidents during your shift. You will use incident runbooks, existing documentation, independent research, investigation, troubleshooting, testing, and prior experience to restore services efficiently. This role provides exposure to modern data processing and streaming technologies, including Kafka-based systems, while continuously expanding your technical and operational expertise.

Responsibilities

- Act as a first-tier incident responder by promptly acknowledging alerts, analyzing incidents, and independently identifying accurate mitigation solutions
- Investigate incidents using runbooks, knowledge bases, documentation, and self-driven research
- Identify root causes, clearly document findings, and capture actionable learning points for team sharing
- Communicate proactively with Incident Managers and client-side senior engineers or team leads
- Escalate incidents as needed with thorough technical context and analysis
- Work independently on incidents while maintaining strong attention to detail and accuracy
- Effectively multitask across incidents, operational tickets, and project work in a fast-paced environment
- Recognize recurring patterns and connect insights across multiple incidents
- Continuously expand technical knowledge through documentation, runbooks, and online resources
- Prepare and present demos or knowledge-sharing sessions based on learnings
- Perform routine operational tasks such as cluster or machine maintenance and access or permission management

Skills Summary

- Incident Response & Production Support
- Kafka & Streaming Platforms
- Databricks
- Distributed Systems Troubleshooting
- Cloud Platforms (AWS)
- Linux & Command-Line Environments
- Scripting & Automation (Bash, Python, JSON)
- Workflow Orchestration (Airflow)
- Containerization & Kubernetes
- Monitoring & Observability Tools
- Technical Documentation
- Cross-Functional Communication
- Attention to Detail & Multitasking

Minimum Qualifications

- Associate degree in Computer Science, Information Systems, a related technical field, or equivalent practical experience
- 2 years of experience in technical support, operations, QA, software development, or related technical roles
- Strong analytical, troubleshooting, and communication skills
- Ability to learn new technologies quickly and work independently
- Quick learner, self-sufficient in learning a broad array of topics in a short amount of time from video trainings or documentation
- Interest in diving into technical fields and applying extensive troubleshooting skills
- Uplifting, empathetic, and a marvelous team player
- Growth mindset and lifelong learner

Preferred Qualifications

- Bachelor's degree or higher in a technical field, or additional relevant practical experience
- Hands-on experience supporting or troubleshooting distributed or streaming systems
- Familiarity with Kafka or similar messaging platforms
- Ability to work with multiple differing priorities in a fast-paced, constantly changing environment
- Ability to work effectively in a fast-paced, constantly changing environment
- Excellent communication, logical thinking, and analytical skills
- Multitasking acumen, resilience, and the ability to handle pressure

Nice-to-Have Qualifications

- Experience with data streaming and processing technologies such as Kafka and ZooKeeper
- Exposure to AWS infrastructure and cloud services
- Experience with Databricks or other data warehousing platforms
- Familiarity with Linux environments and command-line tools
- Experience with scripting languages such as Bash or Python, and with data formats such as JSON
- Exposure to Airflow, Kubernetes, containers, or similar orchestration technologies
- Experience with monitoring, logging, or collaboration tools such as GitHub, Datadog, Grafana, Jira, or Confluence