DevOps / Observability Engineer

ARCHIVED
Incedo | Dallas, TX | June 3rd, 2026

About Incedo

Incedo is a global AI and data transformation specialist empowering companies to realize sustainable business impact from their digital investments by delivering ROI from AI@Scale. As a long-term partner for strategy to execution, we operate at the intersection of business and technology. Our integrated services and platforms are built on the foundation of AI & Data, digital engineering, and operations transformation, bringing deep domain expertise and full-stack capabilities together. With over 4,000 people in the US, Canada, Latin America, and India, and a large, diverse portfolio of Fortune 500 enterprises and fast-growing clients worldwide, we work across banking & payments, wealth management, telecom, hi-tech, and life sciences.

To learn more about Incedo, visit: https://www.incedoinc.com/

DevOps / Observability Engineer (Mid-Level)
Dallas, TX / Tampa, FL / Basking Ridge, NJ

Position Overview

We are seeking a skilled and motivated DevOps / Observability Engineer with 4 to 6 years of experience to join our infrastructure team. In this role, you will focus heavily on system monitoring, metrics collection, and cluster orchestration. You will play a critical part in maintaining, optimizing, and scaling our cloud infrastructure and ensuring deep visibility into our distributed applications.

Key Responsibilities

Observability & Dashboards: Design, implement, and maintain scalable dashboards using Grafana to provide real-time visibility into system health and performance.
Metrics Management: Deploy and manage advanced metrics and TSDB solutions such as GEM (Grafana Enterprise Metrics), VictoriaMetrics, Thanos, or Mimir to ensure efficient data retention and querying.
Container Orchestration: Manage and optimize containerized workloads across Kubernetes clusters, ensuring high availability and fault tolerance.
Cloud Infrastructure: Maintain and scale infrastructure within a cloud environment (AWS/Azure/GCP), ensuring security, cost efficiency, and performance.
Automation & Scripting: Reduce toil and automate routine infrastructure tasks, deployment pipelines, and custom metrics collection using scripting languages.

Required Technical Skills & Experience

Overall Experience: 2 to 4 years of professional experience in a DevOps, SRE, or Systems Engineering role.
Grafana: 2–4 years of hands-on experience building complex alerts, dashboards, and plugins.
Kubernetes: 2–4 years of experience managing, troubleshooting, and deploying applications inside K8s.
Metrics Stack: 2–4 years of experience working with at least one major distributed metrics engine: GEM, VictoriaMetrics, Thanos, or Mimir.
Cloud Platforms: 2–4 years of production experience in a cloud environment (e.g., AWS, GCP, or Azure).
Automation/Scripting: Proficiency in scripting (e.g., Python, Bash, or Go) for automation and writing Infrastructure as Code (IaC).

Preferred Soft Skills

Strong problem-solving skills and a proactive approach to identifying system bottlenecks.
Excellent communication and collaboration skills to work effectively with cross-functional development teams.
Ability to thrive in a fast-paced environment and adapt to evolving technologies.