{"schemaVersion":"jobsearcher.job.v1","id":"eff3363bb6c005cf6b815c52","url":"https://jobsearcher.com/jobs/eff3363bb6c005cf6b815c52","canonicalUrl":"https://jobsearcher.com/jobs/eff3363bb6c005cf6b815c52","title":"DevOps / Observability Engineer","description":"About Incedo:\nIncedo is a global AI and data transformation specialist empowering companies to realize sustainable business impact from their digital investments by delivering ROI from AI@Scale. As a long-term partner for strategy to execution, we operate at the intersection of business and technology. Our integrated services and platforms are built on the foundation of AI & Data, digital engineering, and operations transformation, bringing deep domain expertise and full stack capabilities together. With over 4,000 people in the US, Canada, Latin America and India and a large, diverse portfolio of Fortune 500 enterprises and fast growing clients worldwide, we work across banking & payments, wealth management, telecom, hitech and life sciences.\nPlease visit the link to know about Incedo: https://www.incedoinc.com/\nDevOps / Observability Engineer (Mid-Level)\nDallas, TX/Tampa, FL/Basking Ridge, NJ\nPosition Overview\nWe are seeking a skilled and motivated DevOps / Observability Engineer with 4 to 6 years of experience to join our infrastructure team. In this role, you will focus heavily on system monitoring, metrics collection, and cluster orchestration. 
You will play a critical part in maintaining, optimizing, and scaling our cloud infrastructure and ensuring deep visibility into our distributed applications.\nKey Responsibilities\nObservability & Dashboards: Design, implement, and maintain scalable dashboards using Grafana to provide real-time visibility into system health and performance.\nMetrics Management: Deploy and manage advanced metrics and TSDB solutions such as GEM (Grafana Enterprise Metrics), VictoriaMetrics, Thanos, or Mimir to ensure efficient data retention and querying.\nContainer Orchestration: Manage and optimize containerized workloads across Kubernetes clusters, ensuring high availability and fault tolerance.\nCloud Infrastructure: Maintain and scale infrastructure within a Cloud environment (AWS/Azure/GCP), ensuring security, cost efficiency, and performance.\nAutomation & Scripting: Reduce toil and automate routine infrastructure tasks, deployment pipelines, and custom metrics collection using scripting languages.\nRequired Technical Skills & Experience\nOverall Experience: 2 to 4 years of professional experience in a DevOps, SRE, or Systems Engineering role.\nGrafana: 2–4 years of hands-on experience building complex alerts, dashboards, and plugins.\nKubernetes: 2–4 years of experience managing, troubleshooting, and deploying applications inside K8s.\nMetrics Stack: 2–4 years of experience working with at least one major distributed metrics engine: GEM, VictoriaMetrics, Thanos, or Mimir.\nCloud Platforms: 2–4 years of production experience in a cloud environment (e.g., AWS, GCP, or Azure).\nAutomation/Scripting: Proficiency in scripting (e.g., Python, Bash, or Go) for automation and writing Infrastructure as Code (IaC).\nPreferred Soft Skills\nStrong problem-solving skills and a proactive approach to identifying system bottlenecks.\nExcellent communication and collaboration skills to work effectively with cross-functional development teams.\nAbility to thrive in a fast-paced environment and adapt to evolving 
technologies.","company":"Incedo","rawCompany":"incedo","city":"Dallas","state":"TX","isRemote":false,"isActive":false,"createdAt":"2026-06-03T08:45:50.749Z","occupations":[{"code":"15-1299.08","title":"Computer Systems Engineers/Architects","slug":"computer-systems-engineers-architects"},{"code":"15-1252.00","title":"Software Developers","slug":"software-developers"},{"code":"15-1244.00","title":"Network and Computer Systems Administrators","slug":"network-and-computer-systems-administrators"}],"industries":[{"code":"541512","title":"Computer Systems Design Services","slug":"computer-systems-design-services"},{"code":"541511","title":"Custom Computer Programming Services","slug":"custom-computer-programming-services"},{"code":"518210","title":"Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services","slug":"computing-infrastructure-providers-data-processing-web-hosting-and-related-services"}],"jobPosting":{"@context":"https://schema.org","@type":"JobPosting","title":"DevOps / Observability Engineer","description":"About Incedo:\nIncedo is a global AI and data transformation specialist empowering companies to realize sustainable business impact from their digital investments by delivering ROI from AI@Scale. As a long-term partner for strategy to execution, we operate at the intersection of business and technology. Our integrated services and platforms are built on the foundation of AI & Data, digital engineering, and operations transformation, bringing deep domain expertise and full stack capabilities together. 
With over 4,000 people in the US, Canada, Latin America and India and a large, diverse portfolio of Fortune 500 enterprises and fast growing clients worldwide, we work across banking & payments, wealth management, telecom, hitech and life sciences.\nPlease visit the link to know about Incedo: https://www.incedoinc.com/\nDevOps / Observability Engineer (Mid-Level)\nDallas, TX/Tampa, FL/Basking Ridge, NJ\nPosition Overview\nWe are seeking a skilled and motivated DevOps / Observability Engineer with 4 to 6 years of experience to join our infrastructure team. In this role, you will focus heavily on system monitoring, metrics collection, and cluster orchestration. You will play a critical part in maintaining, optimizing, and scaling our cloud infrastructure and ensuring deep visibility into our distributed applications.\nKey Responsibilities\nObservability & Dashboards: Design, implement, and maintain scalable dashboards using Grafana to provide real-time visibility into system health and performance.\nMetrics Management: Deploy and manage advanced metrics and TSDB solutions such as GEM (Grafana Enterprise Metrics), VictoriaMetrics, Thanos, or Mimir to ensure efficient data retention and querying.\nContainer Orchestration: Manage and optimize containerized workloads across Kubernetes clusters, ensuring high availability and fault tolerance.\nCloud Infrastructure: Maintain and scale infrastructure within a Cloud environment (AWS/Azure/GCP), ensuring security, cost efficiency, and performance.\nAutomation & Scripting: Reduce toil and automate routine infrastructure tasks, deployment pipelines, and custom metrics collection using scripting languages.\nRequired Technical Skills & Experience\nOverall Experience: 2 to 4 years of professional experience in a DevOps, SRE, or Systems Engineering role.\nGrafana: 2–4 years of hands-on experience building complex alerts, dashboards, and plugins.\nKubernetes: 2–4 years of experience managing, troubleshooting, and deploying applications inside K8s.\nMetrics Stack: 
2–4 years of experience working with at least one major distributed metrics engine: GEM, VictoriaMetrics, Thanos, or Mimir.\nCloud Platforms: 2–4 years of production experience in a cloud environment (e.g., AWS, GCP, or Azure).\nAutomation/Scripting: Proficiency in scripting (e.g., Python, Bash, or Go) for automation and writing Infrastructure as Code (IaC).\nPreferred Soft Skills\nStrong problem-solving skills and a proactive approach to identifying system bottlenecks.\nExcellent communication and collaboration skills to work effectively with cross-functional development teams.\nAbility to thrive in a fast-paced environment and adapt to evolving technologies.","datePosted":"2026-06-03T08:45:50.749Z","dateModified":"2026-06-03T08:45:50.749Z","hiringOrganization":{"@type":"Organization","name":"Incedo","sameAs":"https://jobsearcher.com"},"jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Dallas","addressRegion":"TX","addressCountry":"US"}},"identifier":{"@type":"PropertyValue","name":"JobSearcher","value":"eff3363bb6c005cf6b815c52"},"url":"https://jobsearcher.com/jobs/eff3363bb6c005cf6b815c52"}}