JOBSEARCHER

SRE Engineer

Responsibilities:
- Troubleshoot incidents, conduct blameless post-mortems, and ensure the permanent closure of incidents.
- Engage with the development team throughout the life cycle to help build software for reliability.
- Apply analytics to historical data, such as incidents and usage patterns, to predict issues and take proactive action.
- Drive the adoption of self-healing and resiliency patterns such as circuit breaker and bulkhead.
- Design and conduct performance tests, and identify bottlenecks and opportunities for optimization.
- Define and drive the adoption of best-in-class monitoring frameworks to achieve end-to-end flow monitoring and noiseless alerting.
- Design, develop, test, and deliver software to automate manual operational work.
- Deploy software and product upgrades.
- Maximize the speed of delivery while objectively adhering to the error budgets of the service.
- Manage the split of effort between manual operational work and engineering work.
- Coach other team members and manage teams as needed.

Required Skills:
- Excellent debugging and troubleshooting skills.
- Expertise in performance monitoring and capacity management of large systems using various tools.
- Expertise in at least one technology stack (Java/J2EE/Python), including designing, coding, testing, and delivering software.
- Expertise in at least one relational database (SQL Server, Oracle, DB2, etc.).
- Hands-on experience with cloud technologies (Cloud Foundry, Kubernetes, AWS).
- Hands-on experience with big data services (Hadoop, HDFS, Hive, YARN, HBase, Kafka, ZooKeeper).
- Working knowledge of Groovy, batch scripting, PowerShell, or shell scripting.
- Experience developing, deploying, and debugging distributed systems in a Linux and Hadoop environment.
- Experience with monitoring tools such as AppD, Splunk, ELK, and Geneos.
- Experience analyzing SLI metrics and performance data, and interpreting and correlating them with SLOs and SLAs.
- Experience with deployment automation, CI/CD, DevOps, Jenkins, Git, and Bitbucket.
- Experience with cloud/container environments, big data, and analytical tools (Tableau, Alteryx).
- Expert practitioner in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm.
- Working knowledge of infrastructure components such as routers, load balancers, and networks.
- Comfortable working in Agile mode and proficient in continuous integration and continuous delivery.
- Solid understanding of microservice design methodologies.
- Attention to detail and time-management skills.

Job Type: Contract
Pay: $70.00 per hour
Schedule: 8 hour shift
Education: Bachelor's (Required)

Experience:
- Debugging and troubleshooting: 9 years (Required)
- Monitoring and capacity management: 9 years (Required)
- Java/J2EE/Python: 9 years (Required)
- SQL Server, Oracle, DB2: 9 years (Required)
- Cloud Foundry, Kubernetes, AWS: 9 years (Required)
- Hadoop, HDFS, Hive, YARN, HBase, Kafka, ZooKeeper: 9 years (Required)
- Groovy, batch scripting, PowerShell, or shell scripting: 9 years (Required)
- Linux, Hadoop environment: 9 years (Required)
- AppD, Splunk, ELK, Geneos: 9 years (Required)
- SLI metrics: 9 years (Required)
- SLOs and SLAs: 9 years (Required)
- Automation, CI/CD, DevOps, Jenkins, Git, Bitbucket: 9 years (Required)
- Tableau, Alteryx: 9 years (Required)
- Routers, load balancers, and networks: 9 years (Required)

Work Location: In person