Site Reliability Engineer
Job Title: Site Reliability Engineer Lead
Location: Atlanta, GA (Onsite)
Duration: 12+ Months

Job Purpose

This position is for a hands-on SRE lead focused on providing resilient, secure, scalable, and supportable services for a heterogeneous set of applications, across on-premises and cloud environments, that serve store systems. You will contribute to the strategy and delivery of the team and manage its day-to-day workload. This role requires building close relationships with our development, operations, engineering, database, and product organizations. You will be involved in designing resilient systems, defining and monitoring SLIs, SLOs, and BLAs, creating proactive, actionable alerts, and driving production incidents to resolution.

Responsibilities

- Provide thought leadership and set the technical direction for the SRE team and the overall development team.
- Define and manage projects to meet team objectives.
- Set individual goals and manage the personal growth of team members.
- Oversee and guide development to implement SRE strategies for a diverse set of SaaS applications and internal services.
- Serve as the face of a team responsible for the overall health, performance, and capacity of our business applications.
- Develop sustainable SRE practices around simplification and standardization.
- Drive the cultural standard for SRE, including defining ways of working, runbooks, and accountability across people, processes, and technology.
- Partner with other SRE and development teams and lead by example.

Knowledge And Experience

- 10+ years of application/systems engineering in 24x7 production services environments.
- BS in Computer Science, Computer Engineering, Math, or equivalent professional experience.
- Experience designing, deploying, and operating SaaS applications and cloud infrastructure (GCP or equivalent, and on-premises virtualized environments).
- Excellent troubleshooting skills spanning systems, networks, and code, using a systematic problem-solving approach.
- Demonstrated ability to lead diverse SRE and development teams.
- Fluency in one or more current-generation scripting languages used by SRE/DevOps professionals.
- Proficiency in monitoring and performance tools such as Dynatrace, Splunk, Google Analytics, and ELK.
- Experience implementing SRE strategies on Google Cloud.
- Strong communication skills.