Site Reliability Engineer
Asset Management Technology is an integral partner of Asset Management and is responsible for delivering innovative, scalable, industry-leading investment tools that enable the business to achieve competitive advantage globally.

Members of the Production Services & SRE team regularly work to:
- Assess the business impact of incidents via customer interaction
- Troubleshoot and analyze batch and application issues
- Resolve incidents through application tools and databases
- Coordinate the migration of fixes with development teams into Production
- Provide on-call support on weekdays and weekends

Role
We are looking for a highly motivated individual to join our Production Services / SRE Team supporting the Fidelity Asset Management Fixed Income division. You will provide level 2 application and batch support in a fast-paced, high-energy, and collaborative work environment. This is a unique opportunity to work on a wide variety of technologies and gain exposure to the financial applications of a core Fidelity business.

The Expertise You Have
- Bachelor's degree or higher in a technology-related field (e.g. Engineering, Computer Science), or equivalent experience, required; Master's degree a plus
- 10+ years of hands-on experience deploying and/or supporting highly distributed multi-tiered systems at scale
- 4+ years of experience in Cloud support (AWS) and migration skills; experience building and operating highly resilient platforms in AWS cloud environments
- Hands-on experience with container orchestration, preferably with Kubernetes
- Experience operating and implementing distributed and highly concurrent service-based systems
- Ability to automate with various scripting languages (Python, Shell scripting, etc.)
- Experience managing systems using infrastructure as code tools (IAM, ARM, Terraform, Chef)
- Proven understanding of Cloud Computing and DevOps concepts, including CI/CD pipelines
- Experience in instrumentation, with systems skills in building and operating monitoring, logging, and alerting services for distributed systems at scale
- Proven experience in maintaining scalability and resiliency in a complex environment
- Proven experience in implementing advanced observability practices and techniques at scale
- Demonstrated ability to apply modern monitoring tools (DataDog, Prometheus, Splunk)
- Extensive experience with Oracle and PL/SQL
- Ability to analyze code (PL/SQL, Perl, Shell script) to troubleshoot and propose long-term solutions
- Expertise in writing PL/SQL code, Stored Procedures, Functions, Triggers, etc.
- Demonstrate strong analytical and problem-solving skills while working with the team in daily operations, including resolution of data quality issues, analysis of system functionality, and responding to system alerts, business queries, and report requests
- Provide prompt support for Trading, Portfolio Management and Research applications
- Proven technical background with a high degree of comfort with Linux, database systems, networking concepts, and ITSM-based standards
- Experience with batch scheduling software; Autosys a plus
- Review platforms used by the client and suggest improvements in automation, continuous integration/continuous deployment (CI/CD) practices, security, and platform services

The Skills You Bring
- Your real passion for technology and solving problems in a fast-paced and dynamic environment
- Your ability to lead and provide mentorship to other team members
- You have good social skills and the ability to communicate effectively in both written and verbal form
- You learn quickly, have an analytical approach, and enjoy finding ways to continuously improve
- Your previous experience in financial services