Site Reliability Engineer
Date Active: May 31, 2022 12:40:25 PM
Hours Per Week: 40
Location: 436 Slater Road-HF308
Job Description / Requirements
If you’re looking for a meaningful career, you’ll find it here at Webster. Founded in 1935, our focus has always been to put people first, doing whatever we can to help individuals, families, businesses and our colleagues achieve their financial goals. As a leading commercial bank, we remain passionate about serving our clients and supporting our communities. Integrity, Collaboration, Accountability, Agility, Respect, and Excellence are Webster’s values; they set us apart as a bank and as an employer.
Come join our team where you can expand your career potential, benefit from our robust development opportunities, and enjoy meaningful work!
The SRE is a key technical role in Webster's Software Engineering organization. The primary focus of the SRE is what ultimately matters to Webster's internal and external customers: making sure the platforms and services they rely on are highly reliable and available when they need them. The SRE is expected to run the production environment by monitoring availability and taking a holistic view of system health. The SRE will combine engineering experience and the drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. The SRE will bring fresh ideas, demonstrate a unique and informed viewpoint, and enjoy collaborating with a cross-functional team to develop real-world solutions and a positive user experience. The SRE will also lead the technical efforts involved in moving and refactoring Webster's in-house applications to cloud infrastructure.
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
- Set a high bar for reliability and availability, and meet that bar through automation and relentless improvement.
- Balance feature development speed and reliability with well-defined service level objectives.
- As a key Tech resource on Agile teams, help improve services through rigorous development, testing and release procedures.
- Lead the diagnosis and root cause analysis of technology incidents, and pursue improvements to ensure incidents and problems do not recur.
- Be a key player in deliberations on system design, platform management, and capacity planning.
- Have a strong 'detective' mindset on why things don't work, and be among the first to offer and work on solutions.
- Advocate for and lead Automation wherever possible
- Be a 'link' between Technologists and Business, able to have conversations with Line of Business (LOB) and Tech Agile Teams to work through challenges.
- Own the Tech work involved in migrating in-scope Webster business apps to the Cloud.
- Bachelor's degree required.
- 5+ years' experience designing, building, and maintaining business applications.
- Experience in setting up SLAs/SLOs/SLIs for critical services and establishing monitoring around them.
- Deep understanding of Operating Systems and Application architecture.
- Experience with multiple programming languages, configuration management tools, and containers strongly preferred.
- Strong knowledge of various Agile methodologies (Scrum, Kanban, XP, etc.), plus the Scaled Agile Framework (SAFe).
- Hands-on experience with cloud migration.
- Deep understanding of the systems development life cycle.
- Excellent written and verbal skills.
- Ability to multitask in a fast-paced, high-pressure environment.