Lead Site Reliability Engineer - Greater London, United Kingdom - Apollo Solutions

    Job Description

    Lead Cloud Site Reliability Platform Engineer

    London

    Hybrid - 2 days per week onsite

    Salary: Up to £120k

    Excellent Benefits and 20% Bonus

    My client, a global financial organisation, is looking for a Lead Cloud Site Reliability Platform Engineer to join their team, with a focus on keeping services running while supporting programme timescales and business outcomes. The role follows a hybrid working model.

    Lead Cloud Site Reliability Platform Engineer Responsibilities:

    • Leading the L1/L2 team to continually improve the cycle time and efficiency of incident & service request resolution, blameless post-mortems, and problem records.
    • Leading the team to ensure service tickets and incidents are resolved within SLA and effectively passed on to product teams, where L3/L4 support is required.
    • Driving several cloud compliance framework controls, such as annual DR and recovery testing and capacity management.
    • Continually improving the percentage of service tickets and incidents resolved by the team without escalation to another team.
    • Identifying the top reasons for service requests and incidents and addressing their root causes, thereby reducing the number of tickets quarter by quarter.
    • Providing thought leadership in operational areas such as change and release management, capacity management, and backup and recovery.
    • Ensuring the team is correctly skilled for its roles and identifying candidates to transition from Ops roles to SRE.

    Must-Haves:

    • Solid understanding of the SRE role and principles
    • Experience working with a wide range of products in Azure and GCP, Kubernetes, container registries, networking, etc.
    • Experience working with several CI/CD and infrastructure-as-code tools such as Terraform, GitHub, Azure DevOps, Jenkins, Chef, etc.
    • Experience leading an SRE or Operations team
    • Negotiating skills to influence technical and leadership decisions to achieve the right consumer outcomes and operational needs
    • A good understanding of public cloud security
    • Experience developing teams in a large, complex, highly regulated industry
    • Previous experience leading a team responsible for the public cloud estate
    • Azure or GCP certifications are desirable
    • Experience in handling risks and controls across technical platforms
    • Desire to learn and cross-skill

    Benefits:

    • Up to 15% pension contribution
    • 20% bonus
    • Hybrid working pattern
    • Private Healthcare
    • Access to Share Schemes

    If you are passionate about Cloud Site Reliability and want to be part of a dynamic team shaping the future of technology, please send your CV for a confidential discussion. Please note: no sponsorship is offered.