    Senior Manager, Site Reliability - United Kingdom - Cambium Learning Group

    Location: Remote · Time type: Full time · Posted: 3 days ago · Job requisition ID: REQ-3351

    Overview:

    As the Senior Manager of Site Reliability, you will play a crucial role in ensuring the stability, performance, and security of our SaaS applications. You will lead a team of skilled professionals responsible for maintaining and enhancing the reliability of our systems through robust observability, monitoring, threat detection, and mitigation strategies. The ideal candidate will bring extensive experience in managing complex SaaS environments and a deep understanding of best practices in site reliability engineering.

    Job Responsibilities

    Team Leadership:

      • Lead and mentor a team of site reliability engineers to ensure a high level of expertise and efficiency.
      • Drive initiatives to enhance the technical skills and efficiency of the team.
      • Foster a culture of collaboration, innovation, and continuous improvement.

    Hands-On Technical Leadership:

      • Actively contribute to the design, implementation, and maintenance of observability, monitoring, and security systems.
      • Lead by example, working hands-on to troubleshoot issues and optimize system performance.

    Observability and Monitoring:

      • Develop and implement comprehensive observability and monitoring strategies to proactively identify and address potential issues before they impact system performance.
      • Collaborate with development leadership to improve the performance and scalability of services by providing relevant, actionable metrics in the early stages of development.
      • Utilize industry-leading tools and practices to maintain visibility into the health and performance of our systems.

    Threat Detection and Mitigation:

      • Design and implement robust security measures to detect and mitigate potential threats to our SaaS infrastructure.
      • Stay informed about the latest cybersecurity threats and trends, and implement proactive measures to safeguard our systems.

    Incident Response:

      • Actively participate in incident response activities, leading the team to quickly resolve and learn from incidents.
      • Develop and maintain incident response plans to ensure a rapid and effective response to any service interruptions or security incidents.
      • Conduct post-incident analyses to identify root causes and implement preventive measures.

    Infrastructure Optimization:

      • Collaborate with cross-functional teams to optimize the performance and scalability of our infrastructure.
      • Implement automation and efficiency improvements to enhance overall system reliability.

    Job Requirements

    • Bachelor's degree in Computer Science, Information Technology, or a related field.
    • Proven hands-on experience (5+ years) in a site reliability engineering or similar role.
    • Leadership experience (3+ years) with a focus on technical mentorship and skill development.
    • In-depth knowledge of observability tools, monitoring systems, and security best practices.
    • Proven leadership and team management skills.
    • Excellent problem-solving and communication abilities.
    • In-depth experience with AWS.

    To learn more about our organization and the exciting work we do, visit

    An Equal Opportunity Employer

    We are dedicated to fostering a culture that celebrates unique backgrounds, ideas, and experiences. All qualified applicants will receive consideration for employment without discrimination on the basis of race, color, age, religion, sex, gender, gender identity/expression, sexual orientation, national origin, protected veteran status, or disability.

    About Us

    Simplicity - Across all our teams and all areas of our business, we create simplicity, making things easier and more clear for all those we work with.

    Certainty - We continually strive to eliminate doubt, delivering solutions, services and communications that our customers know they can count on.

    Now - We understand the need to make a difference not only for the future, but for today, and our people are committed to making the most of each moment we spend serving our customers.


  • Lorien London, United Kingdom

    Site Reliability Engineer · Location: London (hybrid remote working) · **Salary**: Up to £100,000 + Very Generous Benefits Package · One of the fastest-growing ecommerce organisations requires a Site Reliability Engineer to help be the glue between the company's Dev, QA and Produc ...


  • Austin Werner Ltd London, United Kingdom

    Site Reliability Engineer - Global Media/Publishing business · We are seeking a Site Reliability Engineer for a globally leading Publishing business based in London. · My client has built their internal IT environment from ground up so is bespoke to the business with cutting edge ...


  • Explore Group London, United Kingdom

    **Lead Site reliability engineer - Fully remote - No sponsorship offered** · Role: Site Reliability engineer · Location: Fully remote · **Salary**: Up to £115,000 · **Responsibilities**: · - Design, build, and maintain scalable and highly available infrastructure on AWS · - Imple ...


  • McDonald's Limited London, United Kingdom

    **The Opportunity** · An exciting opportunity to work as part of the Service Operations Team, the Site Reliability Officer will be responsible for improving the value of IT to the business by reducing the occurrence of systematic issues within our services. ...


  • Involved Solutions London, United Kingdom

    **Site Reliability Engineer - 12 Month Contract - SC Cleared** · **Rate**: Up to £750 per day · **Location**: Remote - 1 day per week in either London, Manchester or Bristol (whichever is closest to your home location) · **IR35**: Inside · **The role**: · Senior Site Reliability ...


  • Nigel Frank International London, United Kingdom

    **Site Reliability Engineer/Team Manager - Hybrid - Up to £110,000.** · I am working with an insurance and technology consultancy who provide data-driven, insight-led solutions to their customers to help them become more resilient and get the best possible performance for their bu ...


  • McDonald's Limited London, United Kingdom

    **The Opportunity**: · **Hybrid Working** · This role is based in our East Finchley office working 3 days in the office and 2 days remotely. · **The Opportunity** · An exciting opportunity to work as part of the Service Operations Team, the Site Reliability Officer will be respon ...


  • eFinancialCareers London, United Kingdom

    Join us as a Site Reliability Engineer · - We'll look to you to provide technical support for relevant platforms, activities, and processes relating to areas of your specialist knowledge · - You'll assist with creating and implementing effective and efficient ITSM processes, whil ...


  • Evermore Global London, United Kingdom

    **Site Reliability Engineer / Linux / VMWARE/ Elastic Search /** · **Location: Central London / Hybrid** · **Salary: Circa £80,000 + Benefits** · **Permanent** · World leading online media company are seeking a suitable Site Reliability Engineer to join their expanding team in Lo ...


  • Lorien London, United Kingdom

    This London-based company strive to create a world-class digital hub for their clients. They are currently hiring for a Site Reliability Engineer with good experience maintaining AWS infrastructure. This position is fully remote, but the office is open if you would like to go in. ...


  • Experis LTD London, United Kingdom

    Responsibilities: · - Manage and monitor AWS infrastructure, particularly Lambda functions, to ensure the availability and reliability of services. · - Develop and maintain infrastructure automation and configuration management tools to support a rapidly changing environment. ...


  • eFinancialCareers London, United Kingdom

    Site Reliability Engineer Responsibilities: · - Own critical parts of our software development life-cycle such as build/deploy · - Facilitate individual development teams to build best-in-class cloud-native solutions · Site Reliability Engineer Requirements: · - Experience in an em ...


  • NonStop Consulting Ltd London, United Kingdom

    Hi all, we are currently recruiting for a Digital Site Reliability Engineer to join a Government Department on a 6-month contract, fully remote. · Essential skills: · - experience with Terraform, CI, CD; · - leading assessments; · - programming; · - eligibility for SC Clea ...


  • eFinancialCareers London, United Kingdom

    Join us as a Senior Site Reliability Engineer · - We'll look to you to establish and run a SRE function to help design, build, deliver and run highly reliable, scalable and secure software systems · - This is a great opportunity to hone your existing engineering skills and advanc ...


  • eFinancialCareers London, United Kingdom

    Join us as a Streaming Site Reliability Engineer · - This is an exciting opportunity to use your technical expertise and collaborate with our colleagues to build effortless, digital first customer experiences · - Working in our Data & Analytics Service function, you'll collaborat ...


  • eFinancialCareers London, United Kingdom

    We are responsible for life cycle management of the network architecture including planning, automation, implementation and monitoring. We work closely with other Engineering teams, the Chief Technology Office (CTO) and product managers across the enterprise. The NSRE team drives ...


  • NonStop Consulting Ltd London, United Kingdom

    This is a 6-month contract and mostly remote · Due to the nature of the assignment, details need to remain vague at this point, but the central requirements are: · Eligibility for getting SC Cleared · DevOps Engineering/ Cloud Engineering/ Infrastructure Engineering experience ...