SRE Engineer - London, United Kingdom - eFinancialCareers
Description
TEKsystems is currently engaged with a financial services company to recruit a Site Reliability Engineer who will be responsible for delivering continuous improvement, automation and self-service offerings to operational teams across the company.
Primary:
- Develop software to make infrastructure services self-managing and self-service
- Deliver continuous service improvement by developing Infrastructure as Code
- Eliminate manual, repetitive, automatable, tactical tasks that are devoid of value
- Improve system performance, make effective use of resources, distribute load and reduce latency
- Identify SLOs (Service Level Objectives) to meet availability and latency objectives
- Develop proactive monitoring solutions that alert on symptoms and not just on outages
- Perform detailed root cause analyses (RCAs) on incidents and outages to prevent recurrence
- Partner with development teams to improve services via rigorous testing and release procedures
- Develop standard operational procedures and produce effective documentation
- Analyse workloads and devise suitable cloud migration strategies where appropriate
- Ensure all project / investment workloads are delivered according to the defined plans and budget
- Liaise with Infrastructure Control and IT Risk teams to satisfy internal and external audit requests
- Deputise for the team lead when required and act up accordingly
- Identify cost saving and optimisation opportunities across the group
- Build strong working relationships across the organisation
- Adhere to the core values of the bank
Secondary:
- Perform daily health and compliance checks for all systems as required
- Ensure all systems are backed up successfully and any issues are promptly resolved
- Validate that monitoring alerts and batch job failures are detected promptly and satisfactorily resolved
- Ensure sufficient capacity is available to accommodate drive growth
- Handle incidents and requests with efficiency and a "customer first" mindset
- Maintain infrastructure in a highly available, reliable, secure and performant manner
Essential:
- AWX / Ansible Tower
- Git, Ansible, Terraform and TeamCity
- Serena Deployment Automation (SDA) and Jenkins
- Kubernetes and Docker
- Continuous Integration (CI) and Continuous Delivery (CD) principles and practices
- Agile, Site Reliability Engineering (SRE) and DevOps principles and practices
- Scripting and programming languages such as PowerShell, Python, Bash and C#
- Fluent in Backup and Recovery processes and procedures
- Advanced knowledge of Clustering, High-Availability, Replication and Disaster Recovery techniques
- Ability to tune Network, Storage, Server and Virtualisation layers for optimal performance and reliability
- Excellent performance tuning skills and in-depth knowledge of system internals
- Ability to interpret and implement CIS security hardening recommendations in a controlled manner
Employee Value Proposition:
Hybrid
Job Title:
SRE Engineer
Location:
London, UK
Rate/Salary:
GBP Daily
Job Type:
Contract