What You'll Do:
- Infrastructure Automation: Design, implement, and maintain automated infrastructure provisioning and configuration management using tools like Ansible to ensure consistency and scalability.
- Monitoring and Alerting: Set up monitoring and logging systems to proactively detect and address potential issues, ensuring optimal performance and reliability in environments like on-prem Prometheus/Thanos, Grafana Cloud, and Grafana Cloud Loki.
- Database Management: Manage hundreds of on-prem PostgreSQL databases, including performance tuning, backups, and disaster recovery strategies.
- Collaboration: Work closely with cross-functional teams, including developers and system administrators, to improve the overall development and deployment processes.
- Troubleshooting and Incident Management: Assist in identifying and resolving operational issues and participate in on-call rotations.
Skills and Experience:
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
- Proven experience as a DevOps Engineer or in a similar role, focusing on building and maintaining scalable infrastructures.
- Strong proficiency in Python for scripting and automation tasks.
- Expertise in configuration management tools such as Ansible or Puppet.
- Solid understanding of PostgreSQL and experience in managing PostgreSQL databases.
- Hands-on experience with CI/CD tools like Jenkins, GitLab CI, and GitHub Actions.
- Knowledge of containerization technologies like Docker and container orchestration tools like Kubernetes is a plus.
- Understanding of networking concepts such as load balancing and DNS.
- Strong problem-solving skills and the ability to work in a fast-paced, agile environment.
Principal Site Reliability Engineer - Oxford, United Kingdom - Tripadvisor
Description
We believe that we are better together, and at Tripadvisor we welcome you for who you are. Our workplace is for everyone, as is our people powered platform. At Tripadvisor, we want you to bring your unique perspective and experiences, so we can collectively revolutionize travel and together find the good out there.
Tripadvisor captured the online travel market 20 years ago as a Boston-based startup before an online travel market existed. The fact that we still dominate the industry proves that we know how to operate a fast-moving technology company and hire the right people who allow us to maintain that lead throughout the many advancements in technology. As we enter the era of Large Language Models and mobile-based internet everywhere, we are poised to innovate again. As a Tripadvisor Engineer, you will work with some of the best and brightest minds that technology offers and learn best practices and engineering methodologies that will empower you for the rest of your career.
The Site Operations team at Tripadvisor maintains and enhances the core systems that power and support the website. This includes systems in private data centers and over a hundred accounts in AWS. Our scope of responsibilities is vast, and listing them here would take an entire page. Suffice it to say that we are the go-to team for questions about the interface boundaries between these two halves of the company and the deep inner workings of our infrastructure.
As a Site Operations Engineer on the SiteOps team, you will be a force multiplier for our engineering and operations teams, delivering tooling & infrastructure that not only has a direct impact on day-to-day operations but also helps contribute to the future evolution of infrastructure and engineering here at Tripadvisor. You'll be part of a dynamic team responsible for ensuring our services' high availability, reliability, and scalability. We seek passionate engineers with experience in Python, Java, Ansible, PostgreSQL, CentOS, and Alma Linux to help us optimize and automate our infrastructure and deployment processes. We are currently involved in several types of systems migrations, both from on-prem to AWS/cloud-native and from one on-prem data center to another. As a SiteOps Engineer, you will be involved in designing and implementing how we perform those migrations, testing them, and then performing them with a "no surprises in production" mindset.
What You'll Do:
Skills and Experience:
If you need a reasonable accommodation or support during the application or the recruiting process due to a medical condition or disability, please reach out to your individual recruiter or send an email to and let us know the nature of your request. Please include the job requisition number in your message.