Site reliability engineer (UK)

Added: 4 days ago
Type: Full time
Salary: Not provided

Related skills

Docker, Terraform, AWS, Grafana, Prometheus

📋 Description

  • Automate ops and infra tools with Python or Go to reduce toil
  • Design scalable, fault-tolerant infra on AWS/GCP/Azure
  • Own reliability, performance, and SLOs for core services
  • Own the observability stack for monitoring, logging, and alerting
  • Lead incident response, post-mortems, and root-cause analyses
  • Collaborate with product/engineering on reliability and scale

🎯 Requirements

  • 7+ years in SRE/DevOps or similar, building/operating large-scale, highly available systems
  • Deep expertise with AWS, Docker, Kubernetes, and Terraform
  • Strong proficiency in Python, Java, or Go
  • Knowledge of Prometheus, Grafana, and the ELK Stack
  • Willingness to challenge the status quo, identify weaknesses, and propose innovative reliability solutions
  • Excellent communication and collaboration; ability to connect with cross-functional teams

🎁 Benefits

  • Generous PTO, plus company holidays
  • Medical and dental insurance
  • Paid parental leave for all parents (12 weeks)
  • Fertility and family planning support
  • Early-detection cancer testing through Galleri
  • Pension scheme with company contribution