Site Reliability Engineer (SRE)

Added: less than a minute ago
Type: Full time
Salary: Not provided

Related skills

BigQuery, Terraform, GitHub Actions, Kubernetes, API gateway

📋 Description

  • Own observability and reliability of our platform end-to-end
  • Build dashboards and alerts in Google Cloud Monitoring
  • Define SLOs/SLIs for API gateway, apps, and identity
  • Instrument services with tracing and structured logging
  • Own incident response during on-call shifts
  • Reduce toil with automation and runbooks

🎯 Requirements

  • Solid production experience on GCP (or AWS/Azure with GCP ramp)
  • Experience handling on-call incidents, writing postmortems, and following through on action items
  • Strong observability skills: SLOs, logs, alerts, dashboards
  • Kubernetes, API gateways, identity systems, and at least one IaC tool
  • Scripting in Python/Go/Bash for automation
  • Clear written communication, including runbooks

๐ŸŽ Benefits

  • Foundational role shaping delivery practice from day one
  • Collaborate with a Delivery Manager as key partner
  • Equal opportunity employer valuing diversity