Forward Deployed Site Reliability Engineer

Added
less than a minute ago
Type
Full time
Salary
Salary not provided

Related skills

Docker, Terraform, AWS, Grafana, Docker Compose

📋 Description

  • On-site reliability engineer at a government site; define SLIs/SLOs.
  • Lead on-site incident response in a restricted, air-gapped AWS environment.
  • Own observability for on-site deployment (LGTM stack: Grafana, Loki, Tempo, Mimir).
  • Manage deployment and infrastructure: Docker, Docker Compose, and Terraform within the enclave.
  • Automate toil; run post-incident reviews and deliver durable fixes.
  • Liaise with government customers; translate ops needs to engineering.

🎯 Requirements

  • 5+ years in SRE, production operations, or a related infrastructure role.
  • Define/track SLIs/SLOs and error budgets; incident response experience.
  • Docker, Docker Compose; AWS (EC2, ECS, RDS, VPCs).
  • Linux/Unix administration; productive in constrained environments without a GUI.
  • Terraform for infra provisioning within guardrails.
  • LGTM stack (Grafana, Loki, Tempo, Mimir) experience.