Own one or more reliability domains end-to-end (observability, incidents, performance).
Drive and refine modern SRE practices across services (SLIs/SLOs, error budgets, reviews).
Lead multi-sprint, multi-engineer reliability initiatives with cross-team coordination.
Design and maintain end-to-end observability (metrics, logs, traces, dashboards, alerts).
Partner with product/engineering to design reliable services and influence architecture.
Evolve and operate AWS infrastructure through IaC workflows, contributing code directly.

8+ years operating complex, production SaaS systems and reliability initiatives.
Proven experience leading multi-sprint, multi-engineer projects with impact.
Experience leading org-wide reliability or performance initiatives end-to-end.
Strong software engineering skills in Python or Node.js/TypeScript.
Deep expertise in observability and monitoring (Datadog/Prometheus/Grafana).
Production AWS experience with Terraform and container platforms (ECS/EKS/Kubernetes).
Incident management experience: coordinating incident response and follow-ups.

Staff Site Reliability Engineer
