Staff Engineer, Platform Engineering & Operational Health

Type
Full time

Related skills

Node.js, Docker, Terraform, AWS, Python

📋 Description

  • Conduct audits of infrastructure, deployments, incidents, and on-call patterns.
  • Identify root causes: fragile systems, deployment friction, observability gaps.
  • Establish baselines and metrics for system health and efficiency.
  • Design and implement systemic fixes; eliminate single points of failure.
  • Build observability: dashboards, logs, metrics, tracing; automate playbooks.
  • Own reliability roadmap; mentor engineers; drive architecture reviews.

🎯 Requirements

  • 5–8+ years in software engineering with infra/SRE focus.
  • Staff-level ownership of platform decisions at scale.
  • Terraform IaC with multi-environment deployments.
  • Experience with Docker/Kubernetes and deployment pipelines.
  • Production Node.js and TypeScript; AWS stack (EC2, RDS, Lambda).
  • Observability stack (monitoring/logging/metrics/tracing) + automation (Python/Go/Bash).

🎁 Benefits

  • Work with a talented but humble team.
  • Competitive compensation and equity.
  • Weekly paid family meal.
  • 401k, medical, dental, and vision insurance.
  • Flexible PTO; 15.5 paid holidays, including Juneteenth and MLK Day.
  • Remote work stipend of $300.