Software Engineer, Reliability

Added: 25 days ago
Type: Full time

Related skills

docker terraform linux aws python

📋 Description

  • Design, build, ship, and maintain core observability libraries and tools.
  • Troubleshoot production issues across the stack for performance, availability, and data quality.
  • Participate in cross-org incident response, driving continuous improvement.
  • Contribute to architectural discussions within the SRE team and cross-functional teams.
  • Influence cross-team projects and the reliability roadmap for engineering and customers.
  • Provide consultation to build highly reliable, efficient, and scalable systems.

🎯 Requirements

  • Bachelor's degree in Computer Science or equivalent practical experience.
  • 2+ years of software engineering experience, including 1+ year in reliability, scalability, distributed systems.
  • Proficiency in Python (preferred), Go, or Ruby in Linux environments; microservices, async, APIs.
  • Experience deploying prod systems on AWS or Azure using Kubernetes, Docker, Terraform.
  • Observability and incident response with Datadog, Splunk, Grafana, Prometheus, OpenTelemetry.
  • Strong collaboration, documentation, and communication; lead small projects.

🎁 Benefits

  • Fast-paced, collaborative environment.
  • Learning and development allowance.
  • Competitive cash and equity, and growth opportunities.
  • 100% medical, dental, and vision coverage.
  • Up to $25K for fertility/adoption/parental planning.
  • Flexible PTO and monthly wellness stipend.