Staff+ Software Engineer, Observability

Added
28 days ago
Type
Full time

Related skills

rust grafana prometheus python kubernetes

📋 Description

  • Design scalable telemetry ingest and storage pipelines for metrics, logs, traces, and errors.
  • Own and evolve observability platforms; drive migrations to scale and reduce cost.
  • Build instrumentation libraries, SDKs, and integrations to emit high-quality telemetry.
  • Drive alerting and SLOs to define and monitor reliability with minimal noise.
  • Reduce MTTD/MTTR through cross-signal correlation and AI-assisted tooling.
  • Partner with Research/Inference/Product/Infra to tailor observability solutions.

🎯 Requirements

  • 10+ years in large-scale observability infrastructure
  • Deep experience in at least one signal area (metrics, logs, or traces), with working knowledge of the others
  • Understanding of high-throughput pipelines and columnar storage tradeoffs
  • Experience with Prometheus, Grafana, ClickHouse, OpenTelemetry
  • Proficient in Python, Rust, or Go
  • Excellent communication and collaboration skills
  • Able to work independently on high-impact, ambiguous infra challenges

🎁 Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Lovely office space

🛃 Visa sponsorship
