Staff+ Software Engineer, Observability

Added: 1 minute ago
Type: Full time

Related skills

grafana prometheus python kubernetes go

📋 Description

  • Design and build scalable telemetry ingest and storage pipelines across multi-cluster infra.
  • Own and evolve core observability platforms; drive migrations and improvements.
  • Build instrumentation libraries, SDKs, and integrations for high-quality telemetry.
  • Drive alerting and SLO infrastructure to monitor reliability targets with minimal noise.
  • Reduce MTTR via cross-signal correlation and unified query interfaces.
  • Partner with Research, Inference, Product, and Infrastructure teams to tailor observability to their needs.

🎯 Requirements

  • 10+ years of experience building and operating large-scale observability infrastructure.
  • Deep experience with at least one observability signal area (metrics, logs, tracing, or analytics).
  • Strong understanding of high-throughput data pipelines, columnar storage, and telemetry data at scale.
  • Experience operating or building on Prometheus, Grafana, ClickHouse, OpenTelemetry, or similar.
  • Strong proficiency in Python, Rust, or Go and ability to collaborate effectively.
  • Excited about building foundational infrastructure and comfortable with ambiguous, high-impact challenges.

🎁 Benefits

  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Lovely office space to collaborate with colleagues

🛃 Visa sponsorship
