Related skills
rust grafana prometheus python kubernetes
Description
- Design scalable telemetry ingest and storage pipelines for metrics, logs, traces, and errors.
- Own and evolve observability platforms; drive migrations to scale and reduce cost.
- Build instrumentation libraries, SDKs, and integrations to emit high-quality telemetry.
- Drive alerting and SLOs to define and monitor reliability with minimal noise.
- Reduce MTTD/MTTR through cross-signal correlation and AI-assisted tooling.
- Partner with Research/Inference/Product/Infra to tailor observability solutions.
Requirements
- 10+ years in large-scale observability infrastructure
- Deep experience in at least one signal area (metrics/logs/traces), with familiarity across the others
- Understanding of high-throughput pipelines and columnar storage tradeoffs
- Experience with Prometheus, Grafana, ClickHouse, OpenTelemetry
- Proficient in Python, Rust, or Go
- Excellent communication and collaboration skills
- Able to work independently on high-impact, ambiguous infra challenges
Benefits
- Competitive compensation and benefits
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours
- Lovely office space
Visa sponsorship