Related skills
datadog, distributed systems, ai, observability, instrumentation
Description
- Lead and grow a team for reliability and observability platforms
- Own core observability stack (Datadog) with high data quality
- Define instrumentation standards and library strategy
- Explore AI-driven anomaly detection and automation
- Establish cost attribution, budgeting, and alerting across infra
- Partner with infrastructure, product engineering, finance, and security
Requirements
- 4+ years leading infrastructure/observability teams
- Hands-on with Datadog and OpenTelemetry across metrics, logs, traces
- Strong understanding of distributed systems, instrumentation, SLOs
- Experience with cost attribution, budgeting, forecasting, alerts
- Ability to set technical direction and align with Engineering, Finance, Security
- Familiarity with AI/ML for anomaly detection and automation
Benefits
- Equity and competitive benefits
- Health, dental, vision, retirement contributions
- Parental leave and family planning support
- Mental health and wellness benefits, PTO and recharge days
- Learning stipend, work-from-home stipend, and cell phone reimbursement