Related skills
Datadog, Grafana, Prometheus, Splunk, ELK Stack
Description
- On-call 24/7 monitoring; observe platform and merchant performance.
- Coordinate incident mitigation, recovery, and resolution with teams.
- Communicate with merchants in real time during incidents.
- Analyze incident trends and drive long-term fixes.
- Collaborate with Operations, Product, and Engineering to improve monitoring.
- Investigate alerts and improve logging and alerts.
Requirements
- 5+ years in incident management, problem management, and platform monitoring.
- Problem management experience: trend analysis, root cause analysis, and prevention.
- Strong communication skills; able to translate technical detail for diverse audiences.
- Willingness to participate in on-call rotation.
- Experience with Prometheus, Grafana, ELK Stack.
- Observability tools: Datadog, Dynatrace, Splunk.
Benefits
- Office-based in Amsterdam; in-person collaboration.
- Opportunities to own projects and grow your career.
- Diverse and inclusive culture with a supportive atmosphere.