Related skills
java sql python splunk jvm
Description
- Lead major incident bridges and restore service with minimal business impact.
- Handle L3 escalations across Java, JVM, middleware, OS, and infra.
- Own technical RCAs; drive long-term remediation.
- Apply SRE principles: SLIs/SLOs, error budgets, resilience patterns.
- Tune JVM parameters; analyze thread/heap dumps; improve performance.
- Influence architecture for fault tolerance, scalability, and recoverability.
Requirements
- Java expertise: JVM internals, GC, memory management.
- .NET: CLR internals, GC, memory management, thread/dump analysis.
- Unix/Linux, networking basics, scripting (Shell/Python/PowerShell/VBS).
- Advanced SQL and DB concepts; Autosys or equivalent scheduler.
- Observability tools: Splunk, AppDynamics, Dynatrace, ELK, Grafana, Prometheus.
- 7-12+ years in App Reliability, Production Support, SRE; CS/Engineering degree.
Benefits
- Hybrid working model.
- On-call and after-hours support.
- Collaborative, fast-paced environment.