Added
44 minutes ago
Type
Full time
Related skills
java sql python splunk jvm
Description
- End-to-end reliability, resilience, and performance of critical apps; final escalation point.
- Lead major incidents; deep diagnostics across Java systems, JVM, middleware, OS.
- Drive RCAs; long-term remediation; identify recurring failures and risks.
- Apply SRE principles: SLIs/SLOs, error budgets, resilience; tune JVM parameters.
- Change, Release, and Risk: technical approvals; operational readiness; audits.
- Automation, Monitoring & Observability: Shell/Python/PowerShell automation; health checks.
Requirements
- Strong knowledge of application architecture, distributed systems, and middleware.
- Java expertise: JVM internals, GC, memory management, thread/heap dumps.
- Unix/Linux proficiency; networking basics; scripting with Shell, Python, PowerShell.
- Advanced SQL and databases; Autosys or equivalent scheduler.
- Hands-on with observability tools: Splunk, AppDynamics, Dynatrace, ELK.
- Major incident leadership, deep RCA, change/release readiness, DR.
- Experience in regulated production environments; ITIL/cloud certifications (preferred).
Benefits
- Unlimited Paid Days Off
- Health plan options
- 401k with company match
- Dental, vision, disability, life coverage
- Family Forming Benefit including fertility coverage
- Paid childbearing and paternal leave