Anthropic

Senior Software Engineer, AI Reliability Engineering

Added

26 days ago

Location

🇮🇪 Dublin

Type

Full time

Related skills

distributed systems, monitoring, infrastructure, observability, GPU

📋 Description

Develop SLOs for LLM serving and training
Design and implement monitoring for availability and latency
Build high-availability model serving infrastructure for millions of users
Create automated failover and recovery across regions and clouds
Lead incident response for critical AI services
Optimize costs, with a focus on GPU/TPU/Trainium utilization

🎯 Requirements

Extensive experience with distributed systems observability and monitoring at scale
Experience operating AI infrastructure, including model serving, batch inference, and training
Proven track record implementing and maintaining SLO/SLA frameworks
Comfort with traditional metrics (latency, availability) and AI metrics
Experience with chaos engineering and resilience testing
Ability to bridge the gap between ML engineers and infrastructure teams

🎁 Benefits

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Dublin office with a collaborative workspace
Global, mission-driven culture

🛃 Visa sponsorship
