NewsBreak

201-500 employees
7 jobs posted

Research Intern, Agent RL Training

Added: 9 hours ago
Location: 🇺🇸 Mountain View
Type: Internship

Related skills: Python, PyTorch, LLM, RLHF, RL

📋 Description

Collaborate with your mentor to identify high-impact LLM research directions.
Independently run end-to-end SFT experiments on LLM-based agents.
Assist with RL exploration: reward design and training iterations.
Curate high-quality training datasets (instruction-following, agent trajectories, synthetic data).
Contribute to publications and support submissions to top venues during the internship.

🎯 Requirements

Highly motivated and able to put in extra hours as needed.
Genuine passion for research; you read papers and tinker with models.
Able to run end-to-end supervised fine-tuning (SFT) of models independently; basic knowledge of RL post-training methods (RLHF, DPO, PPO, GRPO).
Excellent taste in model behavior; able to reason about what good looks like.
Strong Python and PyTorch skills.

🎁 Benefits

Mentor pairing with a full-time engineer.
Opportunity to contribute to top-tier publications during internship.
Hands-on experience with LLMs, agent RL, and NewsBreak products.
Collaborative, fast-paced, research-focused team culture.
