RESEARCH ENGINEER, REWARD MODELS

Anthropic

Full-time

San Francisco, CA

$315,000 - $510,000

Posted 5 months ago

Job Description

The Reward Modeling team at Anthropic develops techniques for teaching AI systems to understand and embody human values, and for advancing AI capabilities. The team is looking for engineers to help advance the science of reward modeling.

Responsibilities

Implement novel reward modeling architectures and techniques
Optimize training pipelines
Build and optimize data pipelines
Collaborate across teams to integrate reward modeling advances into production systems
Communicate engineering progress through internal documentation and potential publications

Requirements

Strong engineering background in machine learning, with expertise in preference learning, reinforcement learning, deep learning, or related areas
Proficiency in Python, deep learning frameworks, and distributed computing
Familiarity with modern LLM architectures and alignment techniques
Experience with improving model training pipelines and building data pipelines
Comfort with the experimental nature of frontier AI research
Willingness to implement some research ideas
Ability to clearly communicate complex technical concepts and research findings
Deep interest in AI alignment and safety

Benefits

No benefits