RESEARCH ENGINEER, REWARD MODELS

Anthropic
Full-time
San Francisco, CA
$315,000 - $510,000
Posted 5 months ago

Job Description

The Reward Modeling team at Anthropic develops techniques for teaching AI systems to understand and embody human values and for advancing AI capabilities. The team is looking for engineers to help advance the science of reward modeling.

Responsibilities

  • Implement novel reward modeling architectures and techniques
  • Optimize training pipelines
  • Build and optimize data pipelines
  • Collaborate across teams to integrate reward modeling advances into production systems
  • Communicate engineering progress through internal documentation and potential publications

Requirements

  • Strong engineering background in machine learning, with expertise in preference learning, reinforcement learning, deep learning, or related areas
  • Proficiency in Python, deep learning frameworks, and distributed computing
  • Familiarity with modern LLM architectures and alignment techniques
  • Experience with improving model training pipelines and building data pipelines
  • Comfort with the experimental nature of frontier AI research
  • Willingness to implement some research ideas
  • Ability to clearly communicate complex technical concepts and research findings
  • Deep interest in AI alignment and safety

Benefits

  • No benefits