Safeguards Enforcement Analyst, Safety Evaluations

Anthropic 4 hours ago

US $230,000–$270,000/yr

Responsibilities:

Support model launch readiness by running evaluations
Partner with policy and domain experts throughout the evaluation lifecycle
Work with stakeholders to manage evaluation outcomes

Requirements:

Experience in trust and safety, content operations, or policy enforcement
Experience building processes from scratch
Experience operating under high-stakes timelines

Logistics:

Bachelor's degree in a related field or equivalent experience
Location-based hybrid policy: staff are expected to be in one of our offices at least 25% of the time
Visa sponsorship

Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. The team is a fast-growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
