Work Details:

  • Design and execute complex jailbreak attempts to identify vulnerabilities in state-of-the-art models.
  • Use your background in linguistics or social sciences to find "hidden" biases or harms that standard automated filters miss.
  • Model evaluation: systematically rank LLM outputs to determine where safety guardrails are failing or succeeding (a minimal illustrative sketch follows this list).
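
For illustration only, here is a minimal Python sketch of what such an evaluation loop might look like. Everything in it is hypothetical: query_model stands in for whatever model API is actually used, and the refusal rubric and prompts are placeholders, not this role's real tooling.

    # Hypothetical red-team evaluation loop (illustrative sketch only).
    from dataclasses import dataclass

    @dataclass
    class Attempt:
        prompt: str
        response: str
        bypassed_guardrail: bool  # True if the model did not refuse

    def query_model(prompt: str) -> str:
        """Placeholder: swap in a real open- or closed-source model call."""
        return "I can't help with that."

    def refused(response: str) -> bool:
        """Crude rubric: treat common refusal phrasing as a guardrail success."""
        markers = ("i can't", "i cannot", "i won't")
        return any(m in response.lower() for m in markers)

    def run_suite(prompts: list[str]) -> list[Attempt]:
        """Send each prompt to the model and record whether the guardrail held."""
        results = []
        for p in prompts:
            r = query_model(p)
            results.append(Attempt(prompt=p, response=r,
                                   bypassed_guardrail=not refused(r)))
        return results

    if __name__ == "__main__":
        suite = ["benign probe", "adversarial rephrasing of the same request"]
        for a in run_suite(suite):
            status = "FAIL (bypassed)" if a.bypassed_guardrail else "PASS (refused)"
            print(f"{status}: {a.prompt!r}")

In practice the pass/fail judgment would come from human review or a richer rubric rather than keyword matching; the point is the structured, repeatable loop.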

Who You Are:

  • Heavy LLM usage: hands-on experience with multiple models, both open- and closed-source, and comfort experimenting across systems and platforms.
  • You have a "hacker mindset." You enjoy the puzzle of finding edge cases and can think of ten different ways to ask a forbidden question.
  • You can turn a chaotic afternoon of prompt-hacking into a clean, actionable report.

Our Team

We are building safer, more robust intelligence. We are a small team with a culture that values asynchronous work and self-starters.

Apply for This Position