Source Job

Global

  • Design and execute complex jailbreak attempts to identify vulnerabilities in state-of-the-art models.
  • Use your background in linguistics or social sciences to find "hidden" biases or harms that standard automated filters miss.
  • Systematically rank LLM outputs to determine where safety guardrails are failing or succeeding.

LLMs Linguistics

20 jobs similar to Adversarial Prompt Expert

Jobs ranked by similarity.

$196,000–$294,000/yr
US

  • Analyze threat actor behavior and evolving abuse patterns to inform detection logic.
  • Research, prototype, and implement LLM-driven techniques for abuse detection.
  • Design and develop production-ready systems that detect and disrupt abusive behavior.

Vercel provides developers with tools and cloud infrastructure to build, scale, and secure faster, more personalized web experiences. They have a mission to enable the world to ship the best products and aim to create a place where everyone can do their best work.

$90,000–$110,000/yr
US

  • Design and refine prompts that power AI features.
  • Test AI outputs for accuracy and customer value.
  • Identify and resolve issues with AI behavior.

CentralReach is a leading provider of autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. With over 200,000 users and backed by Roper Technologies, Inc., they are entering an exciting phase of growth and innovation.

$110,720–$138,400/yr
US

  • Design, develop, and deploy LLM- and RAG-powered applications that enhance analyst and hacker productivity across offensive security use cases.
  • Architect and maintain large-scale, high-performance data pipelines to process vulnerability, asset, and activity datasets from multiple sources.
  • Collaborate with security researchers and engineers to translate offensive security workflows into data-driven automation.

Bugcrowd empowers organizations to take back control and stay ahead of threat actors. With a network of hackers, Bugcrowd brings diverse expertise to uncover hidden weaknesses and adapts swiftly to evolving threats.

Global

  • Help train and evaluate cutting-edge AI models using real legal expertise.
  • Complete AI training tasks such as analyzing, editing, and writing annotations, grounded in legal reasoning and professional practice.
  • Judge how well AI performs legal tasks and improve cutting-edge AI models by providing expert feedback.

Prolific is building the biggest pool of quality human data in the world. Over 35,000 AI developers, researchers, and organizations use Prolific to gather data from paid study participants with a wide variety of experiences, knowledge, and skills.

  • Build, optimize, and evolve RAG pipelines.
  • Develop prompts and guardrails for domain-specific LLM applications.
  • Implement hallucination detection, mitigation, and fact-checking mechanisms.

Robots & Pencils builds meaningful, scalable digital products by blending strategy, design, and engineering. They are a small, senior team with direct access to enterprise clients.

Global

  • Lead secure design reviews, threat modeling, and security-focused code reviews across the product and platform.
  • Build and run Fieldguide’s vulnerability management program: scanning, triage, SLA-driven remediation tracking, and engineering coordination.
  • Partner with Compliance to ensure technical controls satisfy framework requirements (SOC 2, ISO 27001, ISO 42001, FedRAMP).

Fieldguide is establishing a new state of trust for global commerce and capital markets through automating and streamlining the work of assurance and audit practitioners. They are based in San Francisco, CA, and built as a remote-first company that enables you to do your best work from anywhere.

Global

  • Evaluate AI-generated content and provide feedback.
  • Help AI better understand legal analysis and case law reasoning.
  • Contribute to improved drafting standards and legal communication.

Handshake is connecting students and early talent with employers. They offer flexible, hourly contract work to support AI research.

$230,000–$322,000/yr
US

  • Define technical strategy & architecture for data curriculum pipelines powering next-gen foundation models.
  • Design & execute dynamic curriculum learning strategies, improving model stability & reasoning.
  • Engineer logic for serializing Reddit’s complex conversational trees into optimal training contexts.

Reddit is a community-driven platform where users submit, vote, and comment on what interests them. With over 100,000 active communities and 116 million daily active users, they foster open conversations and shared interests.

$61,900–$105,300/yr
US

  • Implement features for AI applications such as conversational assistants, copilots, text generation, summarization, and content classification.
  • Design and optimize prompts and system instructions to improve task completion, reliability, and latency; minimize hallucinations and toxic or unsafe outputs; and implement structured outputs.
  • Write unit, integration, and regression tests for AI features, run evaluation scripts and log results for model quality metrics, and work with AI observability tools under guidance.

RealPage is at the forefront of the Generative AI revolution, dedicated to shaping the future of artificial intelligence within the Property Tech domain. Our Agentic AI team is focused on driving innovation by building next generation AI applications and enhancing existing systems with Generative AI capabilities.

$53,766–$80,649/yr
Global

  • Partner with clients to deeply understand role requirements, organizational context, and culture.
  • Build pipelines through multiple channels: direct networks, similar organizations, community forums, conferences, referrals, and direct outreach.
  • Conduct initial screening calls to assess fit, motivation, and capabilities.

Outcapped is an operations agency that frees mission-driven founders from back-office grind. They are building a platform that will contain a library of 150+ vetted SOPs and an AI-powered workflow builder to standardize and automate finance, HR, legal, and IT tasks, saving teams up to 20 hours every week.

Global

  • Design and ship production-grade agentic AI systems that meaningfully improve customer workflows and internal operations.
  • Establish a clear technical architecture for AI at Moxie, including agent orchestration, tool/function calling and observability.
  • Integrate AI deeply into the Moxie platform, ensuring AI systems are secure, resilient, cost-aware, and aligned with a regulated environment.

Moxie empowers ambitious aesthetic entrepreneurs to build profitable, independent practices. They are a global, remote-first team of more than 140 people, supporting hundreds of practices nationwide, aiming to unlock sustainable success for aesthetic entrepreneurs.

Global

  • Review and validate Oracle SQL queries and output generated from existing natural language questions.
  • Ensure that SQL logic is correct, Oracle-compliant, and that it produces realistic, accurate results.
  • Verify that query outputs correctly and fully answer the original natural language questions and provide edits if needed.

CrowdGen, by Appen, provides AI annotation projects. This role is a project-based opportunity where you will join as an Independent Contractor.

Europe 6w PTO

  • Scope and implement AI Agent deployments, providing strategic advice and execution support to customers and partners.
  • Leverage knowledge of LLM internals to analyze customer requirements and design precise prompts for reliable, user-aligned behavior.
  • Fine-tune conversational flows and voice output to align with customer brand standards.

Parloa is a fast-growing startup in the world of Generative AI and customer service. They have over 400 employees in Berlin, Munich, and New York and are expanding globally.

Global

  • Evaluate AI-generated content using your biological training.
  • Provide feedback to help AI better understand biological reasoning.
  • Work on a flexible, asynchronous schedule with no minimum hour requirement.

Handshake AI utilizes AI technology. They value expertise in biological reasoning, experimental design, data interpretation, and scientific problem-solving.

$190,000–$220,000/yr
US 3w PTO

  • Lead the research, design, and implementation of end-to-end machine learning solutions.
  • Own the evaluation and prompt engineering strategy for Large Language Models (LLMs).
  • Act as a strategic partner to Product and Engineering leaders.

OfferUp is dedicated to creating the simplest and most trusted way for people to buy, sell, and connect in their local communities. OfferUp was founded in 2011, is based in Bellevue, WA, and serves local markets nationwide.

Global

  • Design and evolve safety policies for audio AI, image/video AI and agentic safety.
  • Build scalable, AI-powered systems and workflows that dramatically reduce response times and increase policy coverage.
  • Drive cross-functional safety integration with product, engineering, legal, and operations teams.

ElevenLabs is an audio AI research and product company. They aim to make information accessible in any voice, language, or sound, and have raised a $180 million Series C round, valuing the company at $3.3 billion.

  • Design, build, and deploy the critical small language models that are foundational to Fastino’s product.
  • As an engineer, you will own the full lifecycle of our state-of-the-art models, from prototyping and data analysis to deployment and monitoring.
  • Drive the data strategy to continuously improve model performance by analyzing distribution gaps and contributing to synthetic data pipelines.

Fastino is building the next generation of LLMs, with a team of alumni from Google Research, Apple, Stanford, and Cambridge. They have raised $25M through their seed round and are backed by leading investors including Microsoft, Khosla Ventures, and Insight Partners.

US Unlimited PTO

  • Design and implement AI-powered features, integrating LLMs with existing products.
  • Improve AI systems through evaluations, guardrails, monitoring, and customer usage.
  • Collaborate with AI Platform engineers to shape foundational AI systems and tooling.

Vanta helps businesses earn and prove trust by empowering companies to practice better security. They have a kind and talented team of employees determined to make security easier for companies to manage and prove.

AI Engineer

Ethos
$146,000–$236,000/yr
US

  • Own the LLM + retrieval + context layer that makes copilots accurate and fast.
  • Design and ship the end-to-end pipeline, improving quality and trust via evaluation.
  • Reduce cost/latency with a concrete inference optimization plan shipped to production.

Ethos is built to make it faster and easier to get life insurance. They blend industry expertise, technology, and the human touch to find the right policy to protect loved ones and have been named on CB Insights' Global Insurtech 50 list and BuiltIn's Top 100 Midsize Companies in San Francisco.