Source Job

Global

  • Develop and operate a system that combines ontologies, knowledge graphs, defeasible argumentation frameworks, and LLM-assisted population pipelines.
  • Implement defeasible argumentation frameworks that capture both logical structure and vulnerability to rebuttal.
  • Architect agent coordination patterns for multi-step research and population tasks, with robust error handling and graceful degradation.

LLM AI Safety

16 jobs similar to AI Safety Argumentation Platform Research Engineer

Jobs ranked by similarity.

$3,850/yr
US UK Canada

  • Fellows will use external infrastructure to work on an empirical project aligned with research priorities.
  • Projects aim to produce a public output, such as a paper submission.
  • Fellows receive mentorship and can access a shared workspace in Berkeley or London.

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. Their team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

US

  • Interact with generative AI models and project guidelines.
  • Create prompts to test model behavior across safety categories.
  • Document model breakability and effort level.

Welo Data provides AI services and specializes in data annotation. They foster a collaborative and innovative culture where employees contribute to cutting-edge AI safety evaluation.

US

  • Interact with generative AI models using project-provided guidelines, safety taxonomies, and attack-vector guidance.
  • Create and evaluate prompts designed to test model behavior across safety-related categories.
  • Identify where model responses become unsafe, noncompliant, inconsistent, or otherwise problematic.

Welo Data is an AI services company that specializes in data annotation. They deliver multilingual content transformation services in translation, localization, and adaptation for over 250 languages with a growing network of over 400,000 in-country linguistic resources.

$115,000–$130,000/yr
US 4w PTO

  • Write, iterate, and maintain system prompts and instruction sets for Noodle’s AI agents across the student journey.
  • Build and maintain evaluation frameworks to measure agent accuracy, tone, hallucination rate, task completion, and alignment with rubric-based learning objectives.
  • Partner with Noodle teammates and university stakeholders to design, build, and test agents, translating learning objectives, operational flows, rubric assessments, and more into prompt-level agent instructions.

Noodle is higher education’s leading strategy, services, and technology partner. It develops infrastructure, provides life-changing learning experiences, and grows awareness of and enrollment in some of the best academic institutions in the world. They empower universities to change the world by offering university partners various products and services.

$155,000–$215,000/yr
Global

  • Conduct structured horizon-scanning for novel AI-enabled hazard classes.
  • Investigate next-generation detection sciences.
  • Analyze biosphere-scale vulnerabilities.

The Center for AI Risk Management & Alignment (CARMA) works to help society navigate the complex and potentially catastrophic risks arising from increasingly powerful AI systems. They focus on grounding AI risk management in rigorous analysis, developing policy frameworks that squarely address AGI, advancing technical safety approaches, and fostering global perspectives on durable safety.

$180,000–$240,000/yr
US

  • Be part of the alignment research team, working on projects selected for their high upside potential and under-resourced status.
  • Do real alignment research with real autonomy, in directions most organizations aren’t set up to pursue.
  • Break complex problems into concrete experiments and execute on them, independently or with a team.

AE Studio is a 160-person, fully bootstrapped ML consultancy that has spent over a decade building and shipping AI systems for clients. Without outside investors, they put money into alignment research through the AI Alignment Foundation, a nonprofit they founded to scale this work.

$160,000–$240,000/yr
US

  • Build agentic AI systems that change how Dataiku runs internally.
  • Turn real problems into working software.
  • See your solutions through from first conversation to production.

Dataiku is the Platform for AI Success, the enterprise orchestration layer for building, deploying, and governing AI. The world’s leading companies rely on Dataiku to operationalize AI and run it as a true business performance engine delivering measurable value.

$228,000–$342,000/yr
US

  • Define research direction in Agentic AI and LLMs.
  • Advance the state of the art in agentic systems, including retrieval, grounding, memory, context, personalization, etc.
  • Foundational Model Research: Advance Workday’s proprietary capabilities in pre-training, post-training (RLHF, DPO), and domain-specific alignment for HR and Finance workflows.

Workday is a Fortune 500 company and a leading AI platform for managing people, money, and agents. Their culture is rooted in integrity, empathy, and shared enthusiasm.

$45/hr
US Canada

  • Evaluate and improve model safety: Label, rank, audit, and refine human- and model-generated text to improve safety, quality, and policy alignment.
  • Apply nuanced safety judgment: Assess model outputs against detailed safety guidelines, rubrics, and style standards, making consistent decisions across ambiguous, sensitive, and context-dependent cases.
  • Create prompts and safety test cases: Write realistic prompts, user scenarios, and adversarial examples that help evaluate model behavior across safety categories and uncover unsafe, evasive, over-refusing, or policy-inconsistent responses.

Cohere's mission is to scale intelligence to serve humanity by training and deploying frontier models for developers and enterprises. They are a team of researchers, engineers, and designers passionate about their craft, believing that a diverse range of perspectives is a requirement for building great products.

Latam Unlimited PTO

  • Design and build production-grade RAG or LLM-based systems.
  • Architect AI systems end-to-end, considering AI guardrails and governance.
  • Implement validation layers and response verification mechanisms.

Cresteo aims to be a leader in people-first nearshore tech services, valuing a people-centered approach. They are built on expertise, transparency, profit-sharing, and innovation, fostering a culture where software development is seen as a human sport and art form.

India

  • Design and ship agentic systems and multi-step LLM workflows using Claude, OpenAI, or equivalent, including tool use, memory, structured output extraction, and failure handling.
  • Build and maintain MCP integrations connecting internal tools, portco systems, and external data sources into reliable, observable pipelines.
  • Write production-grade Python for data pipelines, integration scripts, and scheduled jobs running via BullMQ-backed queues on the Node/TypeScript stack.

Emergence is a PE holdco backed by the Pritzker Organization focused on acquiring and scaling B2B SaaS businesses. It combines operational rigor with a growth equity mindset to drive ARR growth and profitability across its portfolio.

$194,000–$228,000/yr
US

  • Design, build, and ship LLM-powered features and agentic workflows for Gametime users.
  • Build and maintain evaluation frameworks and prompt testing pipelines for AI-powered experiences.
  • Contribute to orchestration layer, including agent routing, tool use, and multi-step workflow coordination.

Gametime helps people connect through shared live experiences. They operate platforms on iOS, Android, mobile web, and desktop, supporting over 60,000 events across the US and Canada, fostering a collaborative and inclusive environment where diverse perspectives are valued.

AI Engineer

Zinier
India

  • Design pragmatic solutions for real problems, assessing each use case and selecting the right approach.
  • Rapid prototyping and iterative delivery, shipping functional prototypes within days and validating value with real users.
  • Build agentic AI systems where justified, designing and implementing multi-agent architectures and LLM-based tooling.

Zinier empowers frontline workers to achieve greater things. They are a remote-first, global team headquartered in Silicon Valley with a hybrid workforce across the United States, Canada, Europe, Latin America, Singapore, and Bangalore, India.

$230,000–$280,000/yr
US Unlimited PTO

  • Lead the implementation, monitoring, and continuous improvement of security, governance, and trust controls for AI systems.
  • Define trustworthy and untrustworthy AI behavior and ensure it is measurable in production for security event analysis.
  • Translate governance principles into technical and operational requirements that product and platform teams can adopt.

YipitData is a market research and analytics firm for the disruptive economy. They analyze billions of alternative data points daily to provide insights on various markets, and they are backed by The Carlyle Group and Norwest Venture Partners.

$91,250–$127,750/yr
Canada

  • Develop AI systems that automate dispute and chargeback handling using structured evidence and business logic, creating a better experience for our customers.
  • Build models that automate refunds, getting money back to our customers faster.
  • Build and maintain evidence extraction pipelines that process unstructured data using LLM-powered workflows to produce structured, actionable outputs.

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. They are a remote-first company with competitive benefits and focus on an inclusive interview experience.

$165,000–$210,000/yr
US

  • Evaluate and refine AI prototypes built by business units to enhance commercial ROI, security, and architecture.
  • Refactor high-value internal AI prototypes into secure, scalable, enterprise-grade applications.
  • Build and maintain secure LLM integrations with internal systems like data lakes and Salesforce, ensuring full-lifecycle maintenance of applications.

Impiricus is an AI-powered HCP Engagement Engine that ethically connects healthcare professionals to pharmaceutical resources to reduce go-to-market costs and accelerate patient access to treatments. It is a fast-growing company with a unique network of HCPs and advisors, fostering a collaborative and impactful culture where employees can work flexibly.