Develop and operate a system that combines ontologies, knowledge graphs, defeasible argumentation frameworks, and LLM-assisted population pipelines.
Implement defeasible argumentation frameworks that capture both logical structure and vulnerability to rebuttal.
Architect agent coordination patterns for multi-step research and population tasks, with robust error handling and graceful degradation.
CARMA works to help society navigate the complex and potentially catastrophic risks arising from increasingly powerful AI systems. They are a fiscally sponsored project of Social & Environmental Entrepreneurs, Inc., a 501(c)(3) nonprofit public benefit corporation with a mission to lower the risks to humanity and the biosphere from transformative AI.
Conduct structured horizon-scanning for novel AI-enabled hazard classes.
Investigate next-generation detection sciences.
Analyze biosphere-scale vulnerabilities.
The Center for AI Risk Management & Alignment (CARMA) works to help society navigate the complex and potentially catastrophic risks arising from increasingly powerful AI systems. They focus on grounding AI risk management in rigorous analysis, developing policy frameworks that squarely address AGI, advancing technical safety approaches, and fostering global perspectives on durable safety.
Interact with generative AI models and project guidelines.
Create prompts to test model behavior across safety categories.
Document model breakability and effort level.
Welo Data provides AI services and specializes in data annotation. They foster a collaborative and innovative culture where employees contribute to cutting-edge AI safety evaluation.
Be part of the alignment research team, working on projects selected for their high upside potential and under-resourced status.
Do real alignment research with real autonomy, in directions most organizations aren’t set up to pursue.
Break complex problems into concrete experiments and execute on them, independently or with a team.
AE Studio is a 160-person, fully bootstrapped ML consultancy that has spent over a decade building and shipping AI systems for clients. Without outside investors, we put money into alignment research through the AI Alignment Foundation, a nonprofit we founded to scale this work.
Assess the factual accuracy, relevance, and quality of AI-generated Computer Science content.
Craft and answer domain-specific questions related to Computer Science and adjacent technical disciplines.
Evaluate and rank AI-generated responses based on technical correctness and reasoning quality.
The company is seeking Computer Science Experts with PhDs to support the training and evaluation of advanced AI models. This initiative focuses on improving the accuracy, reasoning, and domain expertise of generative AI systems through expert human feedback.
Interact with generative AI models using project-provided guidelines, safety taxonomies, and attack-vector guidance.
Create and evaluate prompts designed to test model behavior across safety-related categories.
Identify where model responses become unsafe, noncompliant, inconsistent, or otherwise problematic.
Welo Data is an AI services company that specializes in data annotation. They deliver multilingual content transformation services in translation, localization, and adaptation for over 250 languages with a growing network of over 400,000 in-country linguistic resources.
Lead the measurement strategy for user safety, quantifying safety experiences across Reddit.
Use data-backed methods to inform strategic direction of Safety product development.
Design and execute experiments to estimate the impact and ROI of safety initiatives.
Reddit is a community-based platform built on shared interests and authentic conversations. With over 100,000 active communities and millions of daily active users, it fosters information sharing and discussions across diverse topics.
Advance the state of the art in agentic systems, including retrieval, grounding, memory, context, personalization, etc.
Foundational Model Research: Advance Workday’s proprietary capabilities in pre-training, post-training (RLHF, DPO), and domain-specific alignment for HR and Finance workflows.
Workday is a Fortune 500 company and a leading AI platform for managing people, money, and agents. Their culture is rooted in integrity, empathy, and shared enthusiasm.
Define, implement, and maintain the AI security strategy across Deel's infrastructure and product ecosystem.
Lead security assessments and threat modeling for AI/ML models, LLM integrations, and agentic AI systems.
Evaluate and deploy AI Security Posture Management (AISPM) and AI Detection & Response (AIDR) solutions.
Deel is the all-in-one payroll and HR platform for global teams with a vision to unlock global opportunity. They are among the largest globally distributed companies with a team of 7,000 spanning more than 100 countries with a connected and dynamic culture.
Lead the implementation, monitoring, and continuous improvement of security, governance, and trust controls for AI systems.
Define trustworthy and untrustworthy AI behavior and ensure it is measurable in production for security event analysis.
Translate governance principles into technical and operational requirements that product and platform teams can adopt.
YipitData is a market research and analytics firm for the disruptive economy. They analyze billions of alternative data points daily to provide insights on various markets, and they are backed by The Carlyle Group and Norwest Venture Partners.
Drive the strategy, development, and deployment of machine learning models that detect and prevent harmful content and behavior at scale.
Lead and grow a high-performing Safety ML organization, including applied research, model development, productionization, and continuous improvement.
Develop and deploy cutting-edge Safety ML systems (including fine-tuned LLMs and transformer models) that outperform state-of-the-art solutions in quality, latency, and efficiency.
Reddit is a community of communities built on shared interests, passion, and trust, and is home to conversations on the internet. With 100,000+ active communities and approximately 126 million daily active unique visitors, Reddit is one of the internet’s largest sources of information.
Design, build, and ship LLM-powered features and agentic workflows for Gametime users.
Build and maintain evaluation frameworks and prompt testing pipelines for AI-powered experiences.
Contribute to the orchestration layer, including agent routing, tool use, and multi-step workflow coordination.
Gametime helps people connect through shared live experiences. They operate platforms on iOS, Android, mobile web, and desktop, supporting over 60,000 events across the US and Canada, fostering a collaborative and inclusive environment where diverse perspectives are valued.
Develop AI systems that automate dispute and chargeback handling using structured evidence and business logic, creating a better experience for our customers.
Build models that automate refunds, getting money back to our customers faster.
Build and maintain evidence extraction pipelines that process unstructured data using LLM-powered workflows to produce structured, actionable outputs.
Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. They are a remote-first company with competitive benefits and focus on an inclusive interview experience.
Evaluate and improve model safety: Label, rank, audit, and refine human- and model-generated text to improve safety, quality, and policy alignment.
Apply nuanced safety judgment: Assess model outputs against detailed safety guidelines, rubrics, and style standards, making consistent decisions across ambiguous, sensitive, and context-dependent cases.
Create prompts and safety test cases: Write realistic prompts, user scenarios, and adversarial examples that help evaluate model behavior across safety categories and uncover unsafe, evasive, over-refusing, or policy-inconsistent responses.
Cohere's mission is to scale intelligence to serve humanity by training and deploying frontier models for developers and enterprises. They are a team of researchers, engineers, and designers passionate about their craft, believing that a diverse range of perspectives is a requirement for building great products.
Pick up live work across data ingestion, knowledge graph integration, and the application layer.
Contribute to the front-end and runtime layer that surfaces AI agent activity, recommendations, and human-in-the-loop governance to client users.
Move freely between Python backend, TypeScript frontend, and infrastructure work as the build demands.
Peach Pilot builds a platform that ingests everything about how a company operates and constructs a Company Brain: a living knowledge graph that connects people, decisions, and outcomes across the entire organization. They are co-founded by Mario Montag and JP James and have a working platform with live infrastructure and a proven data-to-insights methodology.
Identify high-leverage opportunities where AI improves customer outcomes, not where it’s trendy.
Ship working prototypes, not slide decks; move in days, not quarters.
Define success metrics tied to customer value, and then hold yourself to them.
Clipboard's mission is to uplift as many communities as possible through an app-based marketplace connecting healthcare professionals with workplaces. Founded in 2016, they are a remote-first team of over 1,000 people and a top Y Combinator company, profitable since 2022.
Own technical direction for high-impact AI products.
Work across teams to turn big ideas into shipped systems.
Help raise the bar for how we build, evaluate, and operate AI in production.
Rula is dedicated to treating the whole person, not just the symptoms, and aims to create a world where mental health is no longer stigmatized. They are a remote-first company that hires in most U.S. states and are passionate about making a positive impact on mental healthcare.