Write, iterate, and maintain system prompts and instruction sets for Noodle’s AI agents across the student journey.
Build and maintain evaluation frameworks to measure agent accuracy, tone, hallucination rate, task completion, and alignment with rubric-based learning objectives.
Partner with Noodle teammates and university stakeholders to design, build, and test agents — translating learning objectives, operational flows, rubric assessments, and more into prompt-level agent instructions.
Noodle is higher education’s leading strategy, services, and technology partner that develops infrastructure, provides life-changing learning experiences, and grows the awareness of and the enrollment in some of the best academic institutions in the world. They empower universities to change the world by offering university partners various products and services.
Interact with generative AI models using project-provided guidelines, safety taxonomies, and attack-vector guidance.
Create and evaluate prompts designed to test model behavior across safety-related categories.
Identify where model responses become unsafe, noncompliant, inconsistent, or otherwise problematic.
Welo Data is an AI services company that specializes in data annotation. They deliver multilingual content transformation services in translation, localization, and adaptation for over 250 languages with a growing network of over 400,000 in-country linguistic resources.
Design and build Claude skills, MCP integrations, and automated pipelines that transform internal knowledge into publication-ready docs with minimal manual intervention.
Act as the final reviewer for content produced by AI-assisted workflows and engineers, maintaining a high bar for technical accuracy and polish.
Define content structures and metadata standards that ensure our documentation is agent-consumable and machine-parseable.
Upsun, formerly Platform.sh, is the cloud application platform humans and robots love. They give developers, DevOps engineers, and platform teams the ability to build, ship, and scale confidently without wrestling with backend infrastructure.
Assess the factual accuracy, relevance, and quality of AI-generated Computer Science content
Craft and answer domain-specific questions related to Computer Science and adjacent technical disciplines
Evaluate and rank AI-generated responses based on technical correctness and reasoning quality
The company is seeking Computer Science Experts with PhDs to support the training and evaluation of advanced AI models. This initiative focuses on improving the accuracy, reasoning, and domain expertise of generative AI systems through expert human feedback.
Quickly iterate and develop proofs of concept to explore integrating AI into data and marketing workflows.
Make key decisions about the choice of AI architecture and frameworks.
Build production data agents to seamlessly answer analytics and data science questions.
Hightouch is an Agentic Marketing Platform that provides a composable CDP. They enable marketing teams to analyze performance, brainstorm ideas, and generate creative quickly. The team is ambitious and impact-driven, with a focus on humility, kindness, and compassion.
Interact with generative AI models and project guidelines.
Create prompts to test model behavior across safety categories.
Document model breakability and effort level.
Welo Data provides AI services and specializes in data annotation. They foster a collaborative and innovative culture where employees contribute to cutting-edge AI safety evaluation.
Conduct structured horizon-scanning for novel AI-enabled hazard classes.
Investigate next-generation detection sciences.
Analyze biosphere-scale vulnerabilities.
The Center for AI Risk Management & Alignment (CARMA) works to help society navigate the complex and potentially catastrophic risks arising from increasingly powerful AI systems. They focus on grounding AI risk management in rigorous analysis, developing policy frameworks that squarely address AGI, advancing technical safety approaches, and fostering global perspectives on durable safety.
Review and interpret financial reports, B2B data, or regulatory filings to verify information accuracy.
Respond to specific prompts based on financial data to help AI models understand technical terminology and complex fiscal concepts.
Ensure that the outputs generated by AI systems align with professional financial standards and logical economic frameworks.
Prolific is building the biggest pool of quality human data in the world. Over 35,000 AI developers, researchers, and organizations use Prolific to gather data from paid study participants with a wide variety of experiences, knowledge, and skills. Prolific connects researchers and companies with a global pool of participants, enabling the collection of high-quality, ethically sourced human behavioural data and feedback.
Serves as the expert on civil engineering design for nuclear facilities.
Recommends engineering process and service improvements.
Leads functional teams, meeting milestones and objectives.
AtkinsRéalis is a world-class engineering services and nuclear organization. They create sustainable solutions that connect people, data and technology to transform the world's infrastructure and energy systems.
Design, build, and operate AI systems in production.
Build and maintain data pipelines that ingest, clean, transform, and version data.
Architect and build AI agent systems and orchestration layers.
RegScale is a continuous controls monitoring (CCM) platform that helps organizations automate and scale their security, risk, and compliance programs. They are transitioning from startup execution to a disciplined, enterprise-ready engineering organization, and building the team that will take them there.
Closely coordinate with management, inspectors, and city agencies regarding the safety of compromised structures and take appropriate remedial actions.
Respond to emergencies/incidents, as needed, 24/7; assignments may require Emergency Response out of state and may necessitate more than one day on site.
Perform in-depth review of structural plans and advise on engineering issues during the construction of large or small, technically complex projects and existing buildings.
The Department of Buildings' Forensic Engineering Unit ensures public safety within New York City's five boroughs. They respond to emergencies and referrals, addressing engineering and construction problems in compromised buildings and structures, as well as issues of egress, zoning, and occupancy.
Partner with full-stack and backend engineers on the features they are shipping, write tests that prove those features work, and flag gaps early.
Help build and run evaluation pipelines for non-deterministic LLM outputs, prompt regression, model drift detection, and output quality scoring across the LiteLLM routing layer.
Test the Nango-based integration layer across connectors and the file ingestion pipeline including encryption, formatting edge cases, and audit trail continuity.
Peach Pilot transforms how businesses run with a platform that ingests everything about how a company operates and constructs a Company Brain. It is a funded early-stage AI startup headquartered in Atlanta, Georgia, with a working platform on live infrastructure.
Help design the architecture of a system with multiple AI models, a set of backend APIs, and a frontend web application.
Write code within a small team, striking a reasonable balance between velocity and writing maintainable code.
Work with users and other team members to help define and refine product requirements, and translate them into a roadmap and code.
ConductorAI values candidates who can manage complexity and work independently. They are an equal opportunity employer using state-of-the-art tech to solve novel problems with mission partners.
Ship zero-to-one AI products end-to-end — from customer discovery and prototyping through production deployment and iteration
Build agentic AI systems — design and implement autonomous and semi-autonomous workflows using LLMs, tool-use, memory, and orchestration
Develop AI tools that improve efficiency across clinical operations, data extraction, manual workflows, and more
Natera is a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health. The Natera team consists of statisticians, geneticists, doctors, scientists, business professionals, software engineers and other professionals from world-class institutions.
Design and execute AI safety testing protocols for clinical AI modules.
Review and validate AI-generated clinical content against evidence-based guidelines.
Partner with the Sales team as a clinical SME during demos, prospect calls, and pilot evaluations.
Avo is a clinical AI platform used by healthcare enterprises to infuse trusted knowledge into clinical copilots, integrated directly into the EHR. It helps clinicians deliver higher-quality care without the documentation burden and is in a growth stage.
Architect intelligent workflows for Large Interaction Models.
Define the future of GTM.
Product Genius is engineering AI-native growth systems. They architect intelligent workflows for Large Interaction Models and value builders over managers.
Own the end-to-end systems that generate and process restaurant imagery and video at scale.
Build a style system that creates brand-appropriate outputs across restaurant types.
Go deep with models and prompting to push quality, consistency, and creative range.
Owner provides an AI-native system that local business owners, starting with restaurants, use to succeed, replacing multiple tools with one. Their team is in the low hundreds, and they attract top talent from companies like Shopify, HubSpot, and Stripe, scaling rapidly to keep pace with customer growth.
Be responsible for the end-to-end technical migration workflow for transitioning templates to LLM autoraters.
Utilize Automatic Prompt Generation (APG) tools to create baseline prompts for complex parent-child template clusters.
Manually draft, test, and refine prompts to navigate complex template architectures, overcome anti-patterns, and handle edge cases.
Welo Data specializes in AI services and data validation. The company's culture emphasizes innovation, with a focus on freelance and remote work opportunities, offering flexibility and a global perspective.
Conduct role-play sessions as a clinically informed persona across multi-turn conversations
Evaluate AI responses for safety, empathy, clinical appropriateness, and risk handling
Provide structured written feedback grounded in clinical reasoning
Vetto is a global platform connecting top-tier professionals to strategic AI projects. They aim to build trust, quality, and long-term value within the AI ecosystem for both talent and companies at the forefront of technology.
Consult clients during presales to assess AI readiness and translate visions into actionable requirements.
Architect multi-agent frameworks, design AI systems with defined roles, and implement learning & feedback loops.
Develop RAG pipelines, design custom models, and ensure governance, security, and cost-efficiency.
Sigma Software is seeking a Senior/Principal AI Engineer to join their Stellar AdTech Business Unit. They have delivered innovative systems to global AdTech leaders and startups since 2008, with a strong AdTech competence center of 300+ employees.