Challenge AI models on realistic educational scenarios.
Validate whether their understanding of pedagogical concepts reflects best-in-class teaching practice.
Evaluate AI outputs for clarity and correctness, analyze subtle reasoning errors, and document gaps in logic.
The company is seeking independent Instructional Experts with hands-on experience teaching, tutoring, or building curriculum to train AI models. As a contractor, you’ll supply a secure computer and high-speed internet; company-sponsored benefits such as health insurance and PTO do not apply.
Deep Research Insights: Collaborate with the Research Team to produce technical marketing material and research blogs.
Own the Knowledge Base: Gather information, write, and review content for the LILT Knowledge Base.
Liaise with GTM to produce content for technical customers, buyers, product leaders, and AI leaders.
LILT is on a mission to make the world’s information accessible to everyone, regardless of the language they speak. They use cutting-edge AI, machine translation, and human-in-the-loop expertise to translate content faster, more accurately, and more cost-effectively.
Design job-related prompts and review AI-generated responses.
Evaluate for quality, creativity, clarity, voice, and relevance.
Provide feedback to help AI understand creative writing techniques.
Handshake is connecting students with job opportunities. They run an AI program year-round, with projects opening periodically across different areas of expertise.
Contribute to an hourly, temporary AI research project; no AI experience is needed.
Evaluate what AI models produce in your field, assess content, and deliver clear feedback.
Work independently from anywhere with flexible hours and no minimum commitment.
Handshake connects students and early talent with companies. They are helping students get hired and are on a mission to close the opportunity gap.
Play an instrumental role in shaping AI interactions and experiences by designing clear, trustworthy content.
Lead significant AI initiatives, ensuring effective integration of language and interaction frameworks.
Collaborate to turn complex AI functionalities into user-friendly experiences, impacting customer interaction.
Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Jobgether identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company.
Engage the model with investment scenarios, analytical questions, and market-based reasoning tasks; verify factual correctness and financial logic.
Assess the validity of investment reasoning; capture reproducible error traces; and provide structured feedback to improve prompts, evaluation frameworks, and analytical depth.
Identify where models oversimplify market behavior or misinterpret financial data.
They are evolving large-scale language models from simple conversational tools into systems capable of analyzing financial markets, interpreting investment strategies, and supporting decision-making across asset classes. They seem to have a growing team.
Participate in 15–60 minute recorded conversations.
Collaborate with the Data Operations team.
Contribute to high-quality conversational datasets.
Neon collaborates with prominent AI labs and tech companies to create premium conversational voice datasets, fostering advancements in speech and conversational AI. They seem to be a smaller company focusing on specialized data solutions.
You will be matched with another participant for 1-on-1 verbal or text-based exchanges.
Use your natural Dutch, as spoken in the Netherlands, to discuss various topics provided by the researcher.
Help the AI understand the nuances, slang, and cultural context of Dutch from the Netherlands through real-world interaction.
Prolific is building the biggest pool of quality human data in the world. Over 35,000 AI developers, researchers, and organizations use Prolific to gather data from paid study participants with a wide variety of experiences, knowledge, and skills.
Use financial analysis, modeling, and advisory experience to evaluate AI content.
Provide feedback to help AI understand financial concepts.
Work independently on a flexible schedule with no minimum hour requirement.
Handshake is connecting students and employers. Through Handshake, finance professionals help AI to better understand financial concepts, quantitative reasoning, industry terminology, and professional communication.
Converse with the model on language scenarios, and verify factual accuracy and logical soundness.
Capture reproducible error traces and suggest improvements to our prompt engineering and evaluation metrics.
Challenge advanced language models on topics like verb conjugation, noun-adjective agreement, sentence structure, word order, accentuation, and colloquial expressions.
They are evolving large-scale language models from clever chatbots into powerful engines of linguistic discovery. This project needs your expertise to supply high‑quality training data for the next generation of AI, one that can democratize world‑class education.
Migrate and test existing bulk flashcard creation prompts.
Run test suites and manually review AI outputs for quality and correctness.
Analyze real user data to identify failure patterns and improve prompts.
Brainscape is the world's leading web & mobile EdTech study platform. They help millions of learners create better flashcards, and they are looking for an AI Prompt Engineer to join their team.
Evaluate AI-generated content using your biological training.
Provide feedback to help AI better understand biological reasoning.
Work on a flexible, asynchronous schedule with no minimum hour requirement.
Handshake AI utilizes AI technology. They value expertise in biological reasoning, experimental design, data interpretation, and scientific problem-solving.
CentralReach is a leading provider of autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. With over 200,000 users and backed by Roper Technologies, Inc., they are entering an exciting phase of growth and innovation.
Build with AI tools like Claude Code, Cursor, or GitHub Copilot to change how you approach every problem.
Work across the full stack, contributing where needed, from debugging integrations to designing agent workflows or refining React components.
Shape our AI-assisted development practices, advancing context engineering and discovering new ways to leverage AI tools as they rapidly evolve.
RWS is focused on growing the value of ideas, data, and content by ensuring organizations are understood. The Product & Technology team, with over 500 staff, establishes unified standards and governance practices throughout the company, overseeing the development and maintenance of core applications.
Airtable is the no-code app platform that empowers people closest to the work to accelerate their most critical business processes. More than 500,000 organizations, including 80% of the Fortune 100, rely on Airtable to transform how work gets done.
Contribute to AI model training initiatives by curating code examples, offering precise solutions, and providing meticulous corrections in specialized programming languages.
Evaluate and refine AI-generated code, ensuring it adheres to industry standards for efficiency, scalability, and reliability.
Collaborate with cross-functional teams to enhance AI-driven coding solutions, ensuring they meet enterprise-level quality and performance benchmarks.
xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Their team is small, highly motivated, and focused on engineering excellence with a flat organizational structure.