Source Job

20 jobs similar to Red Teaming | Generative AI Analyst

Jobs ranked by similarity.

US

  • Interact with generative AI models using project-provided guidelines, safety taxonomies, and attack-vector guidance.
  • Create and evaluate prompts designed to test model behavior across safety-related categories.
  • Identify where model responses become unsafe, noncompliant, inconsistent, or otherwise problematic.

Welo Data is an AI services company that specializes in data annotation. They deliver multilingual content transformation services in translation, localization, and adaptation for over 250 languages with a growing network of over 400,000 in-country linguistic resources.

US

  • Perform annotation and labeling tasks for generative AI datasets, including text, image, video, and multimodal content.
  • Create, review, and evaluate prompts and responses across a variety of domains and use cases.
  • Conduct quality assurance reviews to ensure annotation accuracy, consistency, and adherence to guidelines.

Welo Data delivers multilingual content transformation services in translation, localization, and adaptation for over 250 languages. They drive innovation in language services, delivering high-quality training data transformation solutions for NLP-enabled machine learning, with a network of over 400,000 in-country linguistic resources.

$28/hr
US

  • Review, refine, and validate AI translation prompts for attraction and travel content.
  • Optimize AI-generated translations to meet standards of naturalness and fluency.
  • Proofread bilingual descriptions for accuracy and readability, refining prompts as needed.

Welo Data provides AI services. We have an exciting community and are looking for collaborators.

$16–$20/hr
Europe

  • Review, refine, and validate AI translation prompts for attraction and travel content.
  • Optimize AI-generated translations to ensure naturalness, fluency, and cultural relevance.
  • Test language prompts to ensure the output meets the required standards.

Welo Data provides AI services. They focus on helping businesses leverage the power of artificial intelligence to improve their operations and create innovative solutions.

$15/hr
US

  • Identify and label languages and dialects from model-generated responses.
  • Review outputs from two different AI models and determine which model correctly identified the proposed language.
  • Compare model responses and select the appropriate evaluation outcome from predefined options.

RWS – TrainAI is looking for Language Data Annotators. The company embraces DEI, promotes equal opportunity, and prohibits discrimination and harassment of any kind.

$45/hr
US Canada

  • Evaluate and improve model safety: Label, rank, audit, and refine human- and model-generated text to improve safety, quality, and policy alignment.
  • Apply nuanced safety judgment: Assess model outputs against detailed safety guidelines, rubrics, and style standards, making consistent decisions across ambiguous, sensitive, and context-dependent cases.
  • Create prompts and safety test cases: Write realistic prompts, user scenarios, and adversarial examples that help evaluate model behavior across safety categories and uncover unsafe, evasive, over-refusing, or policy-inconsistent responses.

Cohere's mission is to scale intelligence to serve humanity by training and deploying frontier models for developers and enterprises. They are a team of researchers, engineers, and designers passionate about their craft, believing that a diverse range of perspectives is a requirement for building great products.

$160,000–$240,000/yr
US

  • Build agentic AI systems that change how Dataiku runs internally.
  • Turn real problems into working software.
  • See your solutions through from first conversation to production.

Dataiku is the Platform for AI Success, the enterprise orchestration layer for building, deploying, and governing AI. The world’s leading companies rely on Dataiku to operationalize AI and run it as a true business performance engine delivering measurable value.

$115,000–$130,000/yr
US 4w PTO

  • Write, iterate, and maintain system prompts and instruction sets for Noodle’s AI agents across the student journey.
  • Build and maintain evaluation frameworks to measure agent accuracy, tone, hallucination rate, task completion, and alignment with rubric-based learning objectives.
  • Partner with Noodle teammates and university stakeholders to design, build, and test agents — translating learning objectives, operational flows, rubric assessments, and more into prompt-level agent instructions.

Noodle is higher education’s leading strategy, services, and technology partner. It develops infrastructure, provides life-changing learning experiences, and grows awareness of and enrollment in some of the best academic institutions in the world. They empower universities to change the world by offering university partners various products and services.

$20/hr
Europe

  • Review and refine AI translation prompts for attraction and travel content.
  • Optimize AI-generated translations to ensure naturalness and fluency.
  • Identify and flag issues that cannot be resolved through prompt tuning.

Welo Data provides AI services. It is a remote, freelance-based company that seems to value collaboration and contribution from its community members.

$12/hr
Global

  • Review, refine, and validate AI translation prompts for attraction and travel content.
  • Optimize AI-generated translations, ensuring naturalness, fluency, and cultural relevance.
  • Review pre-written prompt instructions for tone, grammar, and proper name handling.

Welo Data is an AI services company. They focus on providing AI solutions to various industries.

Europe

  • Review pre-written prompt instructions for tone, grammar, proper name handling, and measurements.
  • Add language-specific grammar or stylistic notes to enhance prompt accuracy.
  • Translate product-specific terms and cross-check against approved glossaries.

Welo Data provides AI services. They focus on general applications of AI services.

$15/hr
Europe

  • Review pre-written prompt instructions for tone and grammar.
  • Translate product-specific terms and cross-check against glossaries.
  • Run sample attraction descriptions through GPT-4o and refine prompts.

Welo Data is an AI services company. It focuses on providing Language Specialists to review, refine, and validate AI translation prompts. The company appears to be a community where people can collaborate on exciting projects.

Global

  • Perform side-by-side (SBS) comparisons of AI-generated responses.
  • Evaluate outputs based on accuracy, relevance, clarity, and instruction-following.
  • Apply detailed, scenario-specific annotation guidelines and maintain consistency and high-quality evaluations.

Blueprint Technologies is a technology solutions firm headquartered in Bellevue, Washington, with a strong presence across the United States and an expanding footprint across Latin America (LATAM). Our people bring diverse perspectives, deep expertise, and real-world experience across industries to help organizations grow, transform, and innovate.

  • Evaluate outputs based on accuracy, relevance, clarity, and instruction-following.
  • Perform side-by-side (SBS) comparisons of AI-generated responses.
  • Identify nuances in tone, meaning, and cultural context in French-language content.

Blueprint Technologies is a technology solutions firm headquartered in Bellevue, Washington, with a strong presence across the United States and an expanding footprint across Latin America (LATAM). They are united by a shared passion for solving complex problems and bring diverse perspectives, deep expertise, and real-world experience across industries to help organizations grow, transform, and innovate.

  • Review, refine, and validate AI translation prompts for attraction and travel content.
  • Optimize AI-generated translations to meet standards of naturalness and fluency.
  • Test and refine language prompts to ensure output meets cultural relevance.

Welo Data is an AI services company. We are looking for talented people to join our community and contribute to exciting projects related to artificial intelligence.

US Canada Mexico Australia New Zealand Argentina

  • Assess the factual accuracy, relevance, and quality of AI-generated Computer Science content.
  • Craft and answer domain-specific questions related to Computer Science and adjacent technical disciplines.
  • Evaluate and rank AI-generated responses based on technical correctness and reasoning quality.

The company is seeking Computer Science Experts with PhDs to support the training and evaluation of advanced AI models. This initiative focuses on improving the accuracy, reasoning, and domain expertise of generative AI systems through expert human feedback.

Europe US 4w PTO

  • Continuously explore emerging shifts in AI interfaces, orchestration, agents, and autonomy through hands-on experimentation and ecosystem research.
  • Rapidly prototype, validate, and launch new AI-native product ideas with minimal support and high autonomy.
  • Use structured thinking, research, and experimentation to evaluate what n8n should invest in over the next 1–3 years.

n8n is the open workflow orchestration platform built for the new era of AI. They give technical teams the freedom of code with the speed of no-code, so they can automate faster, smarter, and without limits. Since their founding in 2019, they’ve grown into a diverse team of over 260 working across Europe and the US.

Australia Canada France Germany Spain US Unlimited PTO

  • Design and build Claude skills, MCP integrations, and automated pipelines that transform internal knowledge into publication-ready docs with minimal manual intervention.
  • Act as the final reviewer for content produced by AI-assisted workflows and engineers, maintaining a high bar for technical accuracy and polish.
  • Define content structures and metadata standards that ensure our documentation is agent-consumable and machine-parseable.

Upsun, formerly Platform.sh, is the cloud application platform humans and robots love. They give developers, DevOps engineers, and platform teams the ability to build, ship, and scale confidently without wrestling with backend infrastructure.

$3,850/yr
US UK Canada

  • Fellows will use external infrastructure to work on an empirical project aligned with research priorities.
  • Projects aim to produce a public output, such as a paper submission.
  • Fellows receive mentorship and can access a shared workspace in Berkeley or London.

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. Their team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

  • Review, refine, and validate AI translation prompts for attraction and travel content.
  • Optimize AI-generated translations to meet the standards of naturalness, fluency, and cultural relevance.
  • Run sample attraction descriptions through GPT-4o, evaluate output, and refine prompts as needed.

Welo Data provides AI services. They seem to have a collaborative environment, emphasizing contributions to exciting projects.