Contribute to Project Spearmint, a multilingual AI response evaluation project, by reviewing large language model (LLM) outputs in Swedish, focusing on Tone or Fluency. Native-level fluency in Swedish and strong English comprehension are required to assess model-generated replies based on specific quality dimensions and validate evaluation frameworks. Help establish baseline quality metrics for future model development.
Job listings
Evaluate machine-translated online messages by assessing their accuracy, fluency, and overall quality and by assigning semantic similarity scores. The texts you'll be working with are similar to what you'd find in online conversations or on social media.
Join Welo Data's global contributor network and be first in line for flexible, remote projects evaluating and improving AI. Projects involve annotating, evaluating, and creating prompts. Receive clear details on tasks, timelines, and compensation before each project, allowing you to choose what works best for you.