Source Job

20 jobs similar to Mathematics Specialist (Fluent in German) – AI Trainer

Jobs ranked by similarity.

US

  • Translate informal mathematical proofs into Lean (and related proof systems) with an emphasis on clarity, structure, and correctness.
  • Analyze generic and domain-specific proofs, identifying gaps, hidden assumptions, and formalizable sub-structures.
  • Construct formalizations that test the limits of existing proof assistants—especially where tools struggle or fail.

Alignerr partners with leading AI labs to build expert-driven workflows that improve model reasoning. They recruit top mathematicians and specialists to solve tasks where automated tools fail, advancing AI reliability, formalization, and high-integrity dataset creation.

Global

  • Completing AI training tasks such as analyzing, editing, and writing in Mandarin
  • Judging the performance of AI on Mandarin prompts
  • Improving cutting-edge AI models

Prolific is not just another player in the AI space: it is building the biggest pool of quality human data in the world. Over 35,000 AI developers, researchers, and organizations use Prolific to gather data from paid study participants with a wide variety of experiences, knowledge, and skills.

$25–$30/hr
Global

  • Evaluate AI-generated French speech and text for linguistic accuracy, naturalness, and educational quality.
  • Assess learner speech and writing across proficiency levels from CEFR Pre-A1 through B2+.
  • Apply expert judgment to identify learner errors, unnatural phrasing, and pedagogical gaps.

Alignerr partners with leading AI labs to build expert-driven data pipelines. They improve how models reason, learn, and communicate by working with domain specialists to evaluate and refine AI systems where precision, pedagogy, and human judgment matter most.

Europe

  • Recording spoken versions of technical scripts in fluent, natural-sounding German (Germany)
  • Delivering clean, well-paced audio with high quality voice performance
  • Maintaining consistency in tone, clarity, and pronunciation

Upwork has partnered with an enterprise client who is a leader in the data-centric AI space, providing a platform used to develop intelligent applications. Their technology enables teams to apply the right balance of human supervision and automation to AI development.

Egypt

  • Design scenario-based and edge-case prompts to test AI behavior.
  • Develop evaluation rubrics to assess AI responses across multiple criteria.
  • Perform side-by-side evaluations of AI outputs and score them using defined criteria.

Welo Data, part of Welocalize, is a global AI data company with 500,000+ contributors delivering high-quality, ethical data to train the world’s most advanced AI systems. They are building smarter, more human AI with a diverse community in 100+ countries.

Europe

  • Evaluate AI-generated responses for accuracy, grammar, and cultural relevance.
  • Identify issues and provide refined, high-quality rewritten responses.
  • Create natural prompts and responses in Spanish to improve conversational datasets.

Welo Data, part of Welocalize, is a global AI data company with 500,000+ contributors delivering high-quality, ethical data to train the world’s most advanced AI systems. They're building smarter, more human AI with a diverse community in 100+ countries.

  • Challenge advanced language models on realistic infrastructure and platform scenarios.
  • Verify architectural soundness and logical correctness, and assess code quality and testing strategies.
  • Analyze performance bottlenecks and deployment risks, capture reproducible failure cases, and suggest improvements.

The company is hiring for a SWE Infrastructure Specialist. As a contractor, the specialist must supply a secure computer and high-speed internet; company-sponsored benefits such as health insurance and PTO do not apply.

Indonesia

  • Review short, pre-segmented datasets.
  • Evaluate model-generated replies based on Tone or Fluency.
  • Read a user prompt and two model replies, then rate each using a five-point scale.

CrowdGen, by Appen, focuses on AI response evaluation. They are looking for native Javanese speakers to contribute to a multilingual AI response evaluation project that involves reviewing large language model outputs.

$30–$75/hr
US

  • Train and refine Grok for voice interactions across diverse languages.
  • Curate and annotate high-quality audio data to enhance Grok's global accessibility.
  • Collaborate with technical staff to improve AI's handling of multilingual audio nuances.

xAI aims to create AI systems that understand the universe and aid humanity. The team is small, motivated, and focused on engineering excellence with a flat organizational structure, expecting all employees to be hands-on.

Europe

  • The Career Success Coach (CSC) will play a key role in ensuring the success of learners in Correlation One’s world-class data training and jobs programs.
  • The CSC will work alongside a team of Teaching Assistants and Correlation One program operations staff to provide professional development coaching support to a cohort of ~60 learners.
  • Candidates must be fluent in German and English and passionate about workforce development in the AI and machine learning ecosystem.

Correlation One develops the workforce’s skills for the AI economy. They work with enterprises and governments to develop talent and close critical data, digital, and technology skills gaps. Their global programs empower underrepresented communities and accelerate careers.

US

  • Evaluate LLMs in areas of finance where models do not perform well.
  • Leverage your experience in finance to help AI learn how to build models, conduct financial analyses, and perform related tasks.
  • Create rubrics to assess model capabilities on specific areas of your finance expertise.

The client is one of the world's fastest-growing AI companies, accelerating the advancement and deployment of powerful AI systems. They help customers in two ways: working with the world’s leading AI labs to advance frontier model capabilities, and leveraging that work to build real-world AI systems that solve mission-critical priorities for companies.

$80,000–$150,000/yr

  • Research, Document, Test, and Ideate: Explore the best ways to achieve our customers’ goals using LLMs and other AI tools.
  • Master Our Dialogue Platform: Become an expert, answer questions, and train others on prompting both within and outside of our platform.
  • Train Our AIs: Utilize prompting, knowledge-base creation, and fine-tuning to enhance our AI capabilities.

1mind is a platform that deploys multimodal Superhumans for revenue teams, combining a face, a voice, and a GTM brain. The company has a remote-first, fast-moving culture with ownership, autonomy, and impact from day one.

North America

  • Design, refine, and evaluate prompts, context, and system instructions for various product use cases
  • Conduct experiments to assess model behavior, accuracy, and cost impact with new or existing prompts
  • Continuously improve prompt engineering processes by adopting new techniques and technologies

Applied Systems transforms the insurance industry. They have 40+ years of experience and are building a team ready to learn and deliver innovative software and services.

Global

  • Evaluate AI-generated Japanese speech and text for linguistic accuracy, naturalness, and educational quality.
  • Assess learner speech and writing across proficiency levels from CEFR Pre-A1 through B2+.
  • Apply expert judgment to identify learner errors, unnatural phrasing, and pedagogical gaps.

Alignerr partners with leading AI labs to build expert-driven data pipelines that improve how models reason, learn, and communicate. They work with domain specialists around the world to evaluate and refine AI systems in areas where precision, pedagogy, and human judgment matter most.

Global

  • Completing AI training tasks such as analyzing, editing, and writing computer science–related content
  • Judging the performance of AI on programming, algorithms, data structures, and systems prompts
  • Improving cutting-edge AI models using your understanding of software engineering and computational thinking

Prolific is building the biggest pool of quality human data in the world. Over 35,000 AI developers, researchers, and organizations use Prolific to gather data from paid study participants with a wide variety of experiences, knowledge, and skills.

Global

  • Data collection, evaluation, and annotation.
  • Pairwise comparisons.
  • Object tagging and labeling across different content types (audio, video, images, or collected data)

RWS enhances communication and delivers value. They embrace DEI and promote equal opportunity, committed to a work environment free of discrimination and harassment.

US, Canada, New Zealand, UK, Australia

  • Develop, solve, and review advanced material science problems with real-world relevance.
  • Apply expertise in semiconductor materials, molecular modeling, or related areas to design complex problem statements.
  • Collaborate asynchronously with AI researchers and domain experts to enhance AI model reasoning.

Alignerr partners with the world’s leading AI research teams and labs to build and train cutting-edge AI models. They are looking for a Material Science Expert to join their team.

Europe

  • Design, optimize, and version prompts for production voice and chat LLM applications.
  • Architect and orchestrate multi-agent systems for complex conversations.
  • Build automated testing and validation frameworks for LLM outputs.

Tuotempo transforms healthcare experiences through intelligent digital solutions and is a trusted patient engagement platform powering some of Europe and Latin America's leading healthcare institutions. They have a remote-first culture with vibrant hubs in Bologna and Barcelona.

$80,000–$150,000/yr
US

  • Research, Document, Test, and Ideate: Explore the best ways to achieve our customers’ goals using LLMs and other AI tools.
  • Master Our Dialogue Platform: Become an expert, answer questions, and train others on prompting both within and outside of our platform.
  • Train Our AIs: Utilize prompting, knowledge-base creation, and fine-tuning to enhance our AI capabilities.

1mind is a platform that deploys multimodal Superhumans for revenue teams. These Superhumans combine a face, a voice, and a GTM brain, equipped with deep technical and product knowledge. The company has a remote-first, fast-moving culture with ownership, autonomy, and impact from day one.

Global

  • Evaluate AI model outputs related to your field.
  • Assess content relevant to your area of expertise.
  • Deliver clear feedback to improve the model's comprehension.

Handshake is recruiting College Career/Technical Education Professors to contribute to an hourly, temporary AI research project. In this program, you’ll leverage your professional experience to evaluate what AI models produce in your field.