Architect and build agentic workflows that combine large language models, reasoning components, and data pipelines to create adaptive, goal-driven conversational systems
Lead the design and development of advanced ML/NLP products, from ideation to production - including model training, evaluation, optimization, and deployment
Drive experimentation with new approaches for agentic reasoning, coordination, and autonomous system design
SmartRecruiters is the Recruiting AI Company that transforms hiring for the world’s leading enterprises. Built for global scale, SmartRecruiters, an SAP company, delivers an AI-powered hiring platform that automates and optimizes the entire talent acquisition process, ensuring faster and smarter hiring decisions. They are a values-driven, globally focused tech company with strong financial backing and a bold vision for the future of work.
Build and ship AI-powered product features using LLMs and generative models
Develop and maintain services and APIs around ML models
Integrate AI models into production systems and user-facing applications
Social Discovery Group (SDG) is the third-largest social discovery company in the world, uniting 60+ brands with 500 million users. They transform virtual intimacy into the new normal by solving the problems of loneliness, isolation, and disconnection. Their international team of 1,200 professionals and digital nomads works all over the world.
Design, develop, and refine large language model workflows to steer and improve model behaviors.
Build language processing components for intent detection, summarization and conversational response quality.
Drive R&D-style exploration on cutting-edge speech and language systems, rapidly prototyping novel approaches.
Cresta's platform combines AI and human intelligence to help contact centers discover customer insights and behavioral best practices, automate conversations, and empower team members. They are led by founders with experience at Google, Waymo, and OpenAI, and are on a mission to revolutionize the workforce with AI.
Design, implement, and evaluate machine learning models and AI algorithms.
Develop and optimize prompts for LLMs to improve model outputs.
Collaborate with software engineers, data scientists, and product teams.
Cadre AI is focused on building and optimizing AI-powered platforms, bringing together cutting-edge technologies and expertise in machine learning and large language models. The team is dedicated to advancing AI capabilities and applying them to real-world challenges through scalable, high-impact solutions.
Design, implement and validate high-reliability, distributed platforms for machine learning, natural language processing, and LLMs.
Create, debug, interpret and improve production machine learning and natural language processing models.
Build the tools and validation processes that help Counterpart translate insights into action at scale.
Counterpart Health transforms healthcare and improves patient care with its innovative primary care tool, Counterpart Assistant. They are a subsidiary of Clover Health, with an exceptional team of value-based care and technology experts, driving value-based care at the speed of software.
Design and develop an AI-powered productivity analytics platform.
Build scalable LLM pipelines and create a meta-workflow system.
Develop system-level prompt engineering and build an evaluation framework for AI output quality control.
Appflame is a Ukrainian product-driven tech company committed to building world-class products. They have 500+ team members, offices in Kyiv, London, and Limassol, and a co-working hub in Warsaw. They value bold, driven people who are passionate about building real products.
Act as a Player/Coach: Architect the system and mentor the team, but spend significant time hands-on in the codebase (Python/PyTorch).
Drive our strategy for SFT (Supervised Fine-Tuning) and RLHF/DPO (Preference Optimization).
Build the immune system of the platform. You will design and train custom classifiers to detect and filter non-consensual or illegal content within an explicit environment.
EverAI is building the future of AI companionship, creating entirely new categories in the AI world. With 50 million users, the company is composed of approximately 75 enthusiastic and hardworking individuals, backed by a founding team with experience in scaling web products.
Contribute to designing, evaluating, and shipping our mental health AI Agent and its supporting infrastructure.
Develop and maintain robust data pipelines to power model training and evaluation.
Partner with AI Research, Product, and Engineering teams to define new features.
Sword Health is shifting healthcare from human-first to AI-first through its AI Care platform. They aim to make world-class healthcare available anytime, anywhere, while significantly reducing costs. Backed by clinical studies and patents, Sword Health has raised more than $500 million from leading investors.
Design and implement AI-powered systems using a mix of classical ML techniques and modern LLM-based approaches.
Apply a range of techniques, from classical ML to LLM-based approaches, with a strong focus on reliability, performance, and maintainability.
Collaborate closely with product managers and designers to deliver high-quality, customer-focused features.
Optro is a leading audit, risk, ESG, and InfoSec platform, exceeding $300M ARR and experiencing continuous growth. They empower over 50% of the Fortune 500 with their award-winning technology, fostering innovation and customer satisfaction, and are recognized as one of North America's fastest-growing tech companies.
Vetto is a global talent platform connecting top-tier professionals to high-impact AI projects around the world. Their mission is to build trust, quality, and long-term value in the AI ecosystem - for both exceptional talents and companies operating at the frontier of technology.
Design and implement evaluation systems and tooling to validate Oura’s custom AI models and Advisor
Develop novel evaluation methods to measure grounding, reliability, and actionability of LLM and agentic systems
Build and optimize custom AI models through fine-tuning, knowledge distillation, and quantization
Oura's mission is to empower every person to own their inner potential. Their award-winning products help their global community gain a deeper knowledge of their readiness, activity, and sleep quality by using their Oura Ring and its connected app. They are focused on helping people live healthier and happier lives, and ensure that their team members have what they need to do their best work — both in and out of the office.
Design and build the evaluation infrastructure that ensures the platform's AI systems produce accurate, well-sourced, high-quality responses
Build automated test suites that validate answer quality across agent pipeline changes
Develop regression detection systems that catch quality degradation before it reaches users
IDC is building the next generation of AI-powered intelligence platforms that transform how technology decisions get made. As the premier global provider of trusted technology intelligence, IDC equips business and technology leaders with the evidence they need to make confident decisions.
Work with designers and product managers to create high-performing product features.
Apply ML techniques to LLM-based approaches with a strong focus on reliability, performance, and maintainability.
Optro is the leading audit, risk, ESG, and InfoSec platform on the market and has surpassed $300M ARR. They inspire each other to innovate and assist each other to create the most loved platform, which has allowed them to become one of the 500 fastest-growing tech companies in North America.
Design and implement scalable ML infrastructure to support model development and deployment
Develop and maintain evaluation frameworks for Large Language Models (LLMs), including RAG-based systems
Evaluate model performance using tools such as RAGAS, DeepEval, or similar frameworks
EX Squared LATAM collaborates with global clients to build innovative digital solutions that drive real business impact. They foster a collaborative, inclusive, and innovation-driven culture where continuous learning and professional growth are at the core of everything they do.
Scope and lead ML initiatives end-to-end from identifying opportunities through production deployment.
Design, develop, and optimize ML models and AI systems for document processing and automation.
Build and maintain production ML pipelines that are robust, observable, and scalable.
Medallion is a healthcare technology company building a provider operations platform to eliminate administrative bottlenecks. They are one of the fastest-growing healthcare technology companies, with a mission to transform healthcare at scale, and are backed by $130M in funding.
Own agent quality end-to-end: diagnosis, improvement, and validation across SmartAssist's orchestrator and subagents
Drive quality improvements through prompt engineering, context engineering, and RAG retrieval tuning
Extend and mature our evaluation framework: scorers, golden datasets, regression gates, and online evaluation for production traffic
Smartsheet has been helping people and teams achieve for over 20 years. They are building tools that empower teams to automate the manual, uncover insights, and scale smarter.
Developing ranking and recommendation models that identify high-performing team designs.
Building brandification pipelines to conform to an organisation's brand guidelines.
Building layout extraction and understanding systems that parse Canva's design format.
Canva is a design platform that makes it easy for anyone to create professional-looking designs. They have a flagship campus in Sydney, a second campus in Melbourne, and co-working spaces in Brisbane, Perth, and Adelaide, and they provide flexibility in how and where you work.
Design and optimise AI-ready tools and APIs that enable LLM platforms to reliably interact with Canva's design capabilities.
Build and maintain evaluation frameworks to systematically measure tool-use accuracy across platforms.
Experiment with LLM orchestration and agent architectures: develop Canva agents that any third-party provider can call to design quickly, efficiently, and at scale.
Canva is a platform redefining how the world experiences design. They have a flagship campus in Sydney, with a second campus in Melbourne and co-working spaces in Brisbane, Perth, Adelaide, and Auckland, NZ.
Drive the design and evolution of AI-ready tools and APIs for LLM platforms.
Own and evolve evaluation frameworks that measure tool-use accuracy across platforms.
Shape Canva's agent architecture, making strategic technical decisions about intelligence location.
Canva is a design platform that enables users to create various visual content. They have offices in multiple locations in Australia and New Zealand, and they offer a flexible work environment.
Be responsible for the end-to-end technical migration workflow for transitioning templates to LLM autoraters.
Use the client’s internal tools to apply prompt engineering techniques that maximize model performance.
Solve edge-case scenarios by designing and refining manual prompts.
Welo Data provides AI services and focuses on data validation. The job posting does not state the company's size or describe its culture, but it appears to be a fast-growing company.