Conduct experiments with LLMs and evaluate different architectures and techniques to improve conversational AI quality.
Develop and maintain robust evaluation frameworks to assess model performance, accuracy, and user satisfaction using offline and online metrics.
Optimize models for inference, improving speed, efficiency, and scalability for production environments.
Social Discovery Group (SDG) unites millions of users on dozens of products, solving loneliness, isolation, and disconnection by transforming virtual intimacy into the new normal. Their international team of 1000+ professionals works remotely from various locations, and they've been recognized as a "Great Place to Work".
Architect and build agentic workflows that combine large language models, reasoning components, and data pipelines to create adaptive, goal-driven conversational systems
Lead the design and development of advanced ML/NLP products from ideation to production, including model training, evaluation, optimization, and deployment
Drive experimentation with new approaches for agentic reasoning, coordination, and autonomous system design
SmartRecruiters is the Recruiting AI Company that transforms hiring for the world’s leading enterprises. Built for global scale, SmartRecruiters, an SAP company, delivers an AI-powered hiring platform that automates and optimizes the entire talent acquisition process, ensuring faster and smarter hiring decisions. They are a values-driven, globally focused tech company with strong financial backing and a bold vision for the future of work.
Developing ranking and recommendation models that identify high-performing team designs.
Building brandification pipelines that make designs conform to an organisation's brand guidelines.
Building layout extraction and understanding systems that parse Canva's design format.
Canva is a design platform that makes it easy for anyone to create professional-looking designs. They have a flagship campus in Sydney, a second campus in Melbourne, and co-working spaces in Brisbane, Perth, & Adelaide, and provide flexibility in how and where you work.
Design, build, and iterate on machine learning models and LLM-based systems that power critical decisions across fraud, compliance, growth, and operations
Work with messy, real-world data to identify signals, build features, and continuously improve model performance
Make practical tradeoffs between model performance, interpretability, and operational cost
River is building the world’s most trusted financial institution to empower people to take ownership of their financial lives through Bitcoin. River is growing quickly and has raised more than $50 million from leading investors.
Design, train, evaluate, and deploy ML systems powering real-time voice experiences including ASR, speech understanding, turn detection, and text-to-speech.
Improve voice AI quality through error analysis, data curation, metric design, and iterative model improvement with a focus on real-world performance.
Build evaluation frameworks for complex voice systems, measuring accuracy, robustness, latency, naturalness, and task completion.
Cresta turns customer conversations into a competitive advantage for contact centers by combining AI and human intelligence to uncover insights, automate processes, and empower teams. Born out of the Stanford AI Lab and led by industry pioneers like Sebastian Thrun, it offers a dynamic, mission-driven culture focused on revolutionizing the workforce with AI.
Vetto is a global talent platform connecting top-tier professionals to high-impact AI projects around the world. Their mission is to build trust, quality, and long-term value in the AI ecosystem, both for exceptional talent and for companies operating at the frontier of technology.
Design and implement scalable ML infrastructure to support model development and deployment
Develop and maintain evaluation frameworks for Large Language Models (LLMs), including RAG-based systems
Evaluate model performance using tools such as RAGAS, DeepEval, or similar frameworks
EX Squared LATAM collaborates with global clients to build innovative digital solutions that drive real business impact. They foster a collaborative, inclusive, and innovation-driven culture where continuous learning and professional growth are at the core of everything they do.
Act as a Player/Coach: Architect the system and mentor the team, but spend significant time hands-on in the codebase (Python/PyTorch).
Drive our strategy for SFT (Supervised Fine-Tuning) and RLHF/DPO (Preference Optimization).
Build the immune system of the platform. You will design and train custom classifiers to detect and filter non-consensual or illegal content within an explicit environment.
EverAI is building the future of AI companionship, creating entirely new categories in the AI world. With 50 million users, the company is composed of approximately 75 enthusiastic and hardworking individuals, backed by a founding team with experience in scaling web products.
Scope and lead ML initiatives end-to-end from identifying opportunities through production deployment.
Design, develop, and optimize ML models and AI systems for document processing and automation.
Build and maintain production ML pipelines that are robust, observable, and scalable.
Medallion is a healthcare technology company building a provider operations platform to eliminate administrative bottlenecks. They are one of the fastest-growing healthcare technology companies, with a mission to transform healthcare at scale, and are backed by $130M in funding.
You will personally own the Intelligence Engine (Scoring & Activation) and the behavioral scoring algorithms that power user-facing activation across B2C and B2B.
You will own the entire ML Pipeline Architecture, from data ingestion and feature stores to model training, evaluation, deployment, monitoring, and retraining triggers.
You will be responsible for LLM Integration & Optimization, integrating, fine-tuning, and deploying large language models for contextual inference, personalization, and behavioral pattern recognition.
Gesture is a fast-growing tech company using AI, machine learning, and intelligent logistics to power a unique platform that connects people and brands through real-world, tangible experiences. Inside their NYC headquarters, you'll find an environment that moves with the pace and precision of Silicon Valley but with the heart of something far greater.
Lead the design, development, and deployment of production, multi-turn LLM-powered features.
Own backend services in Python that integrate LLM agents with Fullscript’s platform and support reliable production use.
Partner with medical, product, and engineering teams to identify high-value opportunities for AI and turn them into practical, scalable product capabilities.
Fullscript is a health technology company committed to helping people get better by creating a platform that powers every part of care. More than 125,000 practitioners use Fullscript for clinical insights, lab interpretations, patient analytics, education, and access to high-quality supplements.
Design features connecting natural language queries with a large corpus of legal knowledge.
Build a data architecture you are proud to highlight.
Use unstructured data to build large-scale datasets.
Trellis Law is the leading provider of state trial court data in the U.S. They leverage AI and Machine Learning to analyze hundreds of millions of state trial court documents, transforming complex data into actionable insights. Founded in 2018, Trellis has experienced rapid growth and is now trusted by many of the nation’s largest law firms and corporate legal teams.