Own the ML serving API and deploy models to production with CI/CD and infrastructure as code.
Build monitoring, alerting, and reliability for NBA models and LLM agents.
Drive architectural decisions and mentor engineers on MLOps patterns.
Clutch is a vertical SaaS company backed by Andreessen Horowitz, revolutionizing how credit unions engage with members via fintech lending software. The company is small and ambitious, with a lean data team of five that values pragmatism and fast shipping.
Design and operate core AI platform components for training, deploying, and serving ML models at scale.
Own model serving and inference workflows end-to-end, optimizing for reliability, latency, throughput, and cost.
Collaborate with product, infrastructure, and security teams to build scalable platform capabilities for AI-powered features.
Mozilla Corporation is the non-profit-backed technology company behind Firefox and Pocket, with over 225 million monthly users. A wholly-owned subsidiary of the Mozilla Foundation, the company is mission-driven and focused on privacy and open standards.
Build and operate the ML lifecycle platform, including tooling for experiment tracking, model registry, and versioned pipelines.
Own CI/CD and deployment for ML workloads, building automated pipelines from notebook to production.
Make models observable and reliable in production with monitoring for latency, drift, data quality, and cost signals.
dv01 provides a data analytics platform for the structured finance market, offering transparency into investment performance and risk for lenders and Wall Street investors. With over 400 clients and coverage of over 100 million loans, dv01 is a data-first company with a diverse and innovative culture.
Collaborate with data scientists and engineers to build scalable ML pipelines, troubleshoot infrastructure issues from Linux to Kubernetes, and optimize model performance.
Drive high engineering standards, design on-premises MLOps solutions, and maintain tools for deployment and monitoring.
Refine CI/CD workflows, incorporate ML model training and evaluation into testing, and ensure seamless handover between research and production.
Learneo is a platform of builder-driven businesses, including Course Hero, CliffsNotes, LitCharts, Quillbot, Symbolab, and Scribbr, focused on supercharging productivity and learning. The company supports high-growth businesses with centralized corporate operations and has a virtual-first culture with employees across multiple countries.
Build and operate production-grade model serving infrastructure using vLLM, TGI, or Triton frameworks.
Design and implement auto-scaling, multi-model architectures, and intelligent request routing for ML inference.
Optimize GPU utilization, memory efficiency, and observability to ensure low-latency, cost-effective systems.
The company is a distributed cloud infrastructure startup building AI-native cloud services with GPU-powered compute. It is well-funded, fast-scaling, and remote-first, with a focus on sustainability and decentralization.
Build and operate the real-time inference service for the risk decision engine with low latency and high availability.
Own model deployment infrastructure including CI/CD, shadow mode, and staged rollouts.
Build model observability and partner with Risk Data Science for production operation.
Mercury is a fintech company that provides banking services for startups via partner banks. The company is committed to creating a safe environment and values diversity, with a growing team focused on innovation.
Design, build, and deploy AI/ML solutions from prototype to production for client business problems.
Apply generative AI and LLMs, and establish MLOps best practices, including CI/CD and model monitoring.
Serve as a trusted technical advisor, translating ambiguous problems into well-scoped solutions and presenting to stakeholders.
DevIQ builds modern cloud and data solutions for mid-market companies focused on energy reduction, healthcare, education, and smart cities. The company offers competitive benefits, a strong team culture, and opportunities to work on end-to-end solutions with multi-disciplinary teams.
Assess current pipelines and data architecture to produce a prioritized plan for change.
Design durable data and ML systems grounded in customer needs with documented tradeoffs.
Harden pipelines, upgrade data architecture, and raise standards for observability and reliability.
FutureFit AI's core mission is to help more people get to better jobs faster and cheaper, with a focus on those facing barriers to opportunity. Its team of 30-50 across the US and Canada fosters a high-trust, high-intensity culture with a will to win.
Design and build systems that improve the efficiency of ML training and inference workloads.
Develop tooling that helps ML engineers debug, profile, optimize, and monitor model performance.
Partner with ML researchers and product teams to identify bottlenecks and drive performance improvements.
Reddit is a community of communities built on shared interests, passion, and trust, hosting the most open and authentic conversations on the internet. With over 100,000 active communities and approximately 126 million daily active users, Reddit is one of the internet's largest sources of information.
Lead operational excellence, reliability, and support of enterprise AI and data platforms, ensuring stability, scalability, and observability.
Design and implement automation, monitoring, and operational tooling for AI/ML platforms including Palantir Foundry, AWS Bedrock, and SageMaker.
Serve as a senior escalation point for complex production issues, driving root cause analysis and improving platform stability.
CSAA Insurance Group, a AAA insurer, offers personal lines of property and casualty insurance to AAA members across 23 states and DC. Founded in 1914, they are one of the top personal lines insurers in the US with over 3,800 employees, known for a values-based culture and recognition in leadership development and community involvement.
Own reliability, latency, and performance for AI platform services and data infrastructure on AWS.
Design and maintain CI/CD pipelines, infrastructure-as-code, and observability frameworks across the stack.
Partner with AI and data engineers to ensure secure, cost-optimized, and scalable deployment of platform components.
HHAeXchange is the leading technology platform for home and community-based care, providing an end-to-end homecare solution for people who are aging or have disabilities. Founded in 2008, the company is passionate about transforming healthcare by connecting patients, providers, managed care organizations, and states.
Collaborate with data scientists and software engineers to build scalable data pipelines and ML deployment systems.
Troubleshoot issues across the ML infrastructure stack, from Linux and Docker to Kubernetes and model serving.
Drive high engineering standards through code reviews, testing, and CI/CD enhancements.
Quillbot helps students and professionals strengthen their writing with AI-powered tools. We serve over 56 million users globally and foster a collaborative, virtual-first culture.
Design and build scalable ML training, deployment, and inference pipelines using CI/CD and cloud infrastructure.
Implement MLOps for model versioning, monitoring, and automated retraining to detect drift and performance degradation.
Partner with Data Scientists and Product teams to productionise models and integrate ML into customer-facing products.
We develop solutions that make an impact for companies around the globe. Our culture embraces openness, acts with respect, shows grit & guts, and combines employment with enjoyment.
Develop and operate production-ready AI and machine learning systems for enterprise-scale products.
Build and optimize LLM-powered applications, RAG pipelines, and intelligent agents.
Implement software engineering best practices for AI development including CI/CD and testing.
Our partner is building enterprise-grade AI solutions that deliver measurable business impact. They offer a remote-friendly work environment with a collaborative engineering culture focused on innovation, quality, and continuous learning.
Lead applied ML initiatives for identity verification, focusing on computer vision models like face liveness detection and anti-spoofing.
Build, train, and optimize deep learning models and pipelines on AWS with strong reproducibility and monitoring.
Collaborate across teams to ensure ML solutions meet privacy, compliance, and reliability requirements.
Mitek is a global leader in digital and biometric identity authentication, fraud prevention, and mobile deposit solutions, serving over 7,500 organizations worldwide. The company is headquartered in San Diego with operations across multiple countries and emphasizes a Virtual 1st culture, valuing flexibility and inclusion.
Design and develop machine learning solutions ensuring accuracy, performance, security, and scalability.
Implement and maintain end-to-end AI/ML pipelines from data ingestion to deployment.
Collaborate across planning, design, and code review to raise overall code quality.
We shape the future of communications from remote-first environments. We deliver innovative solutions to hundreds of thousands of businesses and empower millions of developers worldwide, with a strong culture of connection and inclusion.
Own availability, latency, and throughput SLOs across a large fleet of generative media model APIs serving production traffic at scale.
Build monitoring, alerting, and observability to catch ML-specific failures, output quality degradation, and model regressions before customers do.
Harden model deployment workflows with canary releases, shadow testing, automated rollbacks, and validation gates to ship new model versions safely.
Fal is the generative media ecosystem powering the next generation of AI products, providing infrastructure, tools, and model access for developers and enterprises. As a unified platform for high-performance inference, orchestration, and observability, fal is becoming the ecosystem that ambitious teams build on, in a market projected to grow by hundreds of billions of dollars over the next decade.
Design and develop machine learning models for localization workflows, including machine translation and LLM finetuning.
Implement and optimize models using Python and TensorFlow, and deploy them via Docker and AWS services.
Evaluate and select ML techniques, perform statistical analysis, and maintain clear documentation.
Welo Global is a leader in multilingual AI, technology, and content solutions serving over 2,000 clients in 300 languages. The company combines globally scaled multilingual infrastructure with a network of over 500,000 linguists and domain experts, backed by seven ISO certifications.
Take ownership of the ML API serving NBA recommendations and harden it for low-latency production traffic.
Ship your first agent tool contract end-to-end: schema design, handler implementation, and unit tests.
Set up the eval foundation for agents with golden transcripts, rubric-based judges, and regression suites.
Clutch is a vertical SaaS company backed by Andreessen Horowitz that helps credit unions become fintech lenders, providing affordable lending solutions to over 130 million Americans. The team is small, ambitious, and shipping fast with a culture that values pragmatism and real autonomy.
Own and scale AI compute and deployment platforms including Kubernetes and GitOps pipelines.
Build inference infrastructure and observability stacks for LLM-powered workflows.
Drive security, compliance, and governance at the systems level in a regulated healthcare environment.
Hims & Hers is a leading health and wellness platform focused on making healthcare accessible and personal. As a publicly traded company on the NYSE (HIMS), it offers flexible/remote work and a culture centered on innovation and employee well-being.