Source Job

North America

  • Architect the migration of the existing compiler flow into MLIR, defining dialects, passes, and lowering strategies.
  • Build conversion paths between MLIR and Mythic’s custom low-level IR to keep both flows operational during migration.
  • Define validation infrastructure within MLIR, including interpretation or execution paths for simulation and debugging.

C++ Python PyTorch

20 jobs similar to Compiler Engineer – MLIR / PyTorch Infrastructure

Jobs ranked by similarity.

$295,000–$350,000/yr
US Canada

  • Enhance compiler performance, investigating bottlenecks and collaborating with teams to improve build times.
  • Maintain and expand support for multiple operating systems, including Windows and Android.
  • Drive innovations in Swift interoperability with C/C++ and Java/Kotlin, enabling seamless integration on Windows and Android.

The Browser Company is building a better way to use the internet with a focus on browsers. They are a remote-first, distributed team of close to 100 people (and growing!).

$159,925–$222,230/yr
Canada

  • Build prototypes and POCs that showcase Tailscale for AI agents and tooling.
  • Work with reference customers to integrate Tailscale, both for internal adoption and for embedding into their products to enable secure customer connectivity.
  • Create reference architectures and share your work through documentation, open source, community engagement, and conference presentations.

Tailscale is building a new Internet by delivering software that makes it easy to securely interconnect people and their devices, no matter where they are. They are a fully distributed company, and teams of every size use Tailscale each day to protect their networks and share access to internal tools.

India

  • Design and ship agentic systems and multi-step LLM workflows using Claude, OpenAI, or equivalent, including tool use, memory, structured output extraction, and failure handling.
  • Build and maintain MCP integrations connecting internal tools, portco systems, and external data sources into reliable, observable pipelines.
  • Write production-grade Python for data pipelines, integration scripts, and scheduled jobs running via BullMQ-backed queues on the Node/TypeScript stack.

Emergence is a PE holdco backed by the Pritzker Organization focused on acquiring and scaling B2B SaaS businesses. It combines operational rigor with a growth equity mindset to drive ARR growth and profitability across its portfolio.

$150,000–$170,000/yr
US

  • Design, implement, and maintain reliable, scalable, and secure infrastructure, applications, and tooling, with a focus on ML/AI pipelines and workloads.
  • Write clean, maintainable code and perform peer code reviews.
  • Write clear and concise documentation and engage in cross-team communication and knowledge sharing.

Bright Machines is a next-generation, AI-enabled manufacturer focused on data center infrastructure assembly operations. The company utilizes AI-based robotics and software to assemble AI infrastructure hardware products for hyperscalers and leading OEMs, employing under 500 employees, with a culture rooted in innovation and expertise.

APAC

  • Partner directly with customer engineering teams running training and inference workloads in production.
  • Investigate failures involving distributed training, Kubernetes orchestration, GPU allocation, networking, and storage systems.
  • Identify recurring patterns across customer issues and drive long term reliability improvements.

Lightning AI is the company behind PyTorch Lightning, building an end-to-end platform for developing, training, and deploying AI systems. They serve solo researchers, startups, and large enterprises, operating globally with offices in New York City, San Francisco, Seattle, and London.

India

  • Design end-to-end AI integration architectures connecting LLM APIs, vector databases, and inference systems to existing backend infrastructure.
  • Build reusable ML infrastructure components like feature pipelines, model serving layers, and evaluation frameworks that multiple portfolio companies standardize on.
  • Establish AI system integration best practices and governance patterns that become repeatable playbooks across the holding company.

Emergence is a thematic holding company backed by the Pritzker Organization focused exclusively on acquiring and scaling category-defining software businesses. They invest in focused portfolios, specialized operating groups with deep domain expertise and proven playbooks.

Global

  • Build and maintain a unified C++17 library that runs seamlessly across iOS, Android, and low-power automotive embedded hardware.
  • Analyze and improve map-matching and dead-reckoning algorithms using real-world data from millions of vehicles.
  • Implement route-following features, including high-frequency route progress reporting, deviation detection, and timely instruction delivery.

Mapbox is the leading real-time location platform for a new generation of location-aware businesses. They equip organizations with the full set of tools to power the navigation of people, packages, and vehicles everywhere. They value high-performing creative individuals who dig into problems and opportunities, and they emphasize an environment of teaching and learning.

AI Engineer

Zinier
India

  • Design pragmatic solutions for real problems, assessing each use case and selecting the right approach.
  • Prototype rapidly and deliver iteratively, shipping functional prototypes within days and validating value with real users.
  • Build agentic AI systems where justified, designing and implementing multi-agent architectures and LLM-based tooling.

Zinier empowers frontline workers to achieve greater things. They are a remote-first, global team headquartered in Silicon Valley with a hybrid workforce across the United States, Canada, Europe, Latin America, Singapore, and Bangalore, India.

$194,000–$228,000/yr
US

  • Design, build, and ship LLM-powered features and agentic workflows for Gametime users.
  • Build and maintain evaluation frameworks and prompt testing pipelines for AI-powered experiences.
  • Contribute to orchestration layer, including agent routing, tool use, and multi-step workflow coordination.

Gametime helps people connect through shared live experiences. They operate platforms on iOS, Android, mobile web, and desktop, supporting over 60,000 events across the US and Canada, fostering a collaborative and inclusive environment where diverse perspectives are valued.

US

  • Owns the technical direction for large-scale machine learning models, guiding the development of advanced deep learning architectures and high-impact ML systems.
  • Partners with leadership to define ML roadmaps and drive innovation in scalable model design and training approaches.
  • Ensures efficient, reliable deployment of ML models in production and mentors the team to strengthen its technical capabilities.

Reddit is a community-driven platform where users submit, vote, and comment on topics of interest. With over 100,000 active communities and approximately 126 million daily active unique visitors, it is one of the internet’s largest sources of information.

$180,000–$250,000/yr
US

  • Build our core Python/Rust platform: request routing, AI workload orchestration, scheduling, GPU autoscaling, large-scale file storage, queueing, etc.
  • Produce forward designs for platform evolution as we scale to 100x current traffic and need to provide low latency across the world.
  • Leverage AI to an extreme level to automate the mundane parts of building complex but reliable systems.

Fal is building the infrastructure, tools, and model access to move from AI idea to production. They aim to be the unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products.

$160,000–$215,000/yr
US

  • Design, develop, and optimize algorithms for workflow automation, which include computer vision and computational geometry components.
  • Develop signal-processing and image-analysis algorithms using classical methods as well as modern AI/ML approaches, including neural networks.
  • Perform system-level analysis, simulation, and validation to ensure algorithm performance meets product requirements.

Cellanome is a well-funded start-up tackling some of the biggest challenges facing biology today. They have a world-class team of engineers, scientists, team builders, and problem solvers developing next-generation technologies.

Global

  • Build and maintain our host provisioning stack to bring new bare metal online quickly and confidently.
  • Evolve our homegrown orchestration engine to manage clusters, containers, and VMs.
  • Build out internal observability and alerting so we catch fleet problems before customers feel them.

Railway's core mission is to make software engineers higher leverage. They provide powerful tools so engineers can spend less time setting up and more time doing. The team is small, with high ownership, and they are passionate about being exceptional.

LATAM

  • Design and build AI systems in production that solve real business problems, end-to-end: from discovery to operation.
  • Work with product and operations to translate ambiguous problems into measurable and maintainable solutions.
  • Build data pipelines that feed models and product surfaces.

Skydropx is innovating logistics with a team of visionary people who want to grow and change the world. They are integrating AI into their logistics platform for LATAM, working at a multi-tenant scale with hundreds of thousands of shipments per month.

$102,067–$116,648/yr
Canada

  • Build the application-layer foundations for AI integration at Roofr.
  • Design and implement reliable, scalable integrations with third-party services and APIs.
  • Build data pipelines that feed Roofr's AI systems.

Roofr is obsessed with its customers and constantly gathers feedback to shape its products. Roofr has an amazing culture, strong financials, and best-in-class company metrics and offers its team significant growth and equity.

US

  • Shape technical direction and architecture: Define the foundational architecture for enterprise agentic AI at Benchling.
  • Build and ship the early portfolio yourself: Write production code at least half your time, particularly during the team's first year.
  • Design for enterprise from day one: Build for multi-tenant isolation, secrets management, audit logging, payload encryption, role-based access controls, and human-in-the-loop controls calibrated to risk.

Benchling is the AI platform for biotech R&D. Scientists use Benchling to design experiments, capture structured data, and run AI agents and models directly in their workflows. They have over 200,000 scientists around the world, from academic labs to Sanofi and Moderna.

$95,482–$116,700/yr
US

  • Debug, troubleshoot, and optimize existing applications.
  • Analyze complex technical problems and propose effective solutions.
  • Foster a culture of learning and continuous improvement within the team by leveraging a broad skill set.

EarnIn is a pioneer in earned wage access, building products that deliver real-time financial flexibility. The company is growing fast with experienced leadership and world-class funding partners.

  • Design and develop deliverables within the designated area.
  • Conduct code reviews and contribute to continuous quality improvement.
  • Analyze faults, perform debugging, and implement fixes.

Tieto Tech Consulting solves clients’ toughest technology challenges and delivers reliable outcomes. It appears to be a large tech company in the Nordics with a growing R&D team, fostering diversity, equity, and inclusion.

  • Benchmark FP8 quantization across GPU families and ship a production config that achieves a speedup.
  • Evaluate serving frameworks with speculative decoding to improve performance.
  • Build a fine-tuning pipeline to enable faster model training and deployment.

Fathom eliminates the needless overhead of meetings with an AI assistant that captures, summarizes, and organizes key moments. They are a small company of focused builders that creates magical experiences and values a supportive environment.

Global

  • Own and operate GPU and accelerator clusters for AI training, inference, and experimentation, ensuring reliability and cost-efficiency.
  • Build and optimize scheduling, orchestration, and serving systems using frameworks like vLLM and Triton to improve latency, throughput, and memory efficiency.
  • Partner with ML engineers to remove workflow bottlenecks and build observability for GPU utilization, capacity, and incident response.

Kraken is a crypto exchange platform building premium financial products for traders and institutions, accelerating global crypto adoption. It is a mission-driven, fully remote company with a world-class team of crypto experts spread across more than 70 countries.