Model Performance Engineer

Fathom

Benefits

Inference Performance:

  • Make our models faster and cheaper with techniques like speculative decoding and quantization.
  • Optimize serving configurations, GPU selection, and batching strategies.
  • Address spiky traffic patterns from meeting schedules.

Fine-tuning Pipelines:

  • Build repeatable infrastructure for AI Engineers to deploy models faster.
  • Create pipelines to train adapters for domain-specific behavior.
  • Distill large teacher models for classification tasks.

How You’ll Help Us Win:

  • Discover performance bottlenecks across GPU families.
  • Evaluate serving frameworks with speculative decoding.
  • Optimize GPU spend based on workload characteristics.

Fathom

Fathom eliminates the needless overhead of meetings with an AI assistant that captures, summarizes, and organizes key moments. They are a small company that creates magical experiences through focused builders and values a supportive environment.

Apply for This Position