Role Responsibilities:

  • Design and operate highly scalable, fault-tolerant systems supporting production workloads across a distributed cloud environment.
  • Define and implement Service Level Objectives and error budgets to guide reliability decisions.
  • Automate operational processes to reduce manual toil and improve system consistency and resilience.

Team Collaboration:

  • Work closely with product and platform engineering teams to define and implement reliability standards.
  • Participate in incident response, on-call practices, and post-incident reviews, focusing on root cause analysis.
  • Advocate for a reliability-focused engineering culture, including blameless postmortems and operational excellence.

Qualifications and Experience:

  • 5+ years of experience in site reliability engineering, infrastructure, or related software engineering disciplines.
  • Strong experience operating and scaling distributed systems in cloud environments, with AWS preferred.
  • Proficiency with Infrastructure as Code tooling, such as Terraform, and a deep understanding of system performance and reliability patterns.

Company Culture:

  • Operates as a values-based company guided by six principles: Fearless, Fast, Lovable, Owners, Win-win, and Inclusive.
  • Offers a remote-first environment that enables you to do your best work from anywhere, backed by top-tier investors.
  • Focuses on building an inclusive and supportive team dedicated to creating the future of business trust and audit software.

Fieldguide

Fieldguide is establishing a new state of trust for global commerce and capital markets by automating and streamlining the work of assurance and audit practitioners, specifically in cybersecurity, privacy, and financial audit. It is a remote-first, values-driven company backed by top investors, building an inclusive and supportive team to create the future of audit and advisory software.
