Provide technical leadership for infrastructure, reliability, and observability.
Own the observability stack using Datadog and CloudWatch.
Design and evolve AWS infrastructure for reliability, security, scalability, and cost efficiency.
Topstep is an engaging working environment that ranges from fully remote to hybrid. They foster a culture of collaboration by keeping cameras on during meetings and maintaining a robust Slack environment for communication.
Lead the design, implementation, manage, support and operation of cloud-native infrastructure and container orchestration platforms.
Drive platform reliability, scalability, automation, and operational excellence across critical SaaS and cloud-based workloads.
Contribute to architectural decisions, mentoring engineers, and ensuring alignment with security, compliance, and operational standards.
Availity delivers revenue cycle and related business solutions for health care professionals who want to build healthy, thriving organizations. They are a global team with headquarters in Jacksonville, FL, and an office in Bangalore, India, united by a mission to bring the focus back to patient care.
Develop and maintain scalable automation and integrations across cloud platforms and services.
Design, implement, and operate CI/CD pipelines using Jenkins, Dagger, Terraform, and Docker.
Build, operate, and troubleshoot workloads on Kubernetes, using Kustomize and Helm.
People Inc. is America’s largest digital and print publisher. Our brands harness the best intent-driven content, the fastest sites, and the fewest ads to help nearly 200 million people every month make decisions.
Design, build, and operate reconciliation systems to track desired stack state, detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).
Architect and maintain infrastructure as code with Terraform.
Set up monitoring, alerting, and incident response.
We're a UK fintech building high-throughput digital infrastructure for the mortgage and property space. Recently acquired Trussle and we are taking our platform to the next level. The company values innovation and building high-quality products.
Own the technical direction of Remote's SRE/Platform domain.
Define and drive the reliability strategy across the platform.
Identify and lead AI enablement initiatives across the engineering organisation.
Remote is solving modern organizations’ biggest challenge – navigating global employment compliantly with ease. With our core values at heart and a future-focused work culture, our team works tirelessly on ambitious problems, asynchronously, around the world.
Drive the stability and reliability of Epic's GCP infrastructure.
Manage and harden our Docker and GKE container platform.
Maintain and improve CI/CD pipelines.
Epic is the leading digital reading platform for kids ages 12 and under, used by millions of children, families, and educators around the world. As Epic continues to grow, we are reimagining what reading can be through thoughtful technology, data, and global collaboration to make learning more engaging, accessible, and impactful.
Lead the design and evolution of Fieldguide's core platform services.
Build platform capabilities that compound the leverage of product and AI engineers.
Define the architecture for how new product capabilities get delivered across environments.
Fieldguide is automating and streamlining the work of assurance and audit practitioners. They are based in San Francisco, CA, remote-first and backed by Goldman Sachs Alternatives, Bessemer Venture Partners, 8VC, Floodgate, Y Combinator, and more.
Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.
Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.
Scale and mature Vesta’s infrastructure to support the entire mortgage market reliably, securely, and efficiently.
Build the foundational systems that power engineering velocity and platform reliability.
Focus on cloud architecture, deployment systems, observability, incident response, and internal developer tooling.
Vesta is building the next-generation system of record to power the multi-trillion mortgage market. They value humility, empathy, self-awareness, and an orientation towards action and have raised $45M from top tier investors.
Design, build, and maintain infrastructure using Infrastructure as Code tools such as Terraform.
Improve system reliability, scalability, resilience, and performance across the Mast platform.
Build systems and tooling that automate infrastructure management and operational workflows wherever possible.
Mast is on a mission to make complex lending simple by building modern, cloud-native lending technology purpose-built for specialist lenders. It is a high-performance team of engineers and lending experts that values radical honesty, transparency, and speed.
Design, build, and maintain scalable cloud infrastructure services in AWS and GCP.
Contribute production-quality Go and Python code to existing cloud services.
Develop and own automation and software deployment pipelines with maximum efficiency.
Dragos is dedicated to arming customers with best-in-class technology, threat intelligence, and services to protect their systems. They embody core values of authenticity, transparency, and trust and are a remote-first culture with operations in North America, Europe, the Middle East, and APAC.
Lead the design, implementation, and ongoing improvement of reliable, scalable, performant, and secure production platforms and services.
Work closely with cross-functional teams to build and maintain resilient infrastructure and deployment patterns.
Provide technical leadership and mentorship to engineers across the organisation, promoting strong engineering standards and operational best practice.
Cision empowers individuals to make an impact and values diverse perspectives. They foster curiosity, collaboration, and innovation while driving meaningful contributions to brands; they have offices in 24 countries throughout the Americas, EMEA and APAC.
Own Render's core network infrastructure across multiple data centers and cloud providers, shaping how networking evolves as Render rapidly scales.
Design and build customer-facing networking capabilities that give users greater flexibility in how their services connect and communicate, and how traffic is routed.
Investigate complex networking issues across the stack, from the kernel and data plane to distributed systems and edge networking.
Render is building a modern cloud platform for developers creating AI-native, full-stack, multi-service applications, eliminating the tradeoff between hyperscaler power and developer-friendliness. They are a diverse and talented team that values craft, velocity, and user experience.
Define and drive Webflow’s end-to-end deployment strategy, from CI pipelines to production rollout.
Own and evolve our GitOps-based deployment platform (e.g., ArgoCD, Argo Rollouts), ensuring scalability, reliability, and ease of use.
Set technical direction across teams, guiding architecture decisions that improve developer velocity and system reliability.
Webflow is building the world’s leading AI-native Digital Experience Platform as a remote-first company built on trust, transparency, and a whole lot of creativity. They empower teams to design, launch, and optimize for the web, and believe the future of the web, and work, is more open, more creative, and more equitable.
Design, build, and operate reconciliation systems to track desired stack state, detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Deploy and maintain infrastructure using Terraform on AWS.
Operate and govern production-grade platforms running on Kubernetes / EKS.
Build and maintain CI/CD pipelines using GitHub Actions.
Muttdata is a dynamic startup committed to crafting innovative systems using cutting-edge Big Data and Machine Learning technologies. They are looking for a hands-on DevOps to join a strategic initiative focused on deploying and operating Data & AI platforms.
Designing and managing cloud-based infrastructure on AWS.
Creating and maintaining deployment architectures and continuous delivery pipelines.
Automating infrastructure provisioning and management using Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
Nearform is an independent team of data & AI experts, engineers, and designers who build intelligent digital solutions and capability at pace. Our team of 500 experts in 20+ countries is trusted by leading enterprises.
Lead software engineering teams providing infrastructure-as-code to manage cloud infrastructure.
Hire experienced site reliability staff, and a line manager to grow and oversee the SRE team.
Establish design-before-build discipline; facilitate lightweight design documents, architectural decision records, and working group reviews.
Horizon3.ai is a cybersecurity company dedicated to enabling organizations to proactively find, fix, and verify exploitable attack vectors. They are a fast-growing company with a culture of respect, collaboration, ownership, and results.
Lead the architecture of a high-scale AWS environment optimized for AI workloads.
Manage and mentor a high-performing team of 8 engineers, providing technical leadership and career coaching.
Conduct user research with internal Natera developers to identify friction points.
Natera is a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health. The Natera team consists of statisticians, geneticists, doctors, laboratory scientists, business professionals, software engineers, and many other professionals from world-class institutions.