Design, build, and maintain infrastructure using Infrastructure as Code tools such as Terraform.
Improve system reliability, scalability, resilience, and performance across the Mast platform.
Build systems and tooling that automate infrastructure management and operational workflows wherever possible.
Mast is on a mission to make complex lending simple by building modern, cloud-native lending technology purpose-built for specialist lenders. It is a high-performance team of engineers and lending experts that values radical honesty, transparency, and speed.
Design and operate our Kubernetes ecosystem with a focus on high availability and zero-downtime operations.
Own and evolve our PaaS strategy, using GitOps and CI/CD to empower domain teams to deploy independently.
Define and implement our observability strategy across metrics, logs, and tracing.
Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs. They offer an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, with about 346 million in funding.
Define and evolve the architecture and roadmap for enterprise‑scale Data and AI platforms.
Design and build multi‑tenant, multi‑region, highly available AI platforms with governance.
Lead capacity planning and cost optimization strategies for GPU and CPU workloads.
NEORIS accelerates growth in Ibero‑America, combining global engineering with regional expertise. With over 60,000 professionals across 55+ countries, they offer technical specialization career paths and value responsibility, collaboration, creativity, and commitment.
Provide technical leadership for infrastructure, reliability, and observability.
Own the observability stack using Datadog and CloudWatch.
Design and evolve AWS infrastructure for reliability, security, scalability, and cost efficiency.
Topstep offers an engaging working environment, with arrangements ranging from fully remote to hybrid. They foster a culture of collaboration by keeping cameras on during meetings and maintaining a robust Slack environment for communication.
Design, build, and deploy production systems with a focus on scalability, reliability, observability, and performance.
Develop and maintain comprehensive automation solutions to eliminate toil and streamline operational efficiency.
Proactively monitor production systems and implement automated incident response mechanisms to minimise downtime.
Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. The company is well-established and profitable with over $8 billion in revenue and values diversity and inclusivity.
Own the technical direction of Remote's SRE/Platform domain.
Define and drive the reliability strategy across the platform.
Identify and lead AI enablement initiatives across the engineering organisation.
Remote is solving modern organizations’ biggest challenge – navigating global employment compliantly with ease. With our core values at heart and a future-focused work culture, our team works tirelessly on ambitious problems, asynchronously, around the world.
Help build the platform that lets people across Greenhouse build, deploy, and run their own agents and automations against Greenhouse's data and tools.
Stand up whatever infrastructure EA's application engineers need to ship reliably, including services, runtime, deployment, scheduling, and observability.
Provide services related to EA's AWS footprint end-to-end, including networking, IAM, secrets, security posture, and deployment automation.
Greenhouse's mission is to make hiring work for everyone; they hire great people because they believe people are the foundation of the company's success. The company collaborates purposefully, fosters inclusivity, and communicates with transparency and accountability.
Drive the vision, strategy, and technical execution for the company's core operational and client-facing platforms, AMP and SERA.
Lead a unified, international engineering organization to build AI-assisted automation and client visualization dashboards that translate complex backend operations.
Act as a hands-on player-coach to integrate cross-border teams and foster a culture of high accountability and continuous improvement.
Atmosera provides modern cloud solutions leveraging Applications, Data & AI, DevOps, Security, and the Microsoft Azure platform. It is a Microsoft Partner with specialized teams focused on accelerating innovation, enhancing security, and optimizing operational agility for clients.
Lead software engineering teams providing infrastructure-as-code to manage cloud infrastructure.
Hire experienced site reliability staff and a line manager to grow and oversee the SRE team.
Establish design-before-build discipline; facilitate lightweight design documents, architectural decision records, and working group reviews.
Horizon3.ai is a cybersecurity company dedicated to enabling organizations to proactively find, fix, and verify exploitable attack vectors. They are a fast-growing company with a culture of respect, collaboration, ownership, and results.
Partner with Automation/AI leads on agent infrastructure and AI-assisted internal tools.
Build and maintain internal automations and integrations across our stack.
Document and enable the rest of the company to self-serve through better systems.
Customer.io's platform empowers over 8,000 companies to send billions of emails, push notifications, in-app messages, and SMS daily. We help teams send smarter, more relevant messages using real-time behavioral data, automating communication that people actually want to receive.
Own and evolve CI/CD pipelines using GitHub Actions and OIDC-based authentication for microservices and agentic workloads.
Automate infrastructure provisioning using Infrastructure as Code tools such as Terraform and CloudFormation.
Operate and scale our Kubernetes platform, including autoscaling, ingress, and multi-tenant isolation for enterprise customers.
Zingtree is a next-generation intelligent process automation platform reimagining customer experience operations for enterprise support leaders. It is a small team with high ownership, emphasizing automation, collaboration, and transparency.
Own and evolve Quansight's cloud infrastructure across AWS, Azure, and GCP.
Build, deploy, and maintain internal dashboards and reporting for operations and project management.
Lead infrastructure engagements for clients from scoping and architecture through delivery, upskilling client teams.
Quansight is rooted in the Python and PyData ecosystems. They provide services ranging from open-source software development to training and consulting, believing in a culture of do-ers, learners, and collaborators.
Own the delivery of developer platform capabilities end-to-end, including design, implementation, rollout, and iteration.
Build and evolve paved roads that make it easy to deploy, operate, and scale services.
Drive improvements to GitOps workflows and harden CI/CD to improve pipeline performance and developer ergonomics.
Phaidra is building the future of industrial automation with AI-powered control systems. They are a 100% remote company with employees located throughout the USA, Canada, UK, Sweden, Spain, Portugal, the Netherlands, Singapore, Australia, and India.
Build and maintain CI/CD pipelines and deployment infrastructure.
Leverage AI to automate analysis and resolution of production issues.
Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.
Design, build, and operate reconciliation systems that track desired stack state and detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).
Own the end-to-end infrastructure product vision, including installers, deployment tooling, reference architectures, and operational patterns.
Define and evolve a cohesive infrastructure roadmap aligned with Platform architecture, customer needs, and GTM strategy.
Partner closely with Product Leadership to balance near-term customer needs with long-term platform scalability and repeatability.
Mechanical Orchard is reinventing how the world’s most critical software gets modernized, focusing on system behavior to turn modernization into a repeatable process. They are an applied AI company challenging industry assumptions and prioritizing quality, rigor, and progress.
Build, lead, and grow the platform team, setting the pace and creating an environment where strong engineers want to stay.
Remain hands-on by writing code, reviewing architecture decisions, and debugging production issues while owning the platform's technical direction.
Steer projects through ambiguity, solving technical problems, resourcing gaps, and prioritization calls to ensure the infrastructure scales effectively.
OpenRouter is the leading AI routing and infrastructure layer that enterprises use to access, manage, and optimize the best large language models across providers. It's a fast-scaling technology company powering advanced AI teams by providing flexibility, scalability, and future-proof infrastructure.
Own the performance, stability, and uptime of our global Azure infrastructure.
Lead and develop a high-performing operations team.
Drive automation and reduce manual operational overhead.
CluedIn is reshaping data management with an Azure-native, graph-based Modern Master Data Management (MDM) platform. They are trusted by global industry leaders and backed by five-star reviews on Gartner Peer Insights.
Defining and driving the vision and strategy for Infrastructure Observability.
Identifying gaps in the end-to-end experience, then defining and owning the roadmap to fill those gaps.
Working closely across teams and orgs, collaborating with Engineering, UX, Design, and other teams to deliver on your roadmap.
Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale — unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter.
Manage and grow a distributed team of engineers, providing feedback and supporting career development.
Partner with product management to shape the Usage squad's roadmap, ensuring alignment with company mission and customer impact.
Guide the team through the full project lifecycle, ensuring high-quality and timely outcomes within the Usage domain.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users globally. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.