Collaborate with application engineering teams on platform infrastructure.
Enhance observability and spearhead the adoption of SRE best practices.
Build and maintain reliable CI/CD pipelines, tooling, and infrastructure.
Rula strives to provide quality, evidence-based, compassionate mental healthcare and aims to create a world where mental health is no longer stigmatized. They are a remote-first company operating in most U.S. states, and are dedicated to having a culture of inclusion that supports their employees.
Define and evolve reliability standards for the SmarterDx platform.
Enhance observability systems (metrics, logs, traces, alerting) to provide actionable insights and reduce mean time to detect (MTTD) and resolve (MTTR).
Reduce operational toil through automation, self-healing systems, and improved deployment and rollback mechanisms.
SmarterDx, a Smarter Technologies company, builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.
Collaborate with engineering teams to design and implement scalable, secure systems.
Establish and manage service level objectives (SLOs) and service level agreements (SLAs).
Enhance incident response processes and post-mortem analysis for outages.
ClickHouse, recognized on the 2025 Forbes Cloud 100 list, is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.
Own SLI/SLO/SLA definitions for the Akuity SaaS platform and drive continuous improvement.
Participate in an on-call rotation and act as incident commander for high-severity production events.
Partner with engineering teams to build reliability into new features before they ship to production.
Akuity helps enterprises ship software faster and more reliably with modern GitOps best practices. The Akuity Platform enables teams to manage development and deployment across hundreds – if not thousands – of Kubernetes clusters from a single control plane.
Architect new and existing systems to enhance performance, reliability, and scalability.
Build, implement, and iterate on CI/CD pipelines.
Assist with the management, development, design, and deployment of microservice and containerized applications.
AbbVie's mission is to discover and deliver innovative medicines and solutions that solve serious health issues today and address the medical challenges of tomorrow. They strive to have a remarkable impact on people's lives across several key therapeutic areas.
Lead the Infrastructure Engineering team, taking full ownership of cloud infrastructure, Kubernetes platforms, DevOps tooling, and CI/CD pipelines.
Drive reliability, scalability, and security across the production environment while maintaining a sharp focus on developer velocity and business impact.
Mentor and guide engineers across SRE, DevOps, and Database Reliability functions, fostering a culture of operational excellence and pragmatic problem-solving.
Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs with an all-in-one B2B platform. They have raised $346 million, are expanding across key EU markets, and foster innovation, prioritizing research and solutions that benefit users, employees, partners, and the business.
Provide production support on a shift according to the team on-call roster.
Work on tickets raised by customers and by internal engineering and implementation teams while not on call for production support.
Continuously monitor the health and performance of our services, systems, and infrastructure.
Granicus is driven by the excitement of building, implementing, and maintaining technology that is transforming the Govtech industry by bringing governments and their constituents together. They have served 5,500 federal, state, and local government agencies and more than 300 million citizen subscribers.
Build and maintain CI/CD pipelines and GitOps workflows across a diverse set of engineering teams.
Own observability — monitoring, alerting, logging — and support development teams in instrumenting their services.
Optimise infrastructure for security, cost, performance and reliability.
1inch is a decentralized finance (DeFi) platform. We empower users to access the best rates and execute efficient and secure trades across multiple liquidity sources.
Monitor and maintain internal platforms to ensure they are secure, up-to-date, and running efficiently
Apply patches, upgrade packages, and coordinate platform version updates
Automate routine maintenance tasks and improve deployment pipelines
Empower's vision is based on the idea that transforming financial lives starts by giving our people the freedom to transform their own. They foster a flexible work environment and fluid career paths, encouraging internal mobility while recognizing the importance of purpose, well-being, and work-life balance.
Ensure the availability, reliability, performance, and security of our SaaS platform
Lead infrastructure automation efforts using Infrastructure as Code and Configuration Management tools
Define and monitor SLAs/SLOs/SLIs, and drive service quality improvements
Remote People builds the infrastructure to power borderless teams. Their technology enables businesses to hire anyone anywhere compliantly at the push of a button. They are committed to building a global, diverse team representing different and varied backgrounds, perspectives, and experiences.
Design and implement resilient, secure, and scalable cloud environments to support client platforms in production.
Drive production readiness and operations: monitoring and alerting, incident support, runbooks, capacity planning, reliability improvements, and release readiness.
Build and maintain CI/CD workflows and reconfigure/enhance an existing proprietary pipeline using Argo.
Kunai builds full-stack technology solutions for banks, credit and payment networks, infrastructure providers, and their customers. The company helps its clients modernize, capitalize on emerging trends, and evolve their business for the coming decades by remaining tech-agnostic and human-centered.
Design, implement, and operate cloud-native infrastructure for production workloads.
PointClickCare's mission is to help providers deliver exceptional care. They are a leading health tech company, founder-led and privately held, that empowers their employees to push boundaries, innovate, and shape the future of healthcare. With the largest long-term and post-acute care dataset and a Marketplace of 400+ integrated partners, their platform serves over 30,000 provider organizations.
Collaborate with product teams to implement cloud best practices.
Automate code changes, testing, and analysis using CI tools.
Jobgether is a platform that uses AI to match candidates with jobs. They ensure applications are reviewed quickly, objectively, and fairly against the role's core requirements.
Build and deploy computing services and infrastructure in customer environments.
Clarify and surface requirements from ambiguous use cases defined by cross-functional stakeholders.
Improve reliability and scalability by resolving edge cases, studying failure modes, and writing tests.
Planet designs, builds, and operates the largest constellation of imaging satellites in history. They deliver an unprecedented dataset of empirical information via a revolutionary cloud-based platform to authoritative figures in commercial, environmental, and humanitarian sectors. Planet takes a people-centric approach toward culture and community, and strives to iterate in a way that puts its team members first and prepares the company for growth.
Analyze, troubleshoot, and resolve operational challenges contributing to defined SLOs.
Manage site stability, performance, and reliability, and maintain uptime for production environments.
CentralReach provides autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. They are trusted by more than 200,000 users and are backed by Roper Technologies, Inc. (Nasdaq: ROP). Their culture is centered around impact, inclusion, and flexibility.
Implementing improvements to the reliability, fault tolerance, scalability, and performance of our infrastructure
Managing incidents using your technical know-how to involve the appropriate teams and automate away manual practices
Improving observability across our systems (metrics, logs, tracing) to reduce time to detection and resolution
Newton is changing how Canadians trade crypto, with the goal of making financial freedom achievable for everyone by giving their customers the tools and knowledge needed to navigate the crypto world. They are a remote team spread across Canada that values pushing boundaries and getting things done.
Support and operate Legion’s AWS-based cloud platform and Kubernetes (EKS) environments.
Build and maintain infrastructure-as-code using Terraform.
Improve CI/CD pipelines to increase deployment safety and velocity.
Legion Technologies delivers the industry’s most innovative workforce management platform. The AI-driven Legion WFM platform maximizes labor efficiency and employee engagement. They are a remote, mission-driven team that embraces a collaborative, fast-paced, and entrepreneurial culture.
Work closely with developers on prototyping and designing new features as part of the infrastructure.
Deploy, install, configure, and maintain sophisticated trading, finance, and related software.
Build & maintain CI/CD pipelines.
Devexperts works with respected financial institutions, delivering products and tailor-made solutions for retail and brokerage houses, exchanges, and buy-side firms. The company focuses on trading platforms and brokerage automation, complex software development projects, market data products, and IT consulting services.
Design infrastructure, networking, and software platform architecture.
Build and maintain automation of Continuous Integration and Continuous Deployment pipelines.
Troubleshoot infrastructure, internal applications, networking, and security issues.
Loadsmart is a technology company focused on the logistics and supply chain industry. They leverage data and technology to automate and optimize freight transportation, connecting shippers and carriers to streamline the shipping process. They are a mid-sized company passionate about transforming the future of freight.
Take technical ownership of our cloud infrastructure and DevOps practices.
Help us design resilient systems, scale infrastructure, mentor engineers, and collaborate across teams.
Own and improve CI/CD workflows, enabling fast and reliable deployments across teams.
Halcyon is the industry’s first dedicated, adaptive security platform that combines multiple proprietary advanced prevention engines along with AI models focused specifically on stopping ransomware. Formed in 2021 by a team of cyber industry veterans after battling ransomware for years, Halcyon is focused on solutions for mid-market and enterprise customers.