Work directly with customers to ensure successful Teleport deployments.
Meet regularly with customers, understand pain points blocking deployments and remove roadblocks.
Work with customers to articulate the problem they are trying to solve, gather requirements, and make the business case to the product and engineering teams to invest in resolving the issue.
Apply IaC experience to develop an infrastructure-as-code practice.
Automate software operations for reusability and consistency across private and public clouds.
Collaborate with development teams to design service architecture, documentation, playbooks, policies and operational procedures.
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. With 1200+ colleagues in 75+ countries and very few office-based roles, it is a pioneer of globally distributed collaboration. The company is founder-led, profitable, and growing.
Architect, operate, improve and secure the platform the Garner Health app runs on
Boost development velocity and productivity
Build systems to a high engineering standard and hold others to the same high standard
Garner has developed a revolutionary approach to evaluating doctor performance and a unique incentive model that's reshaping the healthcare economy to ensure everyone can afford high quality care. They have more than doubled their revenue annually over the last 5 years. Garner's award winning culture is designed to cultivate teamwork, trust, autonomy, exceptional results, and individual growth.
Ensure the smooth operation and high availability of Clarifai's core services
Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
Design and implement scalable, secure, and cost-effective infrastructure solutions
Clarifai is a leading AI platform specializing in computer vision and generative AI, empowering organizations to transform unstructured data into actionable insights. Founded in 2013, they have a globally distributed team and $100M in funding, and are committed to building a diverse and inclusive team.
Design and implement highly scalable infrastructure for GitLab.com to support current and future growth.
Collaborate with cross-functional teams across the Infrastructure organization to plan and deliver projects that shape GitLab’s platform direction.
Operate and improve edge services and Kubernetes workloads, acting as a subject matter expert within the infrastructure department.
GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. They aim to enable everyone to contribute to and co-create the software that powers our world.
Leverage infrastructure as code (Terraform) to build and maintain complex production and analytics workflows including networking and containerized services.
Rapidly diagnose and resolve faults in system services as part of a 24/7 on-call rotation focused on actionable alerting and eliminating toil.
Improve speed of delivery by developing and maintaining CI/CD pipelines.
Linus Health is a Boston-based digital health company transforming brain health worldwide. They combine cutting-edge neuroscience, clinical expertise, and AI to advance early detection and intervention for cognitive and brain disorders, empowering people to live longer, healthier lives. With 100+ team members and growing, they’re entering a phase of accelerated growth and looking for top talent to help shape their future.
Design, build, and maintain highly available, scalable infrastructure.
Manage and optimize infrastructure across GCP, AWS, Azure, and other cloud providers.
Develop comprehensive monitoring, logging, and alerting systems.
Bobsled is seeking a Site Reliability Engineer to enhance its data-sharing platform's reliability and scalability. The company values growth, offering flexible work hours in a fully remote environment and fully sponsored individual coaching for all employees.
Design, implement, and manage cloud infrastructure using Infrastructure as Code (IaC) tools.
Design, build, and maintain scalable CI/CD pipelines using tools like CircleCI or GitHub Actions.
Implement and maintain observability tooling (Prometheus, Grafana, Datadog), and lead incident response to ensure system reliability.
Engine is transforming business travel into something personalized, rewarding, and simple. More than 20,000 companies already rely on Engine to support over 1 million travelers and billions in annual bookings.
Contribute to high impact AWS cloud infrastructure initiatives.
Participate in operability and production readiness reviews.
Advocate and implement Site Reliability Engineering practices.
Patreon is a media and community platform where creators give fans access to exclusive work. They have generated over $10 billion for creators and have 25 million+ paid memberships, with a hybrid work model and offices in New York and San Francisco.
Work with cutting edge infrastructure tools like Docker, Kubernetes, Terraform, Helm, and Istio
Accelerate development across the company with faster, safer, and more frequent deploys
Meaningfully improve developer happiness and productivity across the company with better development tools and workflows
Super.com helps people save more, earn more, and get more out of life. For employees, it is an opportunity to grow, make an impact, and unlock their full potential; the company invests in learning, celebrates bold ideas, and creates pathways for career growth.
Extend and automate the existing container orchestration platform, ensuring its scalability, reliability, and performance
Work closely with SREs from different teams to reduce their cognitive load related to the orchestration platform
Implement and maintain security best practices for the orchestration platform, ensuring the security and availability of our systems
Kraken is a mission-focused company rooted in crypto values. They aim to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, Kraken has employees in 70+ countries who speak over 50 languages.
Build enterprise-scale infrastructure, leveraging infrastructure-as-code to manage complex cloud environments.
Sustain platform health and performance, owning critical systems in production, including reliability and security.
Enable teams and customers to move faster, creating abstractions and tooling that deploy, run, and scale AI/ML workloads.
Cake is on a mission to make cutting-edge AI accessible to enterprise teams. Backed by top investors, Cake is seeing strong adoption and is positioned for rapid growth in the next 12 months, emphasizing ownership, clear communication, and collaboration.
Designing, building, and maintaining infrastructure that enables fast, reliable, and secure product delivery.
Improving and maintaining CI/CD pipelines to streamline deployments and increase reliability.
Contributing to infrastructure reliability and ensuring systems are designed for resilience and growth.
Incident.io is the leading AI incident response platform, built to help teams dramatically reduce incident response time and improve reliability. They have raised $100M from Index Ventures, Insight Partners, and Point Nine, alongside founders and executives from world-class technology companies.
Contribute to the core product, working across the stack on services that power their applications.
Design and refine technical systems, sharing ownership of customer use cases and the systems that power them.
Work directly with customers, providing pointers to documentation and debugging issues.
Humanitec is leading the transformation in how enterprises build and run their cloud-native setups, helping teams build Internal Developer Platforms (IDPs) that unlock true developer self-service. They value humility, drive, and intelligence in their fully remote team.
Respond to production incidents and contribute to post-incident analysis.
Identify and automate manual processes to improve efficiency and reduce risk.
Enhance monitoring tools and platforms to improve observability.
Restaurant365 is a SaaS company that provides a unique, centralized solution for accounting and back-office operations for restaurants. They focus on empowering team members to produce top-notch results while elevating their skills.
Architect and deploy secure, scalable infrastructure using Terraform, CloudFormation, or similar tools.
Ensure the platform meets strict SLA requirements for enterprise clients, minimizing downtime.
Implement comprehensive monitoring, logging, and alerting to provide deep visibility into system health.
Filevine provides cloud-based workflow tools for legal professionals, helping them manage organizations and serve clients. They are recognized as a fast-growing and innovative technology company with a team of passionate professionals.
Contribute to our core product, primarily in Go, on services that power our applications.
Design and refine technical systems, helping to shape them to remain scalable, reliable, and elegant.
Collaborate closely across disciplines to explore problems, prototype ideas, and iterate quickly.
Humanitec is reshaping how enterprises build and run their cloud-native setups and helps teams build Internal Developer Platforms (IDPs) that unlock true developer self-service. They are a fully remote company where small teams work closely.
Design, provision, and manage cloud infrastructure using Infrastructure as Code
Operate and support Kubernetes clusters in production environments
Build and maintain GitOps-based deployment and configuration workflows
BETSOL is a cloud-first digital transformation and data management company offering products and IT services to enterprises in over 40 countries. They are an employee-centric organization, offering comprehensive health insurance, competitive salaries, 401K, volunteer programs, and scholarship opportunities.
Own the operational stability and performance of Juul’s hybrid cloud infrastructure.
Lead automation efforts and architect for reliability.
Act as the final escalation point for critical incidents.
Juul Labs aims to transition the world’s billion adult smokers away from combustible cigarettes and eliminate their use, while also combating underage usage of their products. They are backed by leading technology investors and are committed to hiring great talent and building a diverse team.
Support and evolve the reliability of platforms used by the AI Research team.
Ensure production services meet expectations for availability, latency, and operational readiness.
Build and maintain Kubernetes-based services on GCP using infrastructure-as-code and GitOps.
Algolia is a pioneer and market leader in AI Search, empowering 17,000+ businesses to deliver blazing-fast, predictive search and browse experiences. They have raised $150 million in Series D funding, quadrupling their valuation to $2.25 billion, and are investing in their market-leading platform.
Automate infrastructure provisioning, configuration management, monitoring, and operational workflows using IaC and scripting languages.
Own the deployment, maintenance, and lifecycle management of systems supporting engineering, leveraging deep expertise in Kubernetes.
Troubleshoot complex infrastructure and application issues, driving root-cause analysis and developing long-term remediation solutions.
SingleStore delivers the cloud-native database with the speed and scale to power the world’s data-intensive applications. They are venture-backed and headquartered in San Francisco with offices in Sunnyvale, Raleigh, Seattle, Boston, London, Lisbon, Bangalore, Dublin and Kyiv.