Work directly with customers to ensure successful Teleport deployments.
Meet regularly with customers, understand pain points blocking deployments and remove roadblocks.
Work with customers to articulate the problem they are trying to solve, gather requirements, and make the business case to the product and engineering teams to invest in resolving the issue.
Teleport is the Infrastructure Identity Company, modernizing identity, access, and policy for infrastructure, improving engineering velocity and resiliency of critical infrastructure against human factors and/or compromise. They are a fast-growing, well-funded Y-Combinator company that values craft, strongly supports work/life balance, and embraces a culture of humility, honesty, and transparency.
Operate and maintain large-scale data systems, ensuring stability and performance.
Design, implement, and optimize deployment processes using virtualization.
Monitor system health, analyze failures, and identify instability sources.
Jobgether is a platform that uses AI-powered matching to connect candidates with companies. They ensure applications are reviewed quickly, objectively, and fairly, then share a shortlist of top candidates directly with the hiring company.
Own the reliability, scalability, and performance of Peec AI’s core systems and infrastructure
Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one
Peec AI is one of Europe’s fastest-growing Series A startups. They provide exciting and challenging work in the AI space.
Contribute to high impact AWS cloud infrastructure initiatives.
Participate in operability and production readiness reviews.
Advocate and implement Site Reliability Engineering practices.
Patreon is a media and community platform where creators give fans access to exclusive work. They have generated over $10 billion for creators and have 25 million+ paid memberships, with a hybrid work model and offices in New York and San Francisco.
Lead cross-team infrastructure security initiatives from design through delivery, owning technical outcomes and stakeholder communication
Design and implement security solutions for cloud infrastructure, container platforms, and orchestration systems
Partner with SRE, Infrastructure, and Engineering teams to integrate security into platform services and deployment pipelines
GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Their mission is to enable everyone to contribute to and co-create the software that powers our world.
Develop automation to eliminate manual and repetitive operational tasks.
Investigate and resolve customer complaints escalated beyond L1 and L2 support.
Moniepoint is an all-in-one financial services platform for emerging markets. Since 2019, Moniepoint’s technology has powered over 3 million people, offering personal and business banking, payment, credit and business management tools to help them succeed.
Automate the provisioning of all of Juniper Square’s infrastructure in code.
Partner with our Platform Engineering team on building developer tooling and improving developer experiences via joint initiatives and enhancements.
Partner with our Data Engineering team on improving our data posture and driving operational excellence.
Juniper Square's mission is to unlock the full potential of private markets by digitizing them to bring efficiency, transparency, and access. They are a values-driven organization with a hybrid workplace strategy, allowing employees to collaborate effectively across multiple countries and offering physical offices in several major cities.
Lead reliability-focused design and readiness reviews.
Build, operate, and continuously improve our observability stack.
Own and evolve incident management practices.
Transcend is building the privacy platform that easily embeds privacy into your entire tech stack. They are growing quickly, are backed by top-tier investors, and are proud to serve some of the world's most iconic brands.
Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
Implement metrics, logging, analytics, and alerting for performance and security across all endpoints and applications.
Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.
Own the operational stability and performance of Juul’s hybrid cloud infrastructure.
Lead automation efforts and architect for reliability.
Act as the final escalation point for critical incidents.
Juul Labs aims to transition the world’s billion adult smokers away from combustible cigarettes and eliminate their use, while also combating underage usage of their products. They are backed by leading technology investors and are committed to hiring great talent and building a diverse team.
Contribute to building and operating the infrastructure that supports the HackerOne platform.
Improve the reliability, security, and scalability of our systems.
Design and operate highly available cloud systems and apply best practices for reliability, observability, and security.
HackerOne is a global leader in Continuous Threat Exposure Management (CTEM). The HackerOne Platform unites agentic AI solutions with the ingenuity of the world’s largest community of security researchers to continuously discover, validate, prioritize, and remediate exposures across code, cloud, and AI systems. The platform is trusted by the world’s top organizations.
Design, implement, and manage cloud infrastructure using Infrastructure as Code (IaC) tools.
Design, build, and maintain scalable CI/CD pipelines using tools like CircleCI or GitHub Actions.
Implement and maintain observability tooling (Prometheus, Grafana, Datadog), and lead incident response to ensure system reliability.
Engine is transforming business travel into something personalized, rewarding, and simple. More than 20,000 companies already rely on Engine to support over 1 million travelers and billions in annual bookings.
Lead effective squad rituals and ensure production readiness.
Partner with engineers to ensure solutions are scalable, architecturally sound, flexible, and secure.
Provide timely, specific coaching and development opportunities for your direct reports.
Customer.io's platform allows over 8,000 companies to send messages using real-time behavioral data. Their team uses Go, React, Ember, and AI to ship fast and scale with confidence, and they value ownership, leadership, and healthy skepticism.
Respond to production incidents and contribute to post-incident analysis.
Identify and automate manual processes to improve efficiency and reduce risk.
Enhance monitoring tools and platforms to improve observability.
Restaurant365 is a SaaS company that provides a unique, centralized solution for accounting and back-office operations for restaurants. They focus on empowering team members to produce top-notch results while elevating their skills.
Monitor production systems, dashboards, logs, and alerts to ensure high availability and performance across distributed environments.
Assist in incident detection, triage, escalation, and resolution, following structured on-call rotations with mentorship support.
Maintain, follow, and continuously improve runbooks, operational procedures, and incident response workflows.
Jobgether is a platform that helps job seekers find the right opportunities. They use an AI-powered matching process to ensure applications are reviewed quickly and fairly.
Design, deploy and maintain a cloud infrastructure to support a Dataiku SaaS offering, mainly on AWS, Azure, and GCP
Continuously improve the infrastructure, deployment and configuration to deliver more reliable, resilient, scalable and secure services
Automate all technical operations as much as possible
Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. They connect many data science technologies and integrate the best of data and AI tech.
Own and operate core platform systems across AWS, GCP, Vercel, GitHub, and Cloudflare.
Improve reliability, scalability, and security of production and non-production environments.
Improve local development environments and onboarding experience for engineers.
Moxie empowers ambitious aesthetic entrepreneurs to build profitable, independent practices. A global, remote-first team of more than 140 people supports hundreds of practices nationwide as they unlock sustainable success for aesthetic entrepreneurs.
Extend and automate the existing container orchestration platform, ensuring its scalability, reliability, and performance
Work closely with SREs from different teams to reduce their cognitive load related to the orchestration platform
Implement and maintain security best practices for the orchestration platform, ensuring the security and availability of our systems
Kraken is a mission-focused company rooted in crypto values. They aim to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, Kraken has employees in 70+ countries who speak over 50 languages.
Design and evolve production environments, define standards and best practices.
Partner with engineering and IT teams to build scalable, reliable systems.
Lead incident response practices, and set guardrails around security, reliability, and cost management.
They are looking for a Senior Site Reliability Engineer who can own the architecture, governance, and cost efficiency of their cloud and platform infrastructure. This is a remote contractor role, and they are seeking candidates located in LATAM.
Contribute to the design, development, and implementation of platform components and services using infrastructure-as-code principles.
Identify opportunities for automation and develop solutions to streamline operational tasks, improve efficiency, and reduce manual intervention.
Participate in the provisioning, configuration, and management of AWS resources, ensuring adherence to best practices and security standards.
Mambu is a leading SaaS cloud banking platform, aiming to improve banking for a billion people. Mambu offers exciting career opportunities and helps shape the future of financial services; their culture is vibrant, and they value diversity.