Lead the investigation and resolution of complex infrastructure, networking, and platform-related incidents.
Provide technical leadership for Kubernetes platform operations and supporting infrastructure services.
Mentor and support AI Infrastructure & Platform Operations Engineers, sharing technical knowledge through documentation and training.
Mirantis helps organizations ship code faster on public and private clouds, providing a public cloud experience on any infrastructure from the data center to the edge. The company serves many of the world's leading enterprises, including Adobe, DocuSign, Liberty Mutual, and PayPal, and is a leader in container management.
Design, deploy, and manage production Kubernetes clusters with workload scheduling, resource quotas, network policies, and RBAC.
Build and optimize CI/CD pipelines using Infrastructure as Code and GitOps principles.
Implement observability solutions using Prometheus, Grafana, and OpenTelemetry for performance tuning and reliability.
VerTALENTS is a subsidiary of VerSprite Cybersecurity, specializing in technology staffing. The company connects top technical talent with industry clients through various methods, adding value to both clients and candidates for full-time and contracting opportunities.
Troubleshoot and resolve issues in customer environments based on Linux, OpenStack, Kubernetes, and networking technologies, owning escalations end-to-end.
Reproduce customer issues in labs, confirm bug reports, and collaborate with the development team to improve product stability.
Communicate with customers during incidents via email and remote sessions, guiding them through troubleshooting and resolution processes.
Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure, and sovereign infrastructure for modern AI and data-intensive applications. With deep expertise in open source and Kubernetes, Mirantis empowers platform engineering teams across enterprises worldwide.
Own Render's core network infrastructure across multiple data centers and cloud providers, shaping how networking evolves as Render rapidly scales.
Design and build customer-facing networking capabilities that give users greater flexibility in how their services connect and communicate, and how traffic is routed.
Investigate complex networking issues across the stack, from the kernel and data plane to distributed systems and edge networking.
Render is building a modern cloud platform for developers creating AI-native, full-stack, multi-service applications, eliminating the tradeoff between hyperscaler power and developer-friendliness. They are a diverse and talented team that values craft, velocity, and user experience.
Set delivery standards using Terraform, GitOps, and progressive rollout, while improving SLOs and alerting on Grafana Cloud.
Docker is a developer tooling company with over 20 million monthly users and 20 billion container image pulls. They are a globally distributed, remote-first team building tools that define how software gets built and delivered.
Build and maintain CI/CD pipelines and deployment infrastructure.
Leverage AI to automate analysis and resolution of production issues.
Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.
Act as the primary NVIDIA AI Enterprise and vector database expert for HyperPOD customer environments, owning end-to-end triage across GPU, NVAIE services, and storage.
Author and maintain support triage runbooks and diagnostics bundles, and collaborate on observability dashboards for platform health and RAG metrics.
Build hands-on labs, PoCs, and reusable technical assets to accelerate support readiness and partner success.
DataDirect Networks (DDN) is a global market leader in AI and high-performance data storage, powering many of the world's most demanding AI data centers across industries like life sciences, healthcare, financial services, and research. They are a global company with strong innovation, customer-centricity, and a team of passionate professionals committed to shaping the future of AI and data management.
Monitor, operate, and support production AI infrastructure platforms.
Investigate and resolve infrastructure, networking, hardware, and platform-related incidents.
Collaborate with engineering teams, hardware vendors, and datacenter personnel to resolve technical issues.
Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure infrastructure for AI and data-intensive applications. The company is growing and invests heavily in AI infrastructure and platform services.
Partner with Account Executives to uncover customer pain points and compliance requirements.
Deliver technical presentations on RapidFort’s DevTime Protection Tools and Runtime Protection capabilities.
Advise customers on vulnerability management strategies and compliance frameworks.
RapidFort's Software Supply Chain Security platform helps organizations reduce vulnerabilities, harden containerized workloads, and minimize attack surface without requiring code changes. They are a fast-paced startup where adaptability, ownership, and execution are critical.
Develop and maintain automated provisioning pipelines for bare-metal servers across global data centers.
Perform security monitoring, and repair and recover from hardware or software failures.
Act as technical lead, mentor engineers, and report directly to the CTO.
Kayzen is a mobile demand-side platform (DSP) that democratizes programmatic advertising. With 160B+ daily ad requests and 1B+ ads served per day globally, it powers top mobile marketing teams with a focus on performance, transparency, and control.
Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.
Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.
Build and implement tooling for efficient development and release.
Lead the design and development of release tooling.
Champion effective engineering and operational patterns.
Benchling is the AI platform for biotech R&D, used by scientists to design experiments, capture structured data, and run AI agents. Over 200,000 scientists trust Benchling to power their work, from academic labs to Sanofi and Moderna.
Design, deploy, and manage Kubernetes-based platforms in production.
Implement and manage automation frameworks for infrastructure provisioning and operations.
Administer and optimize VMware environments (vSphere, ESXi, vCenter).
EPlus believes technology is a people business and delivers solutions that make a real difference. Their team is passionate, skilled, and driven, valuing collaboration, innovation, and extraordinary results, and is dedicated to fostering, cultivating, and preserving a culture that represents diversity and enables inclusion.
Own the technical direction of Remote's SRE/Platform domain.
Define and drive the reliability strategy across the platform.
Identify and lead AI enablement initiatives across the engineering organisation.
Remote is solving modern organizations’ biggest challenge – navigating global employment compliantly with ease. With our core values at heart and a future-focused work culture, our team works tirelessly on ambitious problems, asynchronously, around the world.
Partner with strategic customers during onboarding to maximize value from our platform.
Design tailored Kubernetes deployment and integration solutions.
Write scripts and tooling to accelerate customer time-to-value.
Edera is dedicated to making secure computing simple. The products and innovations we release will change everything. We operate as a team and value diversity in all its forms, understanding that different perspectives drive our success.
Lead a distributed team, driving decision-making and ensuring smooth day-to-day operations.
Define the technical roadmap in alignment with organizational goals.
Cultivate a culture of collaboration, continuous learning, and engineering excellence.
Deel is an all-in-one payroll and HR platform for global teams. They combine HRIS, payroll, compliance, benefits, performance, and equipment management into one seamless platform and have a team of 7,000 spanning more than 100 countries.
Design and build large-scale distributed systems and high-throughput data pipelines using Go and cloud-native technologies.
Lead system-wide architectural decisions focusing on data flow, performance, and resilience.
Champion best engineering practices, code quality, testing, and maintainability while mentoring junior engineers.
DoiT is a global technology company that helps organizations leverage the cloud for business growth, combining data, technology, and human expertise. With thousands of customers worldwide, DoiT fosters a remote-first culture that values entrepreneurship, knowledge pursuit, and fun.
Contribute to the development of cutting-edge platforms across various cloud providers and data centers.
Collaborate within a cross-functional team to design and implement next-generation CI/CD platforms-as-a-service for internal product solutions.
Monitor system health, capacity, and performance indicators, driving optimization and proactive improvements.
NBCUniversal is a leading media and entertainment company creating and distributing world-class content across film, television, and streaming. They own brands such as NBC, Telemundo, and Peacock, and operate film and television studios including Universal Pictures and DreamWorks Animation, employing a talented workforce.
Lead and manage Technical Implementation Leads and Software Developers.
Provide hands-on technical leadership across discovery, solution design, estimation, implementation planning, troubleshooting, testing, deployment, and operational readiness activities.
Lead technical implementation delivery across multiple concurrent customer projects while ensuring delivery quality, scalability, and maintainability.
Smile Digital Health helps healthcare stakeholders collect and exchange data with its FHIR-based data liberation platform, which enables people and organizations to better manage healthcare data and liberate structured data. They were #19 on Deloitte's Technology Fast 50 Ranking for 2024!
Design, build, and operate core cloud infrastructure across compute, storage, databases, and networking layers.
Own and improve the reliability, scalability, and security of Valon’s production systems as we scale to support major enterprise deployments.
Evaluate, adopt, and operationalize new infrastructure technologies (e.g., Vitess, ClickHouse, Redis) to meet evolving product and scale requirements.
Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They're a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.