Design, scale, and operate resilient, cloud-native infrastructure in AWS with an emphasis on EKS, IAM, RBAC, and modern security-first practices.
Build and optimize CI/CD pipelines with GitHub Actions and GitHub Advanced Security, enabling velocity without compromising safety.
Own observability across the stack using Datadog (metrics, logging, alerting, and tracing).
DexCare optimizes time in healthcare, streamlining patient access, reducing waits, and enhancing overall experiences. They are committed to creating an inclusive workplace where diversity drives innovation and belonging strengthens collaboration, enabling everyone to thrive.
Designs, implements, and continuously improves observability strategies across services.
Focuses on understanding system behavior in production, identifying failure modes, performance bottlenecks, and reliability risks.
Evolves and maintains shared AWS CDK and CDK8s constructs, with emphasis on observability, autoscaling, and operational safeguards.
Truelogic is a leading provider of nearshore staff augmentation services. They have a team of 600+ highly skilled tech professionals based in Latin America, partnering with U.S. companies on impactful projects and valuing expertise and aspirations.
Oversee the reliability, scalability, performance, and security of key production services.
Collaborate with cross-functional teams to develop and maintain resilient infrastructure.
Provide expert mentorship and guidance on best practices to engineers throughout the organization.
Cision is a global leader in PR, marketing and social media management technology and intelligence, helping brands and organizations connect with customers and stakeholders to drive business results. The company has offices in 24 countries throughout the Americas, EMEA and APAC.
Implement and maintain observability tools and dashboards using, for example, AWS CloudWatch, Datadog, Sentry, and OpenTelemetry.
Assist with cloud cost visibility and optimization: analyze infrastructure usage patterns to identify waste, and implement aggressive tagging strategies.
Manage the tooling and processes for deploying applications to AWS EKS / Kubernetes / ECS / Serverless and facilitate modern deployment strategies.
True is a global platform of companies that optimizes value creation by placing executive talent, developing business leaders, creating diverse and inclusive networks, and using innovative technology to advance executive talent priorities. True was founded on the belief that doing good is the pathway to doing well, and that their growth and success are a by-product of their values: treating people right, listening to new ideas, and keeping culture at the heart of their business.
Architect, operate, improve, and secure the platform the Garner Health app runs on.
Boost development velocity and productivity.
Build systems to a high engineering standard and hold others to the same high standard.
Garner has developed a revolutionary approach to evaluating doctor performance and a unique incentive model that's reshaping the healthcare economy to ensure everyone can afford high-quality care. They have more than doubled their revenue annually over the last 5 years. Garner's award-winning culture is designed to cultivate teamwork, trust, autonomy, exceptional results, and individual growth.
Automate infrastructure provisioning, configuration management, monitoring, and operational workflows using IaC and scripting languages.
Own the deployment, maintenance, and lifecycle management of systems supporting engineering, leveraging deep expertise in Kubernetes.
Troubleshoot complex infrastructure and application issues, driving root-cause analysis and developing long-term remediation solutions.
SingleStore delivers the cloud-native database with the speed and scale to power the world’s data-intensive applications. They are venture-backed and headquartered in San Francisco with offices in Sunnyvale, Raleigh, Seattle, Boston, London, Lisbon, Bangalore, Dublin and Kyiv.
Architect and maintain scalable, reliable infrastructure: Design and optimize infrastructure for high availability, fault tolerance, and performance across distributed systems.
Lead incident management and root cause analysis: Own incident response processes, ensure swift resolution of issues, and drive post-incident improvements to prevent recurrences.
Service monitoring and automation: Build and maintain automated monitoring, alerting, and healing systems that improve system health, reduce manual intervention, and minimize downtime.
VGS is the world's leader in payment tokenization, empowering clients and partners by tokenizing sensitive payment data and limiting compliance scope. They embed a universal token vault into their technology stack to manage the complexities of payment data tokenization across processors, networks, and more. They appear to have a culture that values transparency, collaboration, grit, and humility.
Building world-class AI infrastructure to support a 100+ person research team.
Designing and scaling multi-cloud systems that support high-performance model training and inference.
Improving monitoring, alerting and system observability for AI workloads.
Canva is redefining how the world experiences design. They have campuses in Sydney and Melbourne, co-working spaces in Brisbane, Perth, Adelaide and Auckland, and trust their employees to choose the balance that empowers them and their team to achieve their goals.
Lead the design, implementation, and optimization of customer infrastructure and CI/CD pipelines.
Collaborate with cross-functional teams to ensure robust system performance, scalability, and security.
Mentor junior team members, drive automation initiatives, and contribute to strategic decisions regarding infrastructure and deployment processes.
Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure, and sovereign infrastructure for modern AI.
Play a crucial part in designing and scaling secure cloud infrastructure.
Lead the charge in intelligent automation systems and ensure robust deployment processes.
Collaborate with product, engineering, and leadership to drive company success.
Jobgether is a company that connects job seekers with employers. They utilize an AI-powered matching process to ensure applications are reviewed quickly and objectively.
Enabling customers' use of AWS to achieve their business objectives.
Automating cloud infrastructure with scripting and code.
Supporting developers in efficiently working within AWS.
Effectual is a professional services team that ensures customer-facing projects are delivered with exceptional customer satisfaction and technical excellence. Effectual DevOps Engineers are regarded as 'Brand Ambassadors' who stay current on leading practices to deliver high-quality solutions.
Architect and deploy secure, scalable infrastructure using Terraform, CloudFormation, or similar tools.
Ensure the platform meets strict SLA requirements for enterprise clients, minimizing downtime.
Implement comprehensive monitoring, logging, and alerting to provide deep visibility into system health.
Filevine provides cloud-based workflow tools for legal professionals, helping them manage organizations and serve clients. They are recognized as a fast-growing and innovative technology company with a team of passionate professionals.
Deploy and manage cloud infrastructure across all three clouds using Terraform IaC.
Architect, build, and maintain reliable CI/CD pipelines in GitHub Actions and ArgoCD.
Contribute to decisions around our departmental roadmap and project priorities.
Coalesce is the only data transformation and governance platform designed for the AI era, improving data professionals' lives since its founding in 2020.
Mirantis is looking for a talented Systems/DevOps engineer to join our product team and design, implement, deploy, and test cloud infrastructure products built on top of open-source components. Deploy, test, and evaluate Mirantis K8s-related products. Manage and maintain bare metal test environments, ensuring optimal performance, security, and availability.
Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure, and sovereign infrastructure.
Lead maintenance and operations for production and development environments.
Architect and implement complex solutions spanning OS, virtualization, network, and cloud layers.
Lead automation initiatives for infrastructure provisioning and operational tasks.
NMI gives partners choice in payments, challenging the one-size-fits-all approach. They power innovative tech for SMBs, entrepreneurs, and fintech startups, fostering a diverse and welcoming workplace with a dedicated Diversity, Equity & Inclusion action group.
Responsible for administration, support, troubleshooting and implementation of Azure DevOps.
Implement DevOps principles at an enterprise level and enable continuous integration and continuous delivery.
Streamline and optimize the application lifecycle, adding visibility to technical debt and increasing software delivery speed.
Lumicera Health Services is defining the “new norm” in specialty pharmacy to optimize patient well-being through our core principles of transparency and stewardship.
Run the production environment by monitoring availability and taking a holistic view of system health. Build software and systems to manage platform infrastructure and applications. Improve reliability, quality, and time-to-market of our suite of software solutions.
NICE software products are used by 25,000+ global businesses to deliver extraordinary customer experiences, fight financial crime and ensure public safety.