Architect and maintain scalable, reliable infrastructure: Design and optimize infrastructure for high availability, fault tolerance, and performance across distributed systems.
Lead incident management and root cause analysis: Own incident response processes, ensure swift resolution of issues, and drive post-incident improvements to prevent recurrences.
Service monitoring and automation: Build and maintain automated monitoring, alerting, and healing systems that improve system health, reduce manual intervention, and minimize downtime.
Design, implement, monitor, and maintain Sysdig's infrastructure at scale across multiple clouds and on-prem. Collaborate with development teams to improve system reliability, performance, and scalability. Participate in the on-call rotation, respond to incidents, conduct root cause analyses, and implement preventive measures.
Sysdig helps organizations secure innovation in the cloud with runtime insights, open innovation, and agentic AI, trusted by over 60% of the Fortune 500.
Design, implement, and evolve large-scale, cloud-native infrastructure supporting MariaDB's global SaaS platform. Lead reliability and scalability initiatives, driving automation and resilience through infrastructure-as-code and GitOps practices. Proactively identify and remediate systemic reliability issues, ensuring high service availability and performance across multi-cloud environments.
MariaDB is making a big impact on the world and is the backbone of applications used every day, including by 75% of Fortune 500 companies.
Lead maintenance and operations for production and development environments.
Architect and implement complex solutions spanning OS, virtualization, network, and cloud layers.
Lead automation initiatives for infrastructure provisioning and operational tasks.
NMI enables partners with choice in payments, challenging the one-size-fits-all approach. They power innovative tech for SMBs, entrepreneurs, and fintech startups, fostering a diverse and welcoming workplace with a dedicated Diversity, Equity & Inclusion action group.
Support the evolution of our platform by improving scalability, reliability, observability, and security. Proactively identify bottlenecks and unlock the autonomy of the entire engineering team. Maintain infrastructure & deployment pipelines and collaborate with engineering teams on architectural decisions and production-readiness practices.
Feegow joined the Docplanner Group, a health-tech company, in 2022 and is dedicated to developing innovative solutions for physicians and managers.
Design, scale, and operate resilient, cloud-native infrastructure in AWS with an emphasis on EKS, IAM, RBAC, and modern security-first practices.
Build and optimize CI/CD pipelines with GitHub Actions and GitHub Advanced Security, enabling velocity without compromising safety.
Own observability across the stack using Datadog (metrics, logging, alerting, and tracing).
DexCare optimizes time in healthcare, streamlining patient access, reducing waits, and enhancing overall experiences. They are committed to creating an inclusive workplace where diversity drives innovation and belonging strengthens collaboration, enabling everyone to thrive.
Lead and mentor multiple teams across SRE, cloud infrastructure, and platform engineering functions.
Drive multi-team initiatives to deliver scalable, secure, and cost-efficient infrastructure leveraging AWS-native and serverless technologies.
Drive adoption of FinOps practices and partner with finance and product teams on budgeting and forecasting.
Model N is the leader in revenue optimization and compliance for pharmaceutical, medtech, and high-tech innovators. Model N is trusted by over 150 of the world’s leading companies across more than 120 countries.
As a Platform Engineer, enhance and maintain foundational tools and systems, working hands-on with Kubernetes clusters and AWS infrastructure. Build and maintain services that abstract and orchestrate our infrastructure, designing and implementing backend services like APIs and controllers. Develop software for complex projects, and manage infrastructure migrations and security tooling.
Monzo is on a mission to make money work for everyone, waving goodbye to the complicated ways of traditional banking, offering personal and business bank accounts.
Shape the way Scalable runs microservices in a performant, secure, and cost-efficient way. Collaborate with cross-functional teams to understand scalability requirements. Develop and maintain internal tooling around Monitoring, Developer Portal, and Load Testing.
Scalable Capital is a leading digital investment and banking platform with a full banking licence, empowering people across Europe to shape their own finances.
Design, deploy, and maintain cloud infrastructure solutions, adhering to security guidelines. Monitor cloud infrastructure and applications, addressing performance bottlenecks and security vulnerabilities. Implement automation tools/IaC to streamline provisioning and deployment of cloud resources.
Moniepoint is an all-in-one financial services platform for emerging markets and the second-fastest growing company in Africa.
Oversee the reliability, scalability, performance, and security of key production services.
Collaborate with cross-functional teams to develop and maintain resilient infrastructure.
Provide expert mentorship and guidance on best practices to engineers throughout the organization.
Cision is a global leader in PR, marketing and social media management technology and intelligence, helping brands and organizations connect with customers and stakeholders to drive business results. The company has offices in 24 countries throughout the Americas, EMEA and APAC.
Lead and Mentor a High-Performing Team: Hire, develop, and retain top engineering talent.
Develop the Strategic Roadmap: Define and execute the strategy for security infrastructure, automation, and operations.
Oversee Secure and Resilient Infrastructure: Guide the architectural design and implementation of secure, scalable, and highly available infrastructure in our multi-cloud (predominantly AWS) environment.
Smartsheet helps people and teams achieve anything with seamless work management and smart, scalable solutions. They build tools that empower teams to automate the manual, uncover insights, and scale smarter; they welcome diverse perspectives and non-traditional paths.
Own challenging infrastructure problems end-to-end by understanding how engineers use the platform.
Design scalable, maintainable services and contribute to technical proposals.
Contribute to the roadmap, highlighting opportunities, validating approaches and helping keep our platform solutions current with cloud best practices.
Canva's intuitive suite of design products is powered by our large distributed infrastructure group, which sets large and ambitious goals.
Design and implement cloud-native infrastructure that powers core product capabilities at scale.
Build proprietary solutions (sync engines, observability pipelines, DNS management systems) that differentiate Files.com.
Engineer infrastructure for speed, resilience, and maintainability across high-volume, distributed workloads.
Files.com powers secure file transfer and automation for over 4,000 brands. They are a profitable, founder-led SaaS company with a flat, high-trust engineering organization, where engineers are empowered to take ownership of projects.
Lead capacity planning, autoscaling, and performance optimization across our application.
Define and enforce best practices for scalability, reliability, observability, and infrastructure resilience.
Conduct architectural reviews and propose improvements to enhance performance and cost efficiency.
Hypori Inc., a leading provider of SaaS cybersecurity solutions, is a disruptive technology company transforming secure mobility for government and commercial customers.
Lead and manage the Platform Engineering team, providing technical guidance and mentorship. Design, build, and evangelize Golden Paths and Service Scaffolding to reduce friction across the development lifecycle. Oversee the design, implementation, and maintenance of Shared DB Platforms, ensuring optimal performance, integrity, and security across the organization.
Founded in 2012, EasyPost is a YC unicorn whose mission is to make shipping simple for businesses from garage startups to the Fortune 500.
Lead the design, implementation, and continuous improvement of our cloud-native platform infrastructure.
Create and maintain tooling and automation that improves efficiency and developer experience.
Drive platform optimization initiatives focused on performance, cost efficiency, and reliability.
Intelerad's medical imaging solutions streamline the flow of information, simplifying complex processes, maximizing efficiencies, and shining a light on the unknown.
Run the production environment by monitoring availability and taking a holistic view of system health. Build software and systems to manage platform infrastructure and applications. Improve reliability, quality, and time-to-market of our suite of software solutions.
NICE software products are used by 25,000+ global businesses to deliver extraordinary customer experiences, fight financial crime and ensure public safety.
As an SRE, you will be responsible for ensuring the availability, performance, and cost effectiveness of these services. You will work with multiple feature development teams and the BAU/Support team to define and evolve our cloud and on-prem infrastructure and delivery pipelines, improving system observability. You will also proactively identify and mitigate reliability risks.
In 2019, while our founders were working as engineers solving complex cross-domain problems within government organisations, TwinStream was formed.