Source Job

20 jobs similar to (Senior) Cloud Site Reliability Engineer (Scalability) (m/f/x)

Jobs ranked by similarity.

India Unlimited PTO

Seeking an experienced Site Reliability Engineer to help build highly resilient and scalable systems by automating, measuring, and monitoring everything. Implement highly-available and scalable architectures for core and third-party components of Acquia Source. Implement metrics, monitoring, and incident response processes.

Acquia is an open source digital experience company providing technology to brands that allows them to embrace innovation and create customer moments that matter.

UK

Run the production environment by monitoring availability and taking a holistic view of system health. Build software and systems to manage platform infrastructure and applications. Improve reliability, quality, and time-to-market of our suite of software solutions.

NICE software products are used by 25,000+ global businesses to deliver extraordinary customer experiences, fight financial crime and ensure public safety.

$140,000–$190,000/yr
US Canada Unlimited PTO

  • Architect and maintain scalable, reliable infrastructure: Design and optimize infrastructure for high availability, fault tolerance, and performance across distributed systems.
  • Lead incident management and root cause analysis: Own incident response processes, ensure swift resolution of issues, and drive post-incident improvements to prevent recurrences.
  • Service monitoring and automation: Build and maintain automated monitoring, alerting, and healing systems that improve system health, reduce manual intervention, and minimize downtime.

VGS is the world's leader in payment tokenization, empowering clients and partners by tokenizing sensitive payment data and limiting compliance scope. They embed a universal token vault into their technology stack to manage the complexities of payment data tokenization across processors, networks, and more. Their culture values transparency, collaboration, grit, and humility.

$174,600–$220,000/yr
US

  • Lead capacity planning, autoscaling, and performance optimization across our application.
  • Define and enforce best practices for scalability, reliability, observability, and infrastructure resilience.
  • Conduct architectural reviews and propose improvements to enhance performance and cost efficiency.

Hypori Inc., a leading provider of SaaS cybersecurity solutions, is a disruptive technology company transforming secure mobility for government and commercial customers.

ANZ

  • Own challenging infrastructure problems end-to-end by understanding how engineers use the platform.
  • Design scalable, maintainable services and contribute to technical proposals.
  • Contribute to the roadmap, highlighting opportunities, validating approaches and helping keep our platform solutions current with cloud best practices.

Canva's intuitive suite of design products is powered by our large distributed infrastructure group, which sets large and ambitious goals.

Design, implement, monitor and maintain Sysdig's Infrastructure at scale on different clouds and on-prem. Collaborate with development teams to improve system reliability, performance, and scalability. Participate in on-call rotation, respond to incidents, conduct root cause analyses, and implement preventive measures.

Sysdig helps organizations secure innovation in the cloud with runtime insights, open innovation, and agentic AI, trusted by over 60% of the Fortune 500.

Contribute to AWS Architecture by designing and refining our AWS stack, ensuring scalability, reliability, and cost-effectiveness. Implement Infrastructure Automation using Terraform, CloudFormation, or similar tooling to automate deployments and enforce consistency across environments. Support Developer Experience by helping establish and maintain efficient CI/CD pipelines, streamlining deployments and enhancing overall developer productivity.

Oddin.gg is a leading B2B provider of esports betting solutions with a comprehensive ecosystem that helps their partners grow.

$150,100–$188,100/yr
US Canada 2w PTO 12w maternity 12w paternity

  • Create and test reliable cloud infrastructure services that support Webflow’s range of products.
  • Balance reliability, scalability, and cost efficiency concerns while refactoring and modernizing existing services.
  • Collaborate with product engineering teams to deliver new solutions for services and ways of working that might not exist yet.

Webflow is the leading visual development platform for building powerful websites without writing code.

Europe

As an SRE, you will be responsible for ensuring the availability, performance, and cost-effectiveness of these services. You will work with multiple feature development teams and the BAU/Support team to define and evolve our cloud and on-prem infrastructure and delivery pipelines, improving system observability and proactively identifying and mitigating reliability risks.

TwinStream was formed in 2019, when our founders were working as engineers solving complex cross-domain problems within government organisations.

Nigeria

Design, deploy, and maintain cloud infrastructure solutions, adhering to security guidelines. Monitor cloud infrastructure and applications, addressing performance bottlenecks and security vulnerabilities. Implement automation tools/IaC to streamline provisioning and deployment of cloud resources.

Moniepoint is an all-in-one financial services platform for emerging markets and the second-fastest growing company in Africa.

$120,000–$205,000/yr
US

  • Dive into client environments to explore application workloads, infrastructure dependencies, and security controls.
  • Aid in the design and implementation of migration strategies to reduce risks and unlock automation opportunities.
  • Develop scalable and secure infrastructure using Infrastructure as Code (IaC) tools.

Kunai builds full-stack technology solutions for banks, credit and payment networks, infrastructure providers, and their customers.

Brazil 26w maternity 4w paternity

Support the evolution of our platform by improving scalability, reliability, observability, and security. Proactively identify bottlenecks and unlock the autonomy of the entire engineering team. Maintain infrastructure & deployment pipelines and collaborate with engineering teams on architectural decisions and production-readiness practices.

Feegow joined the Docplanner Group, a health-tech company, in 2022 and is dedicated to developing innovative solutions for physicians and managers.

Canada 5w PTO

Design, implement, and evolve large-scale, cloud-native infrastructure supporting MariaDB's global SaaS platform. Lead reliability and scalability initiatives, driving automation and resilience through infrastructure-as-code and GitOps practices. Proactively identify and remediate systemic reliability issues, ensuring high service availability and performance across multi-cloud environments.

MariaDB is making a big impact on the world and is the backbone of applications used every day, including by 75% of Fortune 500 companies.

$198,900–$312,550/yr

  • Design, implement, and operate services to strengthen the enterprise readiness of our cloud.
  • Collaborate with other developers to write the best code for the project and deliver amazing results.
  • Work across the ~70 member development team building products and solutions for large enterprise customers.

Atlassian's software products help teams all over the planet and their solutions are designed for all types of work. They believe that the unique contributions of all Atlassians create their success and strive to unleash the potential of every team.

$110,000–$250,000/yr
US 4w PTO

  • Design and implement cloud-native infrastructure that powers core product capabilities at scale.
  • Build proprietary solutions (sync engines, observability pipelines, DNS management systems) that differentiate Files.com.
  • Engineer infrastructure for speed, resilience, and maintainability across high-volume, distributed workloads.

Files.com powers secure file transfer and automation for over 4,000 brands. They are a profitable, founder-led SaaS company with a flat, high-trust engineering organization, where engineers are empowered to take ownership of projects.

US

Lead and manage the Platform Engineering team, providing technical guidance and mentorship. Design, build, and evangelize Golden Paths and Service Scaffolding to reduce friction across the development lifecycle. Oversee the design, implementation, and maintenance of Shared DB Platforms, ensuring optimal performance, integrity, and security across the organization.

Founded in 2012, EasyPost is a YC unicorn whose mission is to make shipping simple for businesses from garage startups to the Fortune 500.

$120,032–$164,368/yr
UK

As a Platform Engineer, enhance and maintain foundational tools and systems, working hands-on with Kubernetes clusters and AWS infrastructure. Build and maintain services that abstract and orchestrate our infrastructure, designing and implementing backend services like APIs and controllers. Develop software for complex projects, and manage infrastructure migrations and security tooling.

Monzo is on a mission to make money work for everyone, waving goodbye to the complicated ways of traditional banking, offering personal and business bank accounts.

$95,000–$110,000/yr
US

  • Become a member of a highly collaborative engineering team offering a unique blend of Cloud Infrastructure Administration, Site Reliability Engineering, Security Operations, and Vulnerability Management.
  • Coordinate with client product teams, engineering team members, and other stakeholders to monitor and maintain a secure and resilient cloud-hosted infrastructure to established SLAs.
  • Innovate and implement using automated orchestration and configuration management techniques.

Coalfire is on a mission to make the world a safer place by solving our clients’ toughest cybersecurity challenges.

$115,000–$130,000/yr

Design, develop, and deliver high-quality software iteratively and incrementally. Take ownership of key components and services—from hands-on coding to deployment and monitoring. Mentor and support junior engineers through pairing, code reviews, and knowledge sharing.

Best Egg is a market-leading, tech-enabled financial platform helping people build financial confidence through a variety of installment lending solutions and financial health tools.

US

  • Designs, implements, and continuously improves observability strategies across services.
  • Focuses on understanding system behavior in production, identifying failure modes, performance bottlenecks, and reliability risks.
  • Evolves and maintains shared AWS CDK and CDK8s constructs, with emphasis on observability, autoscaling, and operational safeguards.

Truelogic is a leading provider of nearshore staff augmentation services. They have a team of 600+ highly skilled tech professionals based in Latin America, partnering with U.S. companies on impactful projects and valuing expertise and aspirations.