Source Job

US

  • Design, create, and maintain software and systems to improve the availability, scalability, and efficiency of Thumbtack's services.
  • Set the architectural direction of infrastructure and platform services while supporting the engineering organization.
  • Troubleshoot and debug critical systems throughout the SDLC.

AWS Linux Python Go PHP JavaScript

20 jobs similar to Sr. Software Engineer, Site Reliability Engineer

Jobs ranked by similarity.

Canada

  • Design, create, and maintain software and systems to improve the availability, scalability, and efficiency of Thumbtack's services
  • Set the architectural direction of infrastructure and platform services while supporting the engineering organization
  • Design and implement tools and processes used for deployment, change, service, and infrastructure management

Thumbtack helps millions of people confidently care for their homes through personalized guidance, AI tools, and a hiring experience. They have a growing community of 300,000 local service businesses.

UK

Run the production environment by monitoring availability and taking a holistic view of system health. Build software and systems to manage platform infrastructure and applications. Improve reliability, quality, and time-to-market of our suite of software solutions.

NICE software products are used by 25,000+ global businesses to deliver extraordinary customer experiences, fight financial crime and ensure public safety.

Germany

Shape the way Scalable runs microservices in a performant, secure, and cost-efficient way. Collaborate with cross-functional teams to understand scalability requirements. Develop and maintain internal tooling around Monitoring, Developer Portal, and Load Testing.

Scalable Capital is a leading digital investment and banking platform with a full banking licence, empowering people across Europe to shape their own finances.

$219,000–$245,000/yr
US Unlimited PTO

  • Architect, operate, improve and secure the platform the Garner Health app runs on
  • Boost development velocity and productivity
  • Build systems to a high engineering standard and hold others to the same high standard

Garner has developed a revolutionary approach to evaluating doctor performance and a unique incentive model that's reshaping the healthcare economy to ensure everyone can afford high-quality care. They have more than doubled their revenue annually over the last 5 years. Garner's award-winning culture is designed to cultivate teamwork, trust, autonomy, exceptional results, and individual growth.

$140,000–$190,000/yr
US Canada Unlimited PTO

  • Architect and maintain scalable, reliable infrastructure: Design and optimize infrastructure for high availability, fault tolerance, and performance across distributed systems.
  • Lead incident management and root cause analysis: Own incident response processes, ensure swift resolution of issues, and drive post-incident improvements to prevent recurrences.
  • Service monitoring and automation: Build and maintain automated monitoring, alerting, and healing systems that improve system health, reduce manual intervention, and minimize downtime.

VGS is the world's leader in payment tokenization, empowering clients and partners by tokenizing sensitive payment data and limiting compliance scope. They embed a universal token vault into their technology stack to manage the complexities of payment data tokenization across processors, networks, and more. The job posting does not specify the company's size, but the culture appears to value transparency, collaboration, grit, and humility.

4w PTO

As a Senior Software Engineer, Enterprise Platform at Vanta, you will build and operate systems that power Vanta’s FedRAMP environments, including automated release, vulnerability remediation, and evidence generation pipelines that meet strict compliance timelines. You will also define and evolve Vanta’s production reliability framework, including SLOs, incident response patterns, observability standards, service catalog, metrics dashboards, and the Vanta SLA definition. You will identify and solve complex scalability and performance challenges, particularly related to service reliability and data throughput.

Vanta helps businesses earn and prove trust by empowering companies to practice better security and prove it with ease.

Americas EMEA Unlimited PTO

  • Design and implement highly scalable infrastructure for GitLab.com to support current and future growth.
  • Collaborate with cross-functional teams across the Infrastructure organization to plan and deliver projects that shape GitLab’s platform direction.
  • Operate and improve edge services and Kubernetes workloads, acting as a subject matter expert within the infrastructure department.

GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. They aim to enable everyone to contribute to and co-create the software that powers our world.

US

  • Design, implement, and continuously improve observability strategies across services.
  • Focus on understanding system behavior in production, identifying failure modes, performance bottlenecks, and reliability risks.
  • Evolve and maintain shared AWS CDK and CDK8s constructs, with emphasis on observability, autoscaling, and operational safeguards.

Truelogic is a leading provider of nearshore staff augmentation services. They have a team of 600+ highly skilled tech professionals based in Latin America, partnering with U.S. companies on impactful projects and valuing expertise and aspirations.

$95,000–$175,000/yr

  • Provide architecture plans for multiple cloud-based applications supporting stakeholders.
  • Analyze performance and ensure applications meet the scalability and reliability needs of internal teams.
  • Identify and troubleshoot performance bottlenecks and reliability issues across the stack.

Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster.

Australia

  • Contribute to solving engineering problems at scale and developing high-quality, scalable software.
  • Focus on architecture and implementation of distributed systems, and refactoring complex systems.
  • Make code design decisions and contribute to technical solution designs, exercising autonomy over your work.

Xero provides a cloud-based accounting software platform for small businesses. The company values strong collaboration and has a culture-focused team.

$120,032–$164,368/yr
UK

As a Platform Engineer, enhance and maintain foundational tools and systems, working hands-on with Kubernetes clusters and AWS infrastructure. Build and maintain services that abstract and orchestrate our infrastructure, designing and implementing backend services like APIs and controllers. Develop software for complex projects, and manage infrastructure migrations and security tooling.

Monzo is on a mission to make money work for everyone, waving goodbye to the complicated ways of traditional banking, offering personal and business bank accounts.

US

  • Architect and deploy secure, scalable infrastructure using Terraform, CloudFormation, or similar tools.
  • Ensure the platform meets strict SLA requirements for enterprise clients, minimizing downtime.
  • Implement comprehensive monitoring, logging, and alerting to provide deep visibility into system health.

Filevine provides cloud-based workflow tools for legal professionals, helping them manage organizations and serve clients. They are recognized as a fast-growing and innovative technology company with a team of passionate professionals.

$198,900–$312,550/yr

  • Design, implement, and operate services to strengthen the enterprise readiness of our cloud.
  • Collaborate with other developers to write the best code for the project and deliver amazing results.
  • Work across the ~70 member development team building products and solutions for large enterprise customers.

Atlassian's software products help teams all over the planet and their solutions are designed for all types of work. They believe that the unique contributions of all Atlassians create their success and strive to unleash the potential of every team.

$189,592–$220,000/yr

  • Responsible for custom architectural design, implementation, monitoring, and maintenance for production application environments.
  • Work with the Principal Software Engineer on technical architecture and design based on customer product requirements.
  • Hands-on commissioning, configuration, administration, documentation, and support for all on-prem & cloud (AWS) environments.

NBCUniversal is one of the world's leading media and entertainment companies creating world-class content across film, television, streaming, theme parks, and consumer experiences. They own leading entertainment and news brands and are a subsidiary of Comcast Corporation, committed to improving communities and fostering an inclusive culture.

US

Shape and scale critical infrastructure for one of the largest online platforms in the world. Build, maintain, and optimize multi-cloud compute systems for high-performance, reliable, and secure operations. Influence the technical direction of infrastructure platforms while mentoring and guiding other engineers.

This position is posted by Jobgether on behalf of a partner company.

US Canada Unlimited PTO

  • Lead the design and implementation of features for our Cloud Operational API.
  • Drive architectural discussions and set direction for the scalability and reliability of our services.
  • Take ownership across the lifecycle of services from design to operations.

Temporal simplifies code and makes applications more reliable. They are building a team to be the reliable foundation of every developer’s toolbox. They value curiosity, drive, collaboration, genuineness and humility and are looking for those who share those values.

$150,100–$188,100/yr
US Canada 2w PTO 12w maternity 12w paternity

  • Create and test reliable cloud infrastructure services that support Webflow’s range of products.
  • Balance reliability, scalability, and cost efficiency concerns while refactoring and modernizing existing services.
  • Collaborate with product engineering teams to deliver new solutions for services and ways of working that might not exist yet.

Webflow is the leading visual development platform for building powerful websites without writing code.

$49,293–$73,940/yr

  • Create architecture and non-functional design for Import/Export services that meet SLO/SLA targets
  • Decompose complex problems into resilient, cloud-native designs
  • Advise and mentor team members on code quality, technical debt management, and best practices

Relativity is solving big data challenges in the legal tech industry. They believe that employees are happiest when they're empowered to be their full, authentic selves.

Europe

Lead the Reliability & Operations function within the Developer & Production Enablement (DPE) division of RWS’s Product & Technology organization. Take ownership of global production operations and lead the transition from manual, ticket-based workflows to platform-integrated automation. Ensure stability today, while designing for scalability and autonomy in the future.

RWS's purpose is to unlock global understanding, valuing every language and culture, and celebrating diversity and inclusion to make the company strong.

$167,800–$246,700/yr
US

  • Build and own product capabilities for Stytch’s agentic identity platform on Twilio.
  • Design, implement, and maintain scalable, reliable distributed services, optimizing for security, latency, and developer experience.
  • Partner with Product and Engineering leadership to set direction, translate customer needs into technical plans, and deliver high-impact roadmap features.

Twilio is shaping the future of communications from home. They deliver innovative solutions to hundreds of thousands of businesses and empower millions of developers worldwide to craft personalized customer experiences.