Source Job

Global

  • Make high-quality, data-driven decisions on building the next generation of our production platform and deliver results.
  • Own how we test, build, and deploy code in a high-scale PaaS environment, collaborating across the company on design and technology choices.
  • Blaze a trail as part of a small platform engineering team, driving reliability practices and directly influencing what we work on and how we work.

Python Golang Kubernetes CI/CD Observability

20 jobs similar to Senior Software Engineer

Jobs ranked by similarity.

United States

  • Design and build core platform infrastructure for large-scale cloud-native data and analytics systems.
  • Own and improve CI/CD pipelines, testing frameworks, and deployment in a high-scale PaaS environment.
  • Contribute to reliability engineering, observability, and operational excellence across distributed systems.

Jobgether uses an AI-powered matching process to connect candidates with roles. The company is a growing platform focused on efficient job matching and data privacy compliance.

Canada Unlimited PTO

  • Design, build, and operate distributed systems powering observability across ClickHouse Cloud.
  • Own reliability, performance, and cost-efficiency of the telemetry pipeline and storage systems.
  • Take part in on-call rotation and drive root-cause resolution and long-term fixes.

ClickHouse is a real-time analytics and data warehousing company recognized on the 2025 Forbes Cloud 100 list. With over 3,000 customers and rapid growth, the company fosters an innovative and fast-paced culture.

US Unlimited PTO

  • Design and build cloud-native infrastructure for reliability, observability, and automation across GCP, GKE, and Cloud Run.
  • Own incident response, root cause analysis, escalation workflows, and runbooks to prevent hard problems from recurring.
  • Develop Infrastructure as Code, CI/CD pipelines, and operational tooling to improve developer velocity and platform efficiency.

CertifyOS is building the data infrastructure that powers modern healthcare, automating provider licensing, enrollment, credentialing, and network monitoring through an API-first platform. The company is backed by leading investors with a team of deep experience in provider data systems, valuing authenticity, accountability, collaboration, results, and openness to feedback.

US Unlimited PTO

  • Provide frontline technical expertise to help developers deploy and scale Temporal in cloud-native environments.
  • Troubleshoot complex infrastructure issues, optimize performance, and develop automation solutions.
  • Collaborate with engineering and product teams to influence platform improvements and enhance developer experience.

Temporal provides an open source programming model that simplifies code and makes applications more reliable. The company is a growing team driven by values of curiosity, collaboration, and humility, focused on improving developer experience.

Canada

  • Build and maintain infrastructure platforms for over 200 backend services running on Kubernetes clusters with 40,000+ cores.
  • Lead and mentor other engineers, own complex infrastructure failures, and participate in a shared on-call rotation.
  • Drive cloud cost efficiency, estimate schedules, and use AI tools as first-class collaborators in daily workflows.

Life360's mission is to keep people close to the ones they love through location sharing, safe driver reports, and crash detection. The company serves approximately 97.8 million monthly active users across more than 180 countries and has more than 500 remote-first employees.

Europe

  • Design and operate our Kubernetes ecosystem with a focus on high availability and zero-downtime operations.
  • Own and evolve our PaaS strategy, using GitOps and CI/CD to empower domain teams to deploy independently.
  • Define and implement our observability strategy across metrics, logs, and tracing.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs. They offer an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, with about 346 million in funding.

$130,000–$142,000/yr
US UK

  • Contribute designs, code, tests, code reviews, and excellent judgment towards the development and continuous improvement of our digital platforms.
  • Participate in agile ceremonies and evolving development practices of the team.
  • Provide stewardship of the long-term sustainability of our platform and actively manage platform health and technical debt.

PLOS is a nonprofit, Open Access publisher that empowers researchers to accelerate progress in science and medicine by leading a transformation in research communication. They are supported by a highly skilled global in-house team, partnerships with local scholarly organizations, and the valued contributions of a diverse, international community of scientific researchers.

United States 6w PTO

  • Build and operate the internal engineering platform that provides application engineers with the tools, systems, and Kubernetes clusters they need to deploy and run their workloads.
  • Focus on cloud infrastructure, capacity management, security, engineering productivity, monitoring, and US Federal compliance across squads.
  • Participate in on-call rotations to ensure the health of the system and understand how people use our products.

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. We are a 100% remote company with 1,600+ team members across 40+ countries, backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital.

US

  • Spearheads evolution of compute and data delivery services with an emphasis on scale and user requirements.
  • Collaborates to enable efficient and rapid access to new and growing data sets.
  • Improves reliability and scalability by resolving edge cases, studying failure modes, and writing tests.

Planet designs, builds, and operates the largest constellation of imaging satellites, delivering an unprecedented dataset via a cloud-based platform. With a global team and a people-centric approach, the company focuses on culture and community while preparing for growth.

US

  • Design, deploy, and manage production Kubernetes clusters with workload scheduling, resource quotas, network policies, and RBAC.
  • Build and optimize CI/CD pipelines using Infrastructure as Code and GitOps principles.
  • Implement observability solutions using Prometheus, Grafana, and OpenTelemetry for performance tuning and reliability.

VerTALENTS is a subsidiary of VerSprite Cybersecurity, specializing in technology staffing. The company connects top technical talent with industry clients through various methods, adding value to both clients and candidates for full-time and contracting opportunities.

US 5w PTO

  • Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
  • Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
  • Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.

Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.

$190,800–$267,100/yr
US

  • Design and build backend systems, APIs, infrastructure, and platform capabilities that improve developer workflows across Reddit.
  • Build scalable and reliable systems across both AI-powered developer workflows and the core non-AI systems engineers rely on every day.
  • Lead high-impact projects across Reddit’s developer tooling ecosystem by writing and reviewing code and design docs, aligning stakeholders, and making pragmatic technical tradeoffs.

Reddit is a community-based platform built on shared interests, passion, and trust, facilitating open and authentic conversations. With over 100,000 active communities and approximately 126 million daily active unique visitors, it serves as one of the internet’s largest sources of information.

US

  • Lead design and operation of internal developer platforms and self-service infrastructure.
  • Build and optimize CI/CD pipelines, deployment workflows, and automation across GitHub Actions, Jenkins, and ArgoCD.
  • Apply SRE principles to improve developer-facing systems and software delivery performance.

Versant is a media company owning iconic brands in news, sports, and entertainment, including USA Network, Fandango, and Rotten Tomatoes. It is an independent, publicly traded company with a collaborative, inclusive culture and a remote-first work environment.

Global

  • Design and evolve cloud-native, containerized infrastructure for data products and services.
  • Lead cross-functional technical initiatives ensuring availability, security, scalability, and reliability.
  • Contribute hands-on expertise in systems design, automation, and high-scale distributed systems.

Visa is a world leader in payments technology, facilitating transactions across more than 200 countries. A large global company, Visa focuses on innovation and uplifting everyone, everywhere.

US

  • Take ownership of incident management and operational excellence across cloud infrastructure.
  • Automate high-risk manual processes and drive reliability gains through engineering.
  • Own a platform domain such as Temporal, observability, or Kubernetes operations.

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London with offices across Europe and the US, and has over $530 million in funding from premier investors like Accel and Nvidia's VC arm.

Brazil

  • Design and improve cloud architectures, deployment pipelines, and infrastructure systems for large-scale applications across multiple cloud environments.
  • Collaborate with engineering, product, and platform teams to ensure infrastructure reliability, scalability, security, and operational excellence.
  • Drive engineering best practices, contribute to architecture decisions, and participate in on-call rotations for production support.

We are a multinational team that believes technology solves business challenges. Since 2016, we have been helping customers translate technology into success, combining Latin American talent with Swiss organization.

SRE

Fal
$180,000–$250,000/yr
US

  • Own and operate our Kubernetes infrastructure.
  • Build and maintain CI/CD pipelines and deployment infrastructure.
  • Leverage AI to automate analysis and resolution of production issues.

Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.

Germany

  • Build and maintain end-to-end observability with ELK, Prometheus, and Grafana.
  • Own and improve CI/CD pipelines (CircleCI, GitLab CI, GitHub Actions, ArgoCD).
  • Lead incident response and postmortems in a blameless culture.

Redcare Pharmacy is Europe’s No.1 e-pharmacy, powered by passionate teams and cutting-edge innovation. They strive to create a healthy, collaborative work environment where every employee feels valued and inspired to contribute to their vision “Until every human has their health”.

United States Unlimited PTO

  • Own full-stack design and delivery of platform capabilities from architecture to deployment and observability.
  • Build open source infrastructure packages for airgap and cloud-native environments and write comprehensive tests.
  • Work directly with product and customers to translate mission problems into platform capabilities and mentor team members.

Defense Unicorns delivers mission value by streamlining software delivery for defense and civil agencies, focusing on speed, security, and optionality. The team includes innovators, software engineers, and veterans with decades of experience delivering technology programs across the federal market.

US

  • Implement highly available, scalable infrastructure across AWS, GCP, and bare-metal environments.
  • Drive an "automation-first" culture by writing code in Python/Go to build self-healing systems.
  • Act as lead Incident Commander, develop response playbooks, and conduct post-incident analyses.

Zscaler accelerates digital transformation to secure customers with a cloud-native Zero Trust Exchange platform. The company processes over 200 billion transactions daily and fosters a culture of execution, collaboration, and accountability.