Lead reliability initiatives across multiple Ads domains including ad serving, auctions, targeting, reporting, measurement, and billing.
Partner with engineering leadership to improve reliability, scalability, operational excellence, and engineering efficiency across the Ads organization.
Design and build platforms, tooling, and automation that improve reliability and developer productivity at scale.
Reddit is a community of communities, built on shared interests, passion, and trust, home to the most open and authentic conversations on the internet. With 100,000+ active communities and approximately 126 million daily active unique visitors, it is one of the internet's largest sources of information.
Architect for scale, partnering with product and infrastructure teams to design highly available systems.
Drive automation to eliminate repetitive operational work through tooling and systems.
Reddit is a community-based platform where users submit, vote, and comment on various topics. It hosts over 100,000 active communities and attracts millions of daily active users, making it one of the largest and most influential internet platforms.
Collaborate with service teams to define SLIs and SLOs based on customer experience and build error budget policies that influence engineering decisions.
Own the Operational Readiness Review process, conducting reviews for new services and major changes across observability, alerting, runbooks, capacity, and graceful degradation.
Act as a reliability expert for architecture reviews, failure mode analysis, dependency mapping, and resilience design.
Supabase provides the Postgres development platform with a complete backend solution including Database, Auth, Storage, Edge Functions, Realtime, and Vector Search. With 280+ team members across 55+ countries, they are an open-source-first company that values async work and has raised $500M.
Implement highly available, scalable infrastructure across AWS, GCP, and bare-metal environments.
Drive an "automation-first" culture by writing code in Python/Go to build self-healing systems.
Act as lead Incident Commander, develop response playbooks, and conduct post-incident analyses.
Zscaler accelerates digital transformation to secure customers with a cloud-native Zero Trust Exchange platform. The company processes over 200 billion transactions daily and fosters a culture of execution, collaboration, and accountability.
Lead the Site Reliability Operations team, overseeing observability, monitoring, incident response, and operational excellence for key enterprise services.
Partner with product, engineering, and infrastructure teams to embed CI/CD and release best practices, automating build/test/deploy and release monitoring.
Own problem management, driving root cause analysis and corrective actions to improve system resilience and reduce incident impact.
Mercury Insurance helps people reduce risk and overcome unexpected events, and has served customers for over 60 years. Recognized as one of America's Best Midsize Employers for 2026, the company has a collaborative culture focused on growth and inclusion.
Act as a first responder for system incidents and outages, ensuring high availability and performance.
Own and evolve monitoring, alerting, and log management systems while optimizing database infrastructure.
Collaborate with engineering teams to build scalable, resilient systems and contribute to SRE tooling and automation.
Circle is building the world's leading all-in-one platform for online communities. We're a fully remote company of around 200 team members from 30+ countries, with a culture that values autonomy, async collaboration, and high expectations.
Lead a high-impact CloudOps and infrastructure engineering team powering large-scale, real-time advertising systems under extreme performance and reliability constraints.
Own planning and delivery processes including sprint planning, backlog prioritization, execution tracking, and team retrospectives.
Drive initiatives to improve system reliability, observability, deployment safety, incident response, and production readiness.
Jobgether uses an AI-powered matching process to review applications quickly, objectively, and fairly against role requirements. Their platform identifies top-fitting candidates and shares shortlists directly with hiring companies.
Gain a deep understanding of ad technology and industry standards such as OpenRTB, MRAID, and VAST.
Analyze and improve end-to-end ad workflows, reproducing problematic cases and proposing improvements.
Lead implementation of architectural solutions and provide hands-on technical leadership.
RTB House is a global marketing technology company providing AI-powered ad-buying solutions. The company processes over 20 million requests per second and fosters a culture of technical excellence and ownership.
Lead a team of software engineers building scalable services, APIs, and SDKs for our digital merchandising platform.
Drive architecture and design decisions, set technical direction, and work cross-functionally with product managers and data scientists.
Own and evolve engineering standards, manage team growth through 1:1s and performance feedback, and partner on hiring.
Jane Technologies is an MIT-founded eCommerce company in the cannabis industry, connecting consumers with local dispensaries and brands. We are a small, close-knit team of highly technical engineers with diverse backgrounds, growing 20% month over month, and we value lean development and data-driven practices.
Take ownership of incident management and operational excellence across cloud infrastructure.
Automate high-risk manual processes and drive reliability gains through engineering.
Own a platform domain such as Temporal, observability, or Kubernetes operations.
Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London with offices across Europe and the US, and has over $530 million in funding from premier investors like Accel and Nvidia's VC arm.
Design and build core platform infrastructure for large-scale cloud-native data and analytics systems.
Own and improve CI/CD pipelines, testing frameworks, and deployment in a high-scale PaaS environment.
Contribute to reliability engineering, observability, and operational excellence across distributed systems.
Jobgether uses an AI-powered matching process to connect candidates with roles. The company is a growing platform focused on efficient job matching and data privacy compliance.
Manage a scrum team of 4-6 engineers building and operating high-volume bidder systems.
Oversee AWS-based cloud infrastructure processing over 1 billion HTTP requests per hour.
Drive improvements in reliability, performance, and cost efficiency across production systems.
Jamloop builds high-scale advertising technology for real-time bidding systems. We are a remote-first company focused on reliability and operational excellence.
Act as a trusted technical advisor and solutions architect for strategic partners, guiding them through the integration and adoption of Reddit’s advertising APIs and tools.
Design and prototype proof-of-concept solutions that accelerate technical enablement and create tangible value for partners.
Collaborate directly with engineering teams at partner organizations to optimize performance, address integration gaps, and troubleshoot complex issues.
Reddit is a community-based platform built on shared interests and open conversations. It hosts over 100,000 active communities and sees approximately 126 million daily active unique visitors, making it one of the internet’s largest sources of information.
Design and build backend systems, APIs, infrastructure, and platform capabilities that improve developer workflows across Reddit.
Build scalable and reliable systems across both AI-powered developer workflows and the core non-AI systems engineers rely on every day.
Lead high-impact projects across Reddit’s developer tooling ecosystem by writing and reviewing code and design docs, aligning stakeholders, and making pragmatic technical tradeoffs.
Reddit is a community-based platform built on shared interests, passion, and trust, facilitating open and authentic conversations. With over 100,000 active communities and approximately 126 million daily active unique visitors, it serves as one of the internet’s largest sources of information.
Define and drive the technical roadmap for platform infrastructure and scalability.
Design and review complex system architectures that support company growth.
Mentor engineers and provide technical guidance to elevate engineering capabilities.
PadSplit operates a marketplace for affordable shared housing. It is a growing company with an engineering-focused culture that values collaboration and technical excellence.
Build internal tooling to help other engineers and the rest of the company understand and operate our system.
Design and implement security best practices for our team and infrastructure.
Reduce toil through automation, including building and maintaining CI/CD infrastructure.
Openly is rebuilding insurance from the ground up by re-envisioning and enhancing every aspect of the customer experience. They are a rapidly growing team of exceptional, curious, empathetic people with a wide range of skill sets, spanning many departments.