We are looking for a FinOps engineer who packs the technical chops of an SRE and also brings experience with cloud cost management & capacity planning.
Someone technical enough that engineers trust their architectural advice, but commercially minded enough to partner with Finance and explain the why behind our spend.
We need proactive people who can fully own projects, get them done, and know when to ask for help.
Cloud cost optimization – identify waste, drive rightsizing, build tooling and guardrails to prevent cost regressions.
Platform reliability and scalability – improve observability, define SLOs where they're missing, and harden the systems all of Stream's products depend on.
Architecture and infra evolution – evaluate and drive decisions on Kubernetes adoption, database architecture, and cloud provider strategy.
Stream powers real-time Chat, Video, Activity Feeds, and AI Moderation for billions of end-users across thousands of apps. Their platform processes billions of API requests per month and supports applications with millions of concurrent users, delivering highly reliable, low-latency services and a great developer experience.
Lead the Infrastructure Engineering team, taking full ownership of cloud infrastructure, Kubernetes platforms, DevOps tooling, and CI/CD pipelines.
Drive reliability, scalability, and security across the production environment while maintaining a sharp focus on developer velocity and business impact.
Mentor and guide engineers across SRE, DevOps, and Database Reliability functions, fostering a culture of operational excellence and pragmatic problem-solving.
Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs with an all-in-one B2B platform. They have raised $346 million, are expanding across key EU markets, and foster innovation, prioritizing research and solutions that benefit users, employees, partners, and the business.
Deploy, manage, and administer web services in public cloud environments.
Design and develop solutions for secure, highly available, performant, and scalable services in elastic environments.
Own all operational aspects of web services: automation, monitoring, alerting, reliability, and performance.
Jumio is the leading provider of online identity verification, eKYC, and AML solutions. With a global footprint, they are expanding to meet strong client demand across industries such as Financial Services, Travel, Sharing Economy, Fintech, Gaming, and more. We welcome applications from colleagues of all backgrounds and statuses.
Build and maintain CI/CD pipelines and GitOps workflows across a diverse set of engineering teams.
Own observability — monitoring, alerting, logging — and support development teams in instrumenting their services.
Optimise infrastructure for security, cost, performance and reliability.
1inch is a decentralized finance (DeFi) platform. They empower users to access the best rates and execute efficient and secure trades across multiple liquidity sources.
Build our observability and alerting platform from the ground up.
Lead infrastructure builds for compliance (SOC 2, HIPAA).
Truv is transforming the financial data industry with a secure and real-time API platform for payroll account access. Backed by $30M from top investors, they're disrupting a $2B legacy market with cutting-edge innovation and a customer-first approach.
Work directly with enterprise customers to deploy and configure OpenTelemetry instrumentation across their environments.
Build custom integrations, dashboards, and tooling to help customers realize the full value of Dash0.
Troubleshoot complex issues in distributed systems, Kubernetes clusters, and observability pipelines.
Dash0 is building an AI-centric, OpenTelemetry-native platform that eliminates vendor lock-in and meaningless toil. They are backed by top-tier investors including Balderton Capital, Accel and Cherry Ventures and led by a founding team with decades of experience in observability.
Maximize the velocity of our product engineering team.
Ensure platform scalability, reliability, and security.
Champion best practices and shape the engineering culture.
They are building a robust, scalable trading platform to serve high-traffic, latency-sensitive applications. They leverage state-of-the-art technologies to support real-time trading while providing unparalleled reliability and performance.
Tech lead two teams (DevEx and Cloud Infrastructure) totaling 6–8 engineers: set technical direction, review key designs/changes, and raise engineering standards across both domains.
Own the delivery toolchain end-to-end (Git, CI, deployments/releases): reduce flakiness, improve build/test times, make releases repeatable with clear rollback, and drive adoption of org-wide standards through tooling, docs, and supported migrations.
Improve the software development lifecycle (setup → build/test → PR → deploy → observe) and standardize environments so teams spend less time on tooling and more time shipping.
Traackr is a global SaaS technology company providing a data-driven influencer marketing platform that marketers use to optimize investments, streamline campaigns, and scale programs. They are a remote-first company with offices in San Francisco, New York, Boston, Paris, and London and operate on a culture of mutual respect.
Leading infrastructure strategy and driving DevOps best practices across the engineering organization.
Helping engineers build reliable products by improving infrastructure and application monitoring, alerting, and tooling.
Building tools and frameworks that help developers better understand and debug their systems and data.
Aspire provides influencer marketing software and services for social commerce. They have helped brands build and manage relationships with millions of influencers and are trusted by over 800 top brands.
Collaborate with application engineering teams on platform infrastructure.
Enhance observability and spearhead the adoption of SRE best practices.
Build and maintain reliable CI/CD pipelines, tooling, and infrastructure.
Rula strives to provide quality, evidence-based, compassionate mental healthcare and aims to create a world where mental health is no longer stigmatized. They are a remote-first company operating in most U.S. states, and are dedicated to having a culture of inclusion that supports their employees.
Own SLI/SLO/SLA definitions for the Akuity SaaS platform and drive continuous improvement.
Participate in an on-call rotation and act as incident commander for high-severity production events.
Partner with engineering teams to build reliability into new features before they ship to production.
Akuity helps enterprises ship software faster and more reliably with modern GitOps best practices. The Akuity Platform enables teams to manage development and deployment across hundreds, if not thousands, of Kubernetes clusters from a single control plane.
Design, build, and manage our cloud infrastructure using modern tools (Pulumi) to ensure all infrastructure changes are reproducible, secure, and easily auditable.
Orchestrate and optimize our Kubernetes clusters for complex, compute-heavy AI workloads, guaranteeing maximum efficiency and fault tolerance.
Implement a flawless monitoring setup using Datadog and OpenTelemetry to make the black box of our distributed systems transparent, hunting down latency spikes or bottlenecks before they impact users.
Deepslate is building speech-to-speech voice AI models that sound and act indistinguishably from a human, with the belief that everyone should be able to use them. Backed by top-tier investors from the Tech and AI sectors, they are incredibly well-funded and moving fast.
Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
Focus on automation so we can spend energy where it matters.
Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.
Deliver outstanding support to open-source users and enterprise customers.
Develop deep technical expertise in GitOps, Argo, and Akuity’s offerings.
Troubleshoot, diagnose, and resolve customer issues promptly and accurately.
Akuity was founded by the co-creators of Argo to help organizations reliably deploy Argo at scale through enterprise support and enhanced capabilities. Backed by $25 million in funding, they are experiencing rapid growth while staying true to their open-source roots, and their culture is grounded in humility, authenticity, and diversity.
Build and operate cutting-edge cloud infrastructure to support Diagrid's core products.
Define standards and deliver tools, processes, and frameworks to make our products secure, reliable, efficient, and highly available.
Build and maintain CI/CD pipelines that enable delivering software quickly and securely across clouds.
Diagrid believes that open-source software, open standards and APIs are the greatest transformational tools for organizations. Founded by the creators of the Dapr and KEDA open-source projects, they provide developers with APIs and tools that help them focus on their code rather than on infrastructure.
Collaborate with product teams to implement cloud best practices.
Automate code changes, testing, and analysis using CI tools.
Jobgether is a platform that uses AI to match candidates with jobs. They ensure applications are reviewed quickly, objectively, and fairly against the role's core requirements.
Act as a trusted advisor for enterprise customers, guiding them through the design, validation, and deployment of vKS solutions on VCF.
Collaborate closely with field account teams to align solution architecture with customer priorities and drive successful engagement outcomes.
Provide prescriptive guidance on Kubernetes adoption, DevOps/CI-CD integration, and enterprise requirements such as scalability, security, and disaster recovery.
Broadcom is a global technology leader that designs, develops, and supplies a broad range of semiconductor and infrastructure software solutions. They are an equal opportunity employer and welcome qualified applicants from diverse backgrounds and locations outside the USA.
Propel builds technology that strengthens the social safety net. They are a passionate team of ~100 Propellers who envision a future where every American has the tools and resources they need to thrive, offering a remote-first working environment with headquarters in Brooklyn.