Implementing improvements to the reliability, fault tolerance, scalability, and performance of our infrastructure
Managing incidents using your technical know-how to involve the appropriate teams and automate away manual practices
Improving observability across our systems (metrics, logs, tracing) to reduce time to detection and resolution
Newton is changing how Canadians trade crypto, with the goal of making financial freedom achievable for everyone by giving its customers the tools and knowledge needed to navigate the crypto world. They are a remote team spread across Canada that values pushing boundaries and getting things done.
Provide production support on a shift according to the team on-call roster.
Work on tickets raised by customers and by internal engineering/implementation teams while not on call for production support.
Continuously monitor the health and performance of our services, systems, and infrastructure.
Granicus is driven by the excitement of building, implementing, and maintaining technology that is transforming the Govtech industry by bringing governments and their constituents together. They have served 5,500 federal, state, and local government agencies and more than 300 million citizen subscribers.
Design, build, and maintain scalable, highly available and fault-tolerant infrastructures.
Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime.
Drive continuous improvement in infrastructure automation, deployment, and orchestration.
Mistral AI is dedicated to democratizing AI through high-performance, optimized, open-source models, products, and solutions designed to integrate seamlessly into daily working life. They are a dynamic, collaborative team dedicated to innovation and passionate about AI and its potential to transform society.
Working with engineers across Yelp in supporting new features and services.
Integrating tools to monitor platform stability and performance.
Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.
Yelp's engineering culture values individual authenticity and encourages creative solutions. They focus on helping users, growing as engineers, and having fun in a collaborative environment.
Own SLI/SLO/SLA definitions for the Akuity SaaS platform and drive continuous improvement.
Participate in an on-call rotation and act as incident commander for high-severity production events.
Partner with engineering teams to build reliability into new features before they ship to production
Akuity helps enterprises ship software faster and more reliably with modern GitOps best practices. The Akuity Platform enables teams to manage the development and deployment across hundreds – if not thousands – of Kubernetes clusters from a single control plane.
Arista Networks is a data-driven, client-to-cloud networking company for large data center, campus, and routing environments. They have over $8 billion in revenue, and they value diversity of thought and perspectives, fostering an inclusive environment for creativity and innovation.
Write code, automate everything, design for reliability, and deeply understand the systems.
Build or extend Terraform modules and contribute to Platform Engineering around Observability.
Collaborate with developers to shape feature design so that reliability is built in, not added later.
InPost Group is an innovative European out-of-home delivery company, revolutionizing the way parcels are delivered to customers. With over 10,000 employees worldwide, InPost Group is one of the largest out-of-home delivery providers in Europe, committed to providing sustainable and efficient delivery solutions.
Build and own the foundational infrastructure that our products run upon.
Work directly on our products' Go code base to implement SRE-related objectives.
Take a data-driven approach to quantifying system performance and reliability.
LiveKit provides the network infrastructure for multimodal AI interfaces, enabling seamless audio and visual interactions. Founded in 2021, LiveKit supports over 3 billion calls annually, with 100,000+ developers and industry giants like OpenAI, Spotify, and Meta.
Building tools and applications to extend Calendly’s infrastructure platform
Evaluating and deploying cloud-native open-source tools
Exercising expertise in cloud infrastructure concepts and patterns
Calendly's product powers connections for millions through impactful innovation. They are in the midst of exciting growth and are looking for people who want to learn, grow, and do their best work.
Building tools and applications to extend Calendly’s infrastructure platform
Evaluating and deploying cloud-native open-source tools
Exercising expertise in cloud infrastructure concepts and patterns
Calendly powers connections for its customers through impactful innovation. They have millions of users and are in the midst of exciting product growth.
Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.
SSV Labs is the core team behind the SSV Network - pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.
Collaborate with stakeholders to drive best practices for monitoring and CI/CD pipelines
Troubleshoot deployment issues in our CI pipeline
Identify areas for automation and embrace the codification of all things
Weedmaps is a global leader in the cannabis industry. They are dedicated to transparency, education, and community, serving cannabis consumers and businesses in the U.S. and worldwide.
Manage and support infrastructure for Growth teams, including Nomad, Hashistack, databases, and any other underlying systems
Maintain and troubleshoot GitLab CI pipelines, ensuring reliable and fast build, test, and deployment cycles
Provide operational support across Onboarding, Acquire, and Engage teams, helping debug issues in staging and production environments
Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, they have Krakenites in 70+ countries who speak over 50 languages.
Develop and maintain observability solutions using platforms like Datadog, Prometheus and Grafana
Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions
Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes
Ditto is redefining how data moves at the edge, aiming to make resilient, real-time applications seamless for developers, regardless of network conditions. It's a globally distributed and fast-growing startup with over $145 million in funding that is committed to building a diverse and inclusive team.
Build the foundational, reusable services that every other JumpCloud product relies on to function securely and efficiently.
Deepen your expertise in Go, AWS, and Kubernetes while gaining broad architectural exposure by adapting to different teams and tech challenges.
Perfect for a versatile engineer who loves solving core infrastructure problems and building common frameworks, and who thrives in a dynamic, flexible environment.
JumpCloud delivers a unified open directory platform that makes it easy to securely manage identities, devices, and access across your organization. With JumpCloud, IT teams and MSPs enable users to work securely from anywhere and manage their Windows, Apple, Linux, and Android devices from a single platform.
Lead and resolve technically deep Level 2 support cases from initial triage to full root cause analysis and final fix.
Diagnose issues across distributed, cloud-native systems, with emphasis on application and API behaviour.
Perform code-level debugging (Python, Go, or Java) to pinpoint application defects or misconfigurations.
Mambu is a leading SaaS cloud banking platform. They are on a mission to make banking better for a billion people and shape the future of financial services.
Deploy, manage, and secure Ivanti’s production Software-as-a-Service (SaaS) environments in AWS and Azure
Automate common and repetitive tasks
Participate in on-call rotations for 24x7 coverage (follow-the-sun model) for incident response, issue triage, and problem resolution
Ivanti's mission is to elevate human potential within organizations by managing, protecting and automating technology for continuous innovation. They are committed to building a diverse team and fostering an inclusive environment where everyone belongs.
Become a subject matter expert in applications supporting Ooma customers.
Collaborate with Development, QA and other SREs to evaluate, deploy, and debug applications.
Improve observability by implementing, refining, and adjusting application monitoring and thresholds.
Ooma empowers people to connect in smarter ways by creating powerful communication experiences through their cloud-based platform. They help small business owners stay connected, provide customized unified communications solutions, and offer smart home security solutions.
Work with other Engineering teams to design sustainable infrastructure and microservice solutions.
Automate tools and infrastructure to reduce manual work.
Monitor applications and participate in an on-call rotation as required.
Bloomreach is building the world’s premier agentic platform for personalization, revolutionizing how businesses connect with their customers by building and deploying AI agents to personalize the entire customer journey. They power personalization for more than 1,400 global brands.
Build Self-Service Infrastructure: Design and scale highly available Infrastructure as Code (IaC) modules using Terraform. Empower development teams to provision resources autonomously and securely.
Champion Platform Reliability: Partner closely with engineering teams to define, measure, and operationalize SRE metrics. Balance feature velocity with system stability.
Elevate Developer Experience (DevEx): Architect frictionless, GitOps-driven CI/CD pipelines utilizing GitHub Actions and ArgoCD. Facilitate automated, secure, and progressive deployments.
KTO Group drives excitement in iGaming through innovation, focusing on transparency and player satisfaction. Founded in 2018, KTO blends sports betting with online casino entertainment on a proprietary platform, and is a rising leader in LATAM, ranked among Brazil’s top 10 iGaming brands.