As a Senior DevOps Engineer, you will define cloud infrastructure patterns and practices for containerized deployments and services. You will be responsible for the high-level design and architecture of Cloud and DevOps practices, implementing cloud components, automation, and application updates. Working with the Cloud Infrastructure Team, you will collaborate with Platform, Security, Data, and Software teams.
Job listings
Architect, build, and operate end-to-end ML pipelines for training, validation and deployment on Google Cloud. Define, instrument, and maintain logging, monitoring, and alerting for model performance and data drift. Automate CI/CD for ML artifacts and infrastructure using GitHub Actions or equivalent. Collaborate with cross-functional teams, including frontend engineers, backend engineers, research engineers, and infrastructure engineers.
As a Team Lead, you’ll be responsible for leading a team of site reliability engineers who design, deploy, and operate large-scale distributed systems across compute, storage, networking, and AI/ML environments. You will act as the primary technical escalation point, oversee day-to-day operational delivery, mentor and coach team members, and ensure adherence to SLAs and quality standards.
Lead the design, deployment, and optimization of scalable machine learning pipelines, focusing on Generative AI and large language models (LLMs). Collaborate across teams to streamline workflows, ensure system reliability, and integrate the latest MLOps tools and practices. Build intelligent, data-driven systems that deliver powerful PR insights. Push the boundaries of AI-powered insights and automation.
Participate in cloud operations and build solutions to migrate from on-premises to Azure, understanding the intricacies of both environments. Oversee the configuration, deployment, and management of the environments, ensuring availability. Implement and improve infrastructure as code (IaC) and other automation processes to streamline the migration and ongoing management of the environments.
Halcyon is seeking an experienced Agent Build/DevOps Engineer to develop and manage automation surrounding the development, building, and deployment of endpoint software. This role supports the delivery of endpoint protection capabilities to defend customers from ransomware threats by streamlining build and deployment workflows.
Shape and build the architecture of the platform for the entire organization and collaborate with several of the client's portfolio companies. You will spend approximately 30% of your time creating and automating infrastructure and processes in pipelines. We are looking for senior people with extensive technical knowledge.
As a Site Reliability Engineer, you'll be an integral member of product teams, helping to build, deploy, and monitor cloud services reliably while actively developing code and building frameworks to monitor services deployed in production. You will be responsible for ensuring the reliability, availability, and performance of our Elasticsearch infrastructure.
As a Senior Engineer, you will be a key individual contributor within our Production Operations team, instrumental in designing, building, and maintaining highly reliable, scalable, and performant cloud infrastructure and systems that support Greenlight's mission-critical services. This role is for a seasoned engineer who thrives on solving complex operational challenges, enhancing system stability, and improving efficiency through automation and best practices.
You will be part of a central team that develops and operates the system for digitization of construction sites for fiber rollout in Germany. Your tasks will be to build and maintain a scalable, reliable, and secure environment on AWS using Infrastructure as Code (IaC) tools, oversee the deployment, configuration, and management of clusters, and design and manage CI/CD pipelines.