The Veeva RTSM team is expanding and is looking for a Technical Operations associate to help scale its world-class IRT/RTSM system. The role focuses on Systems Administration, Development Operations, Site Reliability Engineering, and Release Management, and suits someone who enjoys solving complex problems, working as part of a team, sharing knowledge, and implementing creative solutions to address business needs.
Job listings
As a Machine Learning Operations Engineer at Field AI, you will play a pivotal role in ensuring the scalability, efficiency, and reliability of our machine learning systems. You will manage and utilize data to optimize the performance of robots and drive innovation across industries, bridging the gap between machine learning models and production systems. This role offers the opportunity to work with cutting-edge technologies, solve complex problems, and contribute to the success of large-scale, real-time data systems.
Become a member of our Community Network! By submitting your resume, you will join our pipeline and be among the first to be considered once we have an opening. You will be considered for an Intermediate or Senior DevOps Engineer role.
This role is remote. We are looking for two Lead AWS Cloud DevOps Engineers with experience in leadership and/or technical management who can lead a team, manage projects, and delegate tasks while remaining technically hands-on. Each will be responsible for the infrastructure and services that empower our customers to run their businesses, and will design and implement cloud infrastructure for our mission-critical cloud platforms using AWS technologies and best practices.
The Azure Cloud Engineer will focus on ensuring the reliability, performance, and availability of our Azure infrastructure while implementing SRE best practices; this role champions reliability engineering principles, drives automation initiatives, and builds robust observability solutions to maintain world-class uptime and performance.
Own the platform that powers our protocol and apps by designing and running AWS-first, highly available systems. Turn infrastructure into code using Kubernetes and CDK/Terraform and wire up deploys with GitHub Actions and CodePipeline. Build end-to-end observability to keep latency low and uptime high, while partnering with product and protocol teams to operate execution clients and RPC/DA/indexing workloads in production.
The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers' usage of our managed Airflow service. As an infrastructure specialist within the team, you will learn to become an expert on the reliability of Kubernetes and the underlying cloud infrastructure. You will create strong relationships with customers and help them achieve their reliability goals.
The Cloud Reliability Engineer will write and integrate various open-source and closed-source tools and will be responsible for configuration management, containerization, and scripting. Duties include developing, configuring, and deploying tools for cloud-based systems and services, containerizing new and legacy applications, and providing LOE/scoping for projects.
Troubleshoot, maintain, and improve client production applications. Work directly with clients to understand and solve challenging problems that require modifying existing code or writing new code across multiple languages and platforms. Write, modify, review, and debug code. Participate in testing and quality control activities. Focus on incremental improvements and correcting production issues for existing applications.
Join our team as an Observability Engineer, handling monitoring and system reliability in a high-scale, complex environment for a large multinational food and beverage company. Transition into the SRE role, leveraging your software development background to improve incident detection and prevention. Collaborate with development teams to enhance service reliability and performance by investigating production issues and supporting software architecture.