Site Reliability Engineering:
- Drive the definition and adoption of SLIs and SLOs across multiple services or entire platforms, ensuring alignment with business goals.
- Design and architect Infrastructure as Code (IaC) solutions for large-scale, complex environments, establishing standards and best practices.
Toil Reduction and Incident Management:
- Implement and refine comprehensive monitoring, alerting, and logging to detect and address performance and availability issues proactively.
- Lead the strategic effort to eliminate toil, identifying and championing major automation projects that deliver significant organizational efficiency.
Testing and Service Resiliency:
- Implement cloud security best practices, including identity and access management (IAM), encryption, and compliance controls.
- Proactively identify and address system weaknesses and ensure performance under stress.
Collaboration and Knowledge Sharing:
- Serve as a primary SRE liaison for development teams, influencing application architecture and design to meet reliability and scalability targets from inception.
- Create and maintain documentation for cloud architectures, deployment processes, and best practices.
Noctua Technology, LLC
Noctua Technology, LLC drives digital transformation by treating operations as a software engineering challenge, with a focus on cloud-native systems. The team is seeking a Senior SRE to define strategy and bridge development and operations for clients.