Job Description

- Monitor applications and infrastructure hosted on Google Cloud Platform (GCP) to ensure reliability and performance.
- Triage incoming support tickets and alerts, classifying them into Level 2 or Level 3 incidents.
- Analyze logs, system metrics, and conversation history to diagnose performance issues and system failures.
- Escalate critical or unresolved incidents to L3 engineers and collaborate with external vendors when necessary.
- Work closely with cross-functional teams including backend developers, AI/ML specialists, and cloud architects.
- Maintain clear documentation for incidents, resolutions, runbooks, and troubleshooting practices.
- Contribute to the evolution of knowledge bases, playbooks, and SOPs to improve team efficiency.
- Adhere to and support ITIL-based processes for incident, problem, and change management.

About Applaudo Studios

Applaudo Studios values trust, communication, respect, excellence and teamwork, and calls itself "the Best Digital team in the Region!"
