Job Description
Leads reliability and operations for ClickHouse’s Postgres integration: upgrades, patching, maintenance, and scaling. Designs and implements automation for provisioning, deployments, and service lifecycle management across AWS, GCP, and Azure, developing infrastructure-as-code with Terraform and modern CI/CD tooling to ensure consistent, repeatable deployments. Contributes Go-based tooling and services that improve automation, observability, and developer experience. Owns observability and monitoring, ensuring robust alerting, metrics, and tracing across environments. Drives incident management and postmortem practices that strengthen reliability and learning loops. Collaborates cross-functionally with platform, networking, and product teams to improve service operability. Mentors and enables engineers, helping the team scale effectively as customer adoption grows.
About ClickHouse
Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fastest-growing private cloud companies.