Job Description
First 30 days:
- Understand the core customer use cases and data flows; propose a technical architecture and implementation plan for an agent evaluation framework.
- Immerse yourself in the current system design and agent/tooling landscape.
- Stand up the foundational eval and observability scaffolding (datasets, metrics, KPIs, reporting).
First 60 days:
- Deliver an MVP evaluation harness that produces initial metrics, enables debugging, and supports regression testing.
- Take on a system feature and demonstrate measurable improvement against your MVP evaluation suite.
90 days and beyond:
- Own the roadmap for evolving the agent evaluation platform and lead deeper R&D threads.
- Productionize the evaluation and observability platform and make it the source of truth for quality and safety.
- Build out online/offline tracing, alerting, dashboards, evaluations, and a PR-gated regression suite.
About Flock Safety
Flock Safety is the leading safety technology platform, helping communities thrive by taking a proactive approach to crime prevention and security.