Job Description
Architect, build, and operate real-time/batch ETL pipelines, agentic orchestration flows, and AI/ML endpoints for autonomous, multi-agent production systems. Contribute actively to team processes, documentation, and operational quality.
Build event-driven data workflows (Snowflake, S3, Kafka, EventBridge, Celery, AWS Batch); integrate with FactSet, SharePoint, and proxy connectors; and expose agentic features (LangChain, LangGraph, LlamaIndex, Pinecone).
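As an illustration of the event-driven pattern this responsibility covers, here is a minimal sketch assuming kafka-python for consumption, Celery on a Redis broker for asynchronous execution, and boto3 for S3; the topic, broker URL, and bucket names are hypothetical, and a production flow would add Snowflake loading, retries, and dead-lettering.

```python
# Minimal sketch: Kafka events fan out to Celery workers that land data in S3.
# Broker URL, topic, and bucket names below are illustrative only.
import json

import boto3
from celery import Celery
from kafka import KafkaConsumer

app = Celery("etl", broker="redis://localhost:6379/0")  # hypothetical broker URL


@app.task
def load_to_s3(payload: dict) -> None:
    """Persist one event to S3 as a JSON object (hypothetical bucket/key scheme)."""
    boto3.client("s3").put_object(
        Bucket="example-raw-events",                      # hypothetical bucket
        Key=f"events/{payload['event_id']}.json",
        Body=json.dumps(payload).encode("utf-8"),
    )


def consume() -> None:
    """Read events from Kafka and hand each one to an async Celery worker."""
    consumer = KafkaConsumer(
        "example.data.events",                            # hypothetical topic
        bootstrap_servers=["localhost:9092"],
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    for message in consumer:
        load_to_s3.delay(message.value)                   # enqueue; don't block the consumer
```

Enqueueing via .delay() keeps the consumer loop non-blocking, so slow loads back-pressure onto the worker pool rather than stalling the Kafka partition.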
Develop, maintain, and monitor vector-database (Pinecone) pipelines and LLM/ML endpoints, ensuring agent memory and state are managed correctly for retrieval-augmented generation (RAG) pipelines.
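One common way to manage per-agent memory in a retrieval-augmented pipeline is to isolate each agent's records in its own vector-index namespace. The sketch below assumes the Pinecone v3+ Python client and an existing index; the index name, namespace scheme, and embed() placeholder are hypothetical.

```python
# Minimal sketch of agent memory on Pinecone, assuming the v3+ client and an
# existing index; index name, namespace scheme, and embed() are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="PINECONE_API_KEY")     # read from a secret store in practice
index = pc.Index("example-agent-memory")      # hypothetical index name


def embed(text: str) -> list[float]:
    """Placeholder for the embedding-model call (e.g. an internal LLM endpoint)."""
    raise NotImplementedError


def remember(agent_id: str, doc_id: str, text: str) -> None:
    """Upsert one memory record into the agent's own namespace."""
    index.upsert(
        vectors=[{"id": doc_id, "values": embed(text), "metadata": {"text": text}}],
        namespace=f"agent-{agent_id}",        # one namespace per agent isolates memories
    )


def recall(agent_id: str, query: str, k: int = 5) -> list[str]:
    """Retrieve the k most similar memories for augmenting the next prompt."""
    result = index.query(
        vector=embed(query),
        top_k=k,
        include_metadata=True,
        namespace=f"agent-{agent_id}",
    )
    return [match.metadata["text"] for match in result.matches]
```

Scoping every read and write to the agent's namespace keeps one agent's memory from leaking into another's retrieval results.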
Automate schema and version change management, event-contract validation, lineage tracking, and observability (a contract-validation sketch follows this list).
Write, run, and document QA/test coverage for ETL jobs, agentic triggers, and GenAI model events; participate in incident response and postmortems.
Collaborate in agile ceremonies: propose improvements, troubleshoot delivery bottlenecks, and share technical knowledge through docs and training sessions.
Implement and monitor security, compliance, and resiliency standards at every stage of the data and model workflows.
Help onboard new team members and mentor engineers in agentic and event-driven data engineering best practices.
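For the event-contract validation duty above, here is a minimal sketch using Pydantic v2 as the validation layer at an ETL ingress; the field set, version scheme, and dead-letter handling are illustrative, not a prescribed contract.

```python
# Minimal sketch of event-contract validation, assuming Pydantic v2;
# the event fields and versioning scheme below are illustrative only.
from datetime import datetime

from pydantic import BaseModel, ValidationError


class PriceEventV1(BaseModel):
    """Versioned contract for one inbound event; reject anything that drifts."""
    schema_version: int = 1
    event_id: str
    ticker: str
    price: float
    observed_at: datetime


def validate_event(raw: dict) -> PriceEventV1 | None:
    """Admit only contract-conforming events; route the rest to a dead-letter path."""
    try:
        return PriceEventV1.model_validate(raw)
    except ValidationError as exc:
        # In production this would emit a metric and write to a dead-letter queue.
        print(f"contract violation, dead-lettering event: {exc}")
        return None
```

Versioning the model class itself (V1, V2, ...) lets producers and consumers migrate independently while old events remain parseable.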