This role focuses on building intelligent systems that transform unstructured documents and structured system data into actionable knowledge, connected via an evolving knowledge graph. The person in this position will develop and integrate named entity recognition (NER), topic modeling, correlation algorithms, and a recommendation system to link extracted insights across domains, powering intelligent applications for decision support, analytics, and automation.
Day to day, the person in this position works within the project team to:
Design and implement ML pipelines to extract entities, topics, and relationships from unstructured text (e.g., PDFs, reports) and structured data sources
Build scalable ingestion systems for integrating document-based and API-driven data streams into a unified context layer
Apply and fine-tune NER, topic modeling, and clustering techniques using modern frameworks (spaCy, Hugging Face, scikit-learn, etc.)
Correlate and link extracted data into a graph-based knowledge representation using platforms like Memgraph or Neo4j
Develop and deploy recommendation systems to suggest relevant content, actions, or knowledge graph entities based on user profiles, extracted insights, or contextual cues
Implement LLM-powered search capabilities that leverage embeddings, vector databases, and semantic understanding for intelligent querying across documents and graph data
Integrate ML outputs into full-stack applications built on React, Go, GraphQL, and PostgreSQL
Work with LangChain and LLM APIs (OpenAI, vLLM, Ollama) to enrich query capabilities and agent reasoning
Collaborate with infrastructure engineers to containerize and automate deployments via Docker and GitLab CI/CD