Design, implement, and maintain frontend and graph-level compiler components using MLIR
Develop and optimize graph-level transformations such as operator fusion, constant folding, operator sinking, graph partitioning, and other performance-critical optimizations
Extend and maintain MLIR dialects, passes, and infrastructure to support AI workloads
Evolve our kernel language into something usable by developers both inside and outside the compiler team and the wider company
Design and implement backend compiler optimizations to efficiently map workloads onto heterogeneous architectures (CPU, NPU, and specialized accelerators)
Implement advanced optimization strategies across the compiler stack based on your experience, e.g. memory planning, tiling, vectorization, task partitioning, and concurrency optimizations (compute and memory)
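To make one of the graph-level optimizations above concrete, here is a minimal sketch of constant folding on a toy expression graph. The node types and the ad-hoc graph representation are invented for illustration; real MLIR passes fold dialect operations through the pass and canonicalization infrastructure, not a structure like this.

```python
# Toy constant folding: collapse Add(Const, Const) into a single Const.
# Representation is hypothetical, purely to illustrate the idea.
from dataclasses import dataclass
from typing import Union

@dataclass
class Const:
    value: float

@dataclass
class Add:
    lhs: "Node"
    rhs: "Node"

Node = Union[Const, Add]

def fold(node: Node) -> Node:
    """Recursively replace Add of two constants with one constant."""
    if isinstance(node, Add):
        lhs, rhs = fold(node.lhs), fold(node.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            return Const(lhs.value + rhs.value)
        return Add(lhs, rhs)
    return node
```

For example, `fold(Add(Const(1), Add(Const(2), Const(3))))` reduces the whole subtree to `Const(6)` at compile time, so no addition remains to execute at runtime; production passes apply the same bottom-up rewriting to far richer operator sets.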
Axelera AI is creating the next-generation AI platform to support anyone who wants to help advance humanity and improve the world around us. The company has a world-class team of 220+ employees and is headquartered at the High Tech Campus in Eindhoven, Netherlands.
Design and deploy high-performance agentic systems that leverage Fastino’s optimized model architectures.
Collaborate with engineering teams to turn novel architectural breakthroughs into scalable solutions for enterprise customers.
Drive rapid, iterative prototyping of AI functionalities, refining model performance and task accuracy based on real-world telemetry.
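The agentic-systems work above can be illustrated with a minimal control loop: a policy picks a tool, the runtime executes it, and the observation is fed back until the policy answers. The tool set and the stub policy here are hypothetical stand-ins; a real system would query a model at each step rather than hard-coding the decision.

```python
# Minimal agent loop sketch. The "policy" is a stub standing in for a
# model call; the single calculator tool is invented for illustration.
def calculator(expr: str) -> str:
    # Restricted eval over arithmetic only, acceptable for this sketch.
    return str(eval(expr, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def stub_policy(task: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for a model call: returns (action, argument)."""
    if not observations:
        return ("calculator", task)      # first step: invoke the tool
    return ("answer", observations[-1])  # then report what it returned

def run_agent(task: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action, arg = stub_policy(task, observations)
        if action == "answer":
            return arg
        observations.append(TOOLS[action](arg))
    return "gave up"
```

For instance, `run_agent("2 + 3")` runs one tool call and returns `"5"`. The loop structure (act, observe, decide again) is the part that carries over to production agents; everything model-specific is behind the policy function.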
Fastino is building the next generation of LLMs with a team of alumni from Google Research, Apple, Stanford, and Cambridge, and has developed the GLiNER family of open-source models. Fastino has raised $25M in seed funding and is backed by leading investors including Microsoft, Khosla Ventures, and Insight Partners.
Contribute to the development of the Everywhere Inference platform, a Kubernetes-based solution for running AI inference workloads.
Design and implement APIs and developer tools to simplify deployment, management, and monitoring of AI applications.
Focus on packaging and integrating new ML models into the platform, using Python and common ML frameworks.
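One common shape for the model-packaging work above is wrapping each model behind a uniform prediction interface and registering it with the platform. The `Predictor` protocol, registry, and `ScaleModel` below are invented for this sketch and are not Gcore's actual API; they only show the pattern of decoupling model internals from the serving layer.

```python
# Hypothetical packaging pattern: every model exposes predict(), and a
# registry routes requests by model name. All names here are invented.
from typing import Protocol

class Predictor(Protocol):
    def predict(self, inputs: list[float]) -> list[float]: ...

class ScaleModel:
    """Stand-in for a real ML model: multiplies inputs by a factor."""
    def __init__(self, factor: float) -> None:
        self.factor = factor

    def predict(self, inputs: list[float]) -> list[float]:
        return [x * self.factor for x in inputs]

REGISTRY: dict[str, Predictor] = {}

def register_model(name: str, model: Predictor) -> None:
    REGISTRY[name] = model

def serve(name: str, inputs: list[float]) -> list[float]:
    # In a real platform this would sit behind an HTTP endpoint that
    # routes to the pod hosting the named model.
    return REGISTRY[name].predict(inputs)
```

Because `Predictor` is a structural protocol, a new model written with any framework plugs in as long as it exposes `predict`, which is the property that makes onboarding new ML models into such a platform routine.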
Gcore provides infrastructure and software solutions for AI, cloud, network, and security. They power everything from real-time communication and streaming to enterprise AI and secure web applications, with over 550 professionals globally and partnerships with technology leaders.