Source Job

Europe

  • Design, implement, and maintain SFT and RL post-training pipelines for multi-step coding agents.
  • Train and adapt LLMs for agent workflows, including planning, tool use, and multi-step interactions inside JetBrains IDEs.
  • Build and develop evaluation and simulation environments where coding agents can act, be measured, and compared on realistic developer tasks.

Python PyTorch Kubeflow Airflow

20 jobs similar to Research Engineer (Agentic Models)

Jobs ranked by similarity.

Europe

Design and prototype machine learning solutions to improve the software development workflow. Apply existing open-source models or train custom ones as needed. Build training and evaluation pipelines that support fast iteration and ensure reproducibility.

JetBrains strives to make the most effective developer tools on earth by automating routine checks and corrections, speeding up production, and freeing developers to grow.

$315,000–$340,000/yr
US

  • Design and build infrastructure that enables researchers to rapidly iterate on reward signals.
  • Develop systems for automated quality assessment of rewards, including detection of reward hacks and other pathologies.
  • Collaborate with researchers to translate science requirements into platform capabilities.

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.

Europe

Train and scale neural networks for processing source code. Develop new methods and improve existing ones for code generation, code editing, and agent-based workflows. Mentor colleagues on ML topics.

At JetBrains, code is their passion, and since 2000 they’ve focused on reducing routine work so developers can spend more time building and shipping.

North America Canada

  • Lead domain-specific model optimization using PEFT (LoRA/QLoRA) and knowledge distillation to balance cost, latency, and reasoning capability.
  • Build next-gen Retrieval-Augmented Generation pipelines using hybrid search, cross-encoders, and self-correcting retrieval loops.
  • Design and deploy multi-agent systems using frameworks like LangGraph or CrewAI, enabling autonomous task planning and tool-use (Function Calling).

ServiceNow is a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Their intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work.

$150,000–$220,000/yr
US Unlimited PTO

  • Incorporating the best research work on agents and code generation into the OpenHands framework
  • Performing novel improvements in areas of interest to improve agent performance and efficiency
  • Running and implementing evaluations to ensure agent quality

OpenHands is building an open-source AI platform that empowers engineering teams to accelerate development, automate workflows, and integrate intelligent coding assistance into real-world software delivery. The company fosters a culture built on kindness, candor, autonomy, and learning.

$160,000–$190,000/yr

  • Design, implement, and deploy AI-powered features, including model training, fine-tuning, and prompt engineering workflows.
  • Translate product requirements into robust, production-ready AI solutions, working with Product Managers, Software Engineers, and Data Scientists.
  • Optimize models and infrastructure for scalability, latency, and cost efficiency, partnering with DevOps and MLOps to ensure reliable and maintainable AI pipelines.

Paper is reimagining how schools support students so that every learner can reach their full potential.

$149,000–$350,000/yr
US

  • Drive fundamental and applied research in AI.
  • Build cutting-edge Generative AI models, using techniques like Supervised Fine-Tuning (SFT), Reinforcement Learning (RL), prompt improvements, and synthetic data generation.
  • Collaborate closely with product managers and engineers to transform user feedback into requirements for AI systems.

Figma’s platform helps teams bring ideas to life—whether you're brainstorming, creating a prototype, translating designs into code, or iterating with AI.

$60,000–$90,000/yr
US

  • Formulate and execute small, high-leverage research projects aligned with our product roadmap.
  • Independently build and validate end-to-end prototypes.
  • Design and run experimental pipelines autonomously, including setting up research environments and defining evaluation metrics.

ZetaChain is building the first universal blockchain and AI platform that connects everything—Bitcoin, Ethereum, Solana, and more—while pioneering in the GenAI space. They are backed by top investors, live on mainnet, and building the future of blockchain and AI technology.

US Canada

  • Build scalable training pipelines and generate high-fidelity synthetic scenarios.
  • Design procedural simulation environments and create diverse long-tail edge cases.
  • Optimize RL systems to train robust foundational models.

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future.

  • Design, develop, and maintain a robust platform to enable users to create and manage AI agents.
  • Integrate and work with multiple LLMs, ensuring seamless orchestration and scalability.
  • Develop and implement evaluation frameworks for testing AI agents in challenging and complex scenarios.

ClickUp is building the first truly converged AI workspace, unifying tasks, docs, chat, calendar, and enterprise search, all supercharged by context-driven AI.

  • Design and implement interfaces across the platform for compute orchestration and RL training.
  • Translate complex backend systems into intuitive, production-ready product experiences.
  • Build for technical audiences, including AI and general software engineers.

Prime Intellect makes frontier AI accessible to everyone and enables individuals and organizations to train models using its agentic training infrastructure.

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows.
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control.
  • Improve reliability, performance, and safety across existing Python codebases.

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

US

  • Draft detailed natural-language plans and code implementations for machine learning tasks.
  • Convert novel machine learning problems into agent-executable tasks for reinforcement learning environments.
  • Identify failure modes and apply golden patches to LLM-generated trajectories for machine learning tasks.

At Mercor, we’re building the talent engine that helps leading labs and research orgs move AI forward.

$340,000–$425,000/yr

  • Lead research efforts to improve how human preferences are specified and learned at scale.
  • Develop novel architectures and training methodologies for RLHF.
  • Research techniques to identify and mitigate reward hacking.

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.

$80,000–$150,000/yr

  • Research, Document, Test, and Ideate: Explore the best ways to achieve our customers’ goals using LLMs and other AI tools.
  • Master Our Dialogue Platform: Become an expert, answer questions, and train others on prompting both within and outside of our platform.
  • Train Our AIs: Utilize prompting, knowledge-base creation, and fine-tuning to enhance our AI capabilities.

1mind is a platform that deploys multimodal Superhumans for revenue teams, combining a face, a voice, and a GTM brain. The company has a remote-first, fast-moving culture with ownership, autonomy, and impact from day one.

  • Build and design end-to-end training pipelines for AI models, covering data ingestion to inference.
  • Architect scalable inference systems using tools like vLLM, TensorRT-LLM, or DeepSpeed.
  • Shape early product direction by experimenting with new use cases and building AI-powered experiences.

A1 is a self-funded, independent AI group backed by BJAK, focused on building a new consumer AI product with global impact.

  • Build AI agents and tools that transform how developers write code and debug issues.
  • Architect and implement AI-powered tools such as code review assistants and automated test generators.
  • Collaborate with the Principal Engineer and product/design teams in a remote-first environment.

Docker makes app development easier so developers can focus on what matters.

$81,075–$89,751/yr
Europe 5w PTO

  • Build and evolve Rasa’s core Conversational AI engine, leveraging LLMs.
  • Lead architecture decisions and algorithm implementations.
  • Collaborate with engineers to build Rasa Pro, enabling developers to build, deploy, and maintain complex AI assistants.

Rasa is a leader in generative conversational AI, enabling enterprises to build and deliver next-level AI assistants.

US Europe

  • Design, develop, and deploy AI-driven applications to make our software more accessible.
  • Own the software from requirements development through deployment and maintenance.
  • Design, build, test, and deploy a scalable system architecture.

Epistemix empowers organizations to make smarter decisions by simulating real-world outcomes using synthetic populations.

$175,000–$200,000/yr
US

Lead AI and ML initiatives to design and implement production-grade machine learning systems and pipelines. Develop scalable infrastructure for model training, evaluation, and deployment, ensuring reliability and observability. Collaborate with cross-functional teams to drive innovation and efficiency.

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.