Senior Research Engineer (Agentic Behavior)

Amsterdam · 3d ago

Seniority: Senior

About the role

<p>At JetBrains, code is our passion. Ever since we started, back in 2000, we've been striving to make the strongest, most effective developer tools on earth. Today, AI-powered coding agents are becoming a core part of how developers write Kotlin – and we want to make sure they write it well.</p> <p>The Kotlin AI Value Stream team is responsible for how AI agents understand, generate, and improve Kotlin code across all platforms: Android, Kotlin Multiplatform, server-side, web, desktop, and others. We build the evaluation infrastructure, error analysis tools, and post-training pipelines that measure and improve agent behavior on real Kotlin developer tasks.</p> <p>As a Research Engineer on this team, you'll own the end-to-end loop: Analyze how agents fail on Kotlin → build evals that capture those failures → research and implement methods to fix them → measure the improvement. Your work will directly shape how millions of developers experience Kotlin through AI coding agents.</p> <h2><strong>As part of our team, you will:</strong></h2> <p><strong>Build tools for agentic error analysis</strong></p> <ul> <li>Design and implement tooling to systematically capture, classify, and analyse errors that AI coding agents make when generating Kotlin code.</li> <li>Build observability pipelines over agentic traces – mining patterns from agent sessions in JetBrains IDEs, Junie, Claude Code, Cursor, and other coding agents.</li> </ul> <p><strong>Build evaluation pipelines</strong></p> <ul> <li>Design, implement, and maintain evaluation pipelines that measure Kotlin code generation quality across dimensions, including correctness, idiomaticity, build success, framework usage, and test coverage.</li> <li>Build simulation environments where coding agents can be measured on realistic Kotlin developer tasks – from greenfield KMP projects and Gradle dependency management to migrating Spring applications from Java to Kotlin.</li> <li>Own evaluation infrastructure: metrics, experiment 
tracking, automated regression checks, and reproducible benchmarking.</li> </ul> <p><strong>Research methods for improving agent and model behavior on Kotlin</strong></p> <ul> <li>Experiment with post-training techniques (SFT, DPO, GRPO) to improve how models handle Kotlin-specific patterns, idioms, and frameworks.</li> <li>Investigate context engineering approaches: CLAUDE.md/AGENTS.md files, compiler-as-verifier feedback loops, Kotlin LSP integration, and MCP-based tooling.</li> <li>Run experiments to measure impact: A/B comparisons, benchmark suites, and before/after analyses on real codebases.</li> <li>Collaborate with model providers (Anthropic, OpenAI, and Google) to translate Kotlin-specific findings into model improvements.</li> </ul> <p><strong>Build public Kotlin benchmarks</strong></p> <ul> <li>Design and build open-source benchmarks that measure AI coding agent performance on Kotlin tasks and eventually become the standard reference for the ecosystem.</li> <li>Create task datasets covering the breadth of Kotlin usage: the server side (Spring, Ktor), multiplatform projects (KMP), build systems (Gradle), Android, library development, and others.</li> <li>Include both mined real-world tasks and carefully designed synthetic tasks that test specific Kotlin capabilities.</li> <li>Maintain and evolve benchmarks as models improve, ensuring they remain challenging, relevant, and contamination-resistant.</li> </ul> <h2><strong>We'll be happy to have you on board if you have:</strong></h2> <ul> <li>Hands-on experience building evaluation or analysis pipelines for LLMs or AI coding agents in a research or production setting.</li> <li>Strong Python engineering skills (at least three years), with the ability to write clean, maintainable code in data-heavy and ML-adjacent codebases.</li> <li>Experience with data analysis at scale: querying large datasets (SQL/Athena), building data pipelines, and performing statistical analysis of experimental results.</li> <li>The 
ability to own projects end to end – from identifying a problem in agent traces to designing an eval, running experiments, and shipping a fix.</li> <li>A product-aware mindset: You care about how agents are actually used by developers and can translate real failure modes into evaluation and training work.</li> <li>Familiarity with Kotlin or a strong willingness to develop deep Kotlin expertise (you'll be living in Kotlin codebases daily).</li> </ul> <h2><strong>Our ideal candidate would also have experience with:</strong></h2> <ul> <li>Post-training LLMs: SFT, RLHF, DPO, GRPO – either hands-on training or designing the data and reward pipelines that feed into training.</li> <li>Modern deep learning frameworks (PyTorch) and LLM training stacks (TRL, verl, Megatron, or similar).</li> <li>AI agent development: tool-using agents, multi-step coding workflows, agentic frameworks.</li> <li>Evaluation frameworks and tools: Inspect AI, Promptfoo, LM-evaluation-harness, or custom eval pipelines.</li> <li>Experiment tracking and observability: Weights & Biases, MLflow, Langfuse, or similar.</li> <li>The Kotlin ecosystem: Android, Gradle, KMP, Spring, Ktor – with an understanding of the developer workflows that agents need to support.</li> <li>Contributing to or maintaining open-source projects, especially benchmarks or evaluation tools.</li> </ul> <p>Don't check every box? That's okay – if you're excited about this work and bring strong fundamentals, we'd love to hear from you. We're happy to talk and provide the training you need to grow into the role.</p> <h2><strong>Why join JetBrains? </strong></h2> <ul> <li>Strong base salary. We offer competitive pay that reflects your skills and experience.</li> <li>Flexible work location. Enjoy the freedom to work from home or from the office.</li> <li>Remote work. Spend up to 30 days per year working remotely from abroad.</li> <li>Extra time off. 
More days to relax, recharge, and do the things you love.</li> <li>Medical insurance allowance. Enjoy peace of mind for you and your family.</li> <li>Learning and development opportunities. Access to conferences, courses, and language classes.</li> <li>Relocation support. We help make your move as smooth and stress-free as possible.</li> <li>Language classes. Pick up the local language or sharpen your English skills.</li> <li>Fuel your day. Enjoy a hot meal or receive a lunch allowance on workdays.</li> <li>Mental health support. To help you feel your best, we provide easy access to professional mental health services.</li> <li>Sports benefit. Enjoy an on-site gym or a sports club stipend.</li> <li>Internal events. Join company-wide celebrations and team gatherings.</li> </ul> <p>*Some benefits may vary depending on location.</p> <div class="content-conclusion"><p><strong>We are an equal opportunity employer</strong><br><br>We know great ideas can come from anyone, anywhere. That’s why we do our best to create an open and inclusive workplace – one that welcomes everyone regardless of their background, identity, religion, age, accessibility needs, or orientation.</p> <p><em data-stringify-type="italic">We process the data provided in your job application in accordance with the <a href="https://www.jetbrains.com/legal/docs/privacy/privacy-recruitment/">Recruitment Privacy Policy</a>.</em></p></div>

Perks & benefits

Medical Insurance
Mental Wellness Budget
