Member of Technical Staff - RL Infrastructure

vmax

San Francisco · 1mo ago

Seniority: Staff

About the role

<h2><strong>About <em>V<sub>max</sub></em></strong></h2> <p><em>V<sub>max</sub></em> is an applied research lab developing AI capable of open-ended learning. We are building systems to exceed humans in all capacities by optimising beyond the local maxima of learning from human expertise.</p> <h2>About the role</h2> <p>This role is for strong infrastructure engineers who can build the systems layer for RL at scale: distributed rollouts, training orchestration, inference, evals, data pipelines, observability, and reliability. You will create the durable platform that enables researchers and applied ML engineers to run, debug, and reproduce large-scale RL experiments.</p> <h2 data-section-id="r8dte7" data-start="6952" data-end="6971">Responsibilities</h2> <ul data-start="6973" data-end="8359"> <li data-section-id="18ncivk" data-start="7232" data-end="7351">Build infrastructure for distributed RL training and inference across thousands of GPUs.</li> <li data-section-id="1vp75id" data-start="7634" data-end="7709">Improve the reliability, debuggability, and throughput of RL experiments.</li> <li data-section-id="nt3ocq" data-start="7710" data-end="7839">Build interfaces that allow researchers and applied ML engineers to launch, inspect, compare, and reproduce experiments easily.</li> <li data-section-id="1vop8r7" data-start="7969" data-end="8109">Own infrastructure projects end to end, from architecture and implementation through deployment, documentation, and long-term maintenance.</li> <li data-section-id="a20cdd" data-start="8110" data-end="8235">Identify and eliminate bottlenecks in training, rollout generation, eval execution, data movement, and cluster utilization.</li> <li data-section-id="1x233uz" data-start="8236" data-end="8359">Maintain engineering standards for RL infrastructure, including testing, observability, versioning, and reproducibility.</li> </ul> <h2 data-section-id="w1j6vz" data-start="1480" data-end="1503">Minimum Requirements</h2> <ul> <li 
data-section-id="1wgb066" data-start="8386" data-end="8463">Strong software engineering experience.</li> <li data-section-id="1g8xqwl" data-start="9427" data-end="9537">Experience building infrastructure for LLM inference and/or RL training.</li> <li data-section-id="1cgittm" data-start="9538" data-end="9644">Experience with GPU clusters, distributed training, model serving, or high-throughput inference systems.</li> <li data-section-id="5izihy" data-start="9645" data-end="9783">Familiarity with vLLM, SGLang, and modern LLM-RL training frameworks.</li> <li data-section-id="1bq5x1e" data-start="8823" data-end="8933">Strong understanding of system reliability, observability, testing, debugging, and performance optimization.</li> <li data-section-id="jg7eld" data-start="8934" data-end="9051">Ability to work closely with ML researchers and translate messy experimental workflows into durable infrastructure.</li> <li data-section-id="1te4trn" data-start="9052" data-end="9134">Experience building tools, platforms, or services used by other technical users.</li> <li data-section-id="1apf8lq" data-start="9135" data-end="9255">Strong judgment around technical tradeoffs: when to prototype, when to harden, when to simplify, and when to redesign.</li> <li data-section-id="vzu8t0" data-start="9256" data-end="9376">Clear written and verbal communication, especially around system design, operational risks, and engineering tradeoffs.</li> </ul> <h2 data-section-id="17hey2t" data-start="9410" data-end="9425">Nice to have</h2> <ul data-start="9427" data-end="10566"> <li data-section-id="1a9mbmz" data-start="9784" data-end="9847">Experience supporting research teams or fast-moving ML teams.</li> <li data-section-id="ibmzoi" data-start="10091" data-end="10203">Experience at an organization with a high engineering bar, where reliability, ownership, and code quality were central.</li> <li data-section-id="ttqq3e" data-start="10204" data-end="10364">Evidence of strong independent technical work, 
such as open-source projects, infrastructure projects, competitions, or substantial systems built from scratch.</li> <li data-section-id="1t8b5pk" data-start="10462" data-end="10566">Experience reducing operational complexity in systems that had become brittle, slow, or hard to debug.</li> </ul> <h2><strong>Role specific location policy</strong></h2> <ul> <li>This role is based in our San Francisco office; for exceptional candidates, we are willing to consider a hybrid arrangement.</li> </ul> <h2>Compensation</h2> <p>The expected salary range for this position is $300,000 - $500,000 USD.</p>
