Member of Technical Staff, Reinforcement Learning

Inception

Bay Area · $200k–350k · On-site · 3mo ago
Employment: Full-time
Seniority: Staff

About the role

  • Design, develop, and optimize RL training pipelines (PPO, DPO, RLHF, and novel approaches) for diffusion-based LLMs.
  • Build and iterate on reward models, reward shaping strategies, and evaluation of reward quality.
  • Implement innovative approaches for fine-tuning and scaling generative AI models.
  • Work on data preprocessing pipelines, model evaluation, and alignment to enterprise use cases.
  • Research and implement techniques for controlled text generation and constraint satisfaction.
  • Improve training stability, efficiency, and reproducibility of RL workloads.

Qualifications

  • BS/MS/PhD in Computer Science or a related field (or equivalent experience).
  • At least 2 years of experience working on ML projects in PyTorch (or equivalent), preferably in a research lab or engineering role.
  • Strong familiarity with transformers and core LLM concepts (autoregressive pretraining, instruction tuning, in-context learning, KV caching).
  • Hands-on experience with reinforcement learning from human feedback (RLHF), PPO, DPO, or related post-training methods.
  • Familiarity with training and inference in diffusion models.
  • Experience training deep learning models at scale in distributed computing environments.
  • Extensive experience training transformer-based language models from scratch.
  • Experience designing and implementing reward models or preference learning systems.
  • Knowledge of advanced training techniques (mixed precision, gradient accumulation, etc.).
  • Background in optimization theory and neural network architecture design.
  • Experience with LLM serving frameworks like vLLM, SGLang, or TensorRT.

Compensation

  • Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers
  • Shape Foundational Technology: Your decisions will influence how the next generation of AI products is built and used
  • Immediate Impact: Join at the ground floor where your contributions directly shape product direction and company trajectory
  • Competitive salary and equity in a rapidly growing startup
  • Flexible vacation and paid time off (PTO)
  • Health, dental, and vision insurance
  • 401k match
  • Catered meals (breakfast, lunch, & dinner)
  • Commuter subsidies
  • A collaborative and inclusive culture

Perks & benefits

  • 401k
  • Vision Insurance
  • Unlimited Vacation
  • Paid Time Off
  • Pension Matching
  • Equity Compensation
