Member of Technical Staff, Reinforcement Learning
Inception
Bay Area | $200k–350k | On-site | 3mo ago
- Employment: Full-time
- Seniority: Staff
About the role
- Design, develop, and optimize RL training pipelines (PPO, DPO, RLHF, and novel approaches) for diffusion-based LLMs.
- Build and iterate on reward models, reward shaping strategies, and evaluation of reward quality.
- Implement innovative approaches for fine-tuning and scaling generative AI models.
- Work on data preprocessing pipelines, model evaluation, and alignment to enterprise use cases.
- Research and implement techniques for controlled text generation and constraint satisfaction.
- Improve training stability, efficiency, and reproducibility of RL workloads.
- BS/MS/PhD in Computer Science or a related field (or equivalent experience).
- At least 2 years of experience working on ML projects in PyTorch (or equivalent), preferably in a research lab or engineering role.
- Strong familiarity with transformers and core LLM concepts (autoregressive pretraining, instruction tuning, in-context learning, KV caching).
- Hands-on experience with reinforcement learning from human feedback (RLHF), PPO, DPO, or related post-training methods.
- Familiarity with training and inference in diffusion models.
- Experience training deep learning models at scale in distributed computing environments.
- Extensive experience training transformer-based language models from scratch.
- Experience designing and implementing reward models or preference learning systems.
- Knowledge of advanced training techniques (mixed precision, gradient accumulation, etc.).
- Background in optimization theory and neural network architecture design.
- Experience with LLM serving frameworks like vLLM, SGLang, or TensorRT.
Compensation
- Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers
- Shape Foundational Technology: Your decisions will influence how the next generation of AI products is built and used
- Immediate Impact: Join at the ground floor where your contributions directly shape product direction and company trajectory
- Competitive salary and equity in a rapidly growing startup
- Flexible vacation and paid time off (PTO)
- Health, dental, and vision insurance
- 401k match
- Catered meals (breakfast, lunch, & dinner)
- Commuter subsidies
- A collaborative and inclusive culture
Perks & benefits
- 401k
- Vision Insurance
- Unlimited Vacation
- Paid Time Off
- Pension Matching
- Equity Compensation