Inference Performance Engineer

Material Group
New York (Hybrid)
Employment: Full-time

About the role

Serving frontier models at scale requires solving novel systems problems at every layer of the stack. As an Inference Performance Engineer, you'll own the runtime that turns accelerators into a production serving system, optimizing throughput, latency, and cost across thousands of nodes. You'll work alongside hardware and compiler teams operating at the frontier of AI silicon design.

What you'll do

  • Build and improve the inference runtime

  • Design scheduling, continuous batching, KV cache, and prefill/decode disaggregation

  • Implement low-precision kernels and speculative decoding

  • Drive improvements in throughput, latency, and cost per token

  • Collaborate with hardware teams on kernels, operators, and graph optimizations

  • Own the OpenAI-compatible API surface and serving protocol

  • Build benchmarking, profiling, and regression infrastructure

What you'll need

  • BS in CS, EE, or related field, or equivalent experience

  • Software engineering experience: Rust, Go, Python, or C++

  • Understanding of concurrency, memory, and tail latency

  • Understanding of modern inference: transformers, attention, KV cache, batching, speculative decoding, quantization

  • Experience with model serving frameworks: vLLM, TGI, SGLang, TensorRT-LLM, llama.cpp, or custom runtimes

  • GPU or ASIC programming experience: CUDA, ROCm, Triton, or vendor-native toolchains

  • Experience with low-precision inference (FP8, FP4, INT4)

  • Profiling and benchmarking experience: Nsight, perf, custom harnesses

What we offer

  • Top-tier compensation structured to recognize and retain the best talent

  • Meaningful equity

  • Comprehensive medical, dental, vision, life, and disability insurance

  • Parental leave for all new parents, including adoptive and surrogate journeys

  • Flexible PTO

  • Paid holidays

  • Relocation support

Equal Employment Opportunity

We're an Equal Opportunity Employer and do not discriminate on the basis of any protected status under applicable law.
