AI Engineer

intandem

Worldwide | $100k–135k | Remote | 2d ago
Employment: Permanent Full Time

About the role

What you will accomplish:

  • Run the inference serving layer on our own GPU hardware: choose and tune the serving stack (vLLM, SGLang, TensorRT-LLM) for high throughput and low latency. 
  • Optimize aggressively: tensor parallelism, quantization (FP8, AWQ, GPTQ), KV-cache and prefix caching, continuous batching, speculative decoding, concurrency tuning. 
  • Serve multiple models and features off shared hardware: multi-LoRA, routing, and request scheduling that balances internal workloads against latency-sensitive product traffic. 
  • Make our AI workloads efficient: improve latency, throughput, and GPU utilization so we get the most out of what we run. 
  • Build the visibility: instrument performance and usage across our AI surfaces so there's clear data on how everything is running. 
  • Surface the technical tradeoffs (performance, latency, efficiency) so the people making the calls have what they need to make them. 
  • Ship the in-app agent layer that helps families coordinate: proactive nudges, smart suggestions, agents that summarize, draft, schedule, and act for busy parents. 
  • Build the substrate underneath: tools, memory, orchestration, guardrails, and evaluation harnesses, integrated cleanly with production APIs alongside our architecture team. 
  • Work in nimble pairs with feature owners, standing up whatever's needed to test an idea, including a vibe-coded UI when that's the fastest path to a real customer. Ship rough, learn fast, harden what works. 

Who you are:

  • Technical and hands-on with infrastructure: you like running real systems on real hardware and keeping them fast and reliable. 
  • A full-stack builder who wants the app layer too: you don't want to be boxed into infra. When a feature needs shipping, you want to pick it up and ship it, not just hand it off. 
  • Performance-minded: you treat latency, throughput, and efficiency as things to engineer deliberately. 
  • Rapid-prototyping and AI-first, with modern tooling (Claude Code, agent SDKs) part of your craft. 
  • Motivated by work that matters. Families rely on these products during real moments in their lives. 

What you bring:

  • 5+ years shipping production software, including meaningful applied AI or ML work. 
  • Demonstrated experience running and optimizing self-hosted LLMs on dedicated multi-GPU hardware: a serving stack (vLLM, SGLang, or TensorRT-LLM) and the optimization that comes with it (tensor parallelism, quantization, batching, KV cache). 
  • A track record of optimizing inference performance and efficiency (latency, throughput, GPU utilization). 
  • Strong Python and engineering fundamentals, with the full-stack range to stand up a quick UI and a genuine desire to work on app-layer features, not only infra. 
  • Hands-on with agent frameworks (Claude Agent SDK, LangGraph, or similar), LLM APIs, embeddings, and RAG. 
  • Comfortable with AWS and the devops this role owns: Docker, CI/CD, monitoring, and observability. 
  • Experience building internal tooling or platforms others depend on. Bonus for Slack apps, MCP, or agent orchestration at team scale. 

Perks & benefits

  • Medical: In Tandem pays 100% of the premium for employees AND 99% for all additional family members
  • 401k: Up to a 4% match with immediate vesting
  • Paid leave for all new parents
  • Learning & Development stipend for employees
  • Paid Time Off: 11 Holidays + Winter Break (3 Days) + Volunteer Time Off (1 Day) + Floating Holiday (1 Day)
  • Personal Time Off: 15 days for 0-1 years of employment, 20 days for 1-3 years of employment
  • Supportive and flexible working environment – work from anywhere!
