AI Engineer

intandem

Worldwide | $100k–135k | Remote | 2d ago

Employment: Permanent Full Time

About the role

What you will accomplish:

Run the inference serving layer on our own GPU hardware: choose and tune the serving stack (vLLM, SGLang, TensorRT-LLM) for high throughput and low latency.
Optimize aggressively: tensor parallelism, quantization (FP8, AWQ, GPTQ), KV-cache and prefix caching, continuous batching, speculative decoding, concurrency tuning.
Serve multiple models and features off shared hardware: multi-LoRA, routing, and request scheduling that balances internal workloads against latency-sensitive product traffic.

Make our AI workloads efficient: improve latency, throughput, and GPU utilization so we get the most out of what we run.
Build the visibility: instrument performance and usage across our AI surfaces so there's clear data on how everything is running.
Surface the technical tradeoffs (performance, latency, efficiency) so the people making the calls have what they need to make them.

Ship the in-app agent layer that helps families coordinate: proactive nudges, smart suggestions, agents that summarize, draft, schedule, and act for busy parents.
Build the substrate underneath: tools, memory, orchestration, guardrails, and evaluation harnesses, integrated cleanly with production APIs alongside our architecture team.
Work in nimble pairs with feature owners, standing up whatever's needed to test an idea, including a vibe-coded UI when that's the fastest path to a real customer. Ship rough, learn fast, harden what works.

Who you are:

Technical and hands-on with infrastructure: you like running real systems on real hardware and keeping them fast and reliable.
A full-stack builder who wants the app layer too: you don't want to be boxed into infra. When a feature needs shipping, you want to pick it up and ship it, not just hand it off.
Performance-minded: you treat latency, throughput, and efficiency as things to engineer deliberately.
Rapid-prototyping and AI-first, with modern tooling (Claude Code, agent SDKs) part of your craft.
Motivated by work that matters. Families rely on these products during real moments in their lives.

What you bring:

5+ years shipping production software, including meaningful applied AI or ML work.
Demonstrated experience running and optimizing self-hosted LLMs on dedicated multi-GPU hardware: a serving stack (vLLM, SGLang, or TensorRT-LLM) and the optimization that comes with it (tensor parallelism, quantization, batching, KV cache).
A track record of optimizing inference performance and efficiency (latency, throughput, GPU utilization).
Strong Python and engineering fundamentals, with the full-stack range to stand up a quick UI and a genuine desire to work on app-layer features, not only infra.
Hands-on with agent frameworks (Claude Agent SDK, LangGraph, or similar), LLM APIs, embeddings, and RAG.
Comfortable with AWS and the devops this role owns: Docker, CI/CD, monitoring, and observability.
Experience building internal tooling or platforms others depend on. Bonus for Slack apps, MCP, or agent orchestration at team scale.

Perks & benefits:

Medical: In Tandem pays 100% of the premium for employees and 99% for all additional family members
401k: Up to a 4% match with immediate vesting
Paid leave for all new parents
Learning & Development stipend for employees
Paid Time Off: 11 Holidays + Winter Break (3 Days) + Volunteer Time Off (1 Day) + Floating Holiday (1 Day)
Personal Time Off: 15 days for 0-1 years of employment, 20 days for 1-3 years of employment
Supportive and flexible working environment – work from anywhere!
