Infrastructure Engineer
orcristtechnologies
Worldwide · Remote · 1d ago
About the role
<h2>Infrastructure Engineer – Platform</h2>
<h3>Company</h3>
<p>Orcrist is building a next-generation data intelligence platform that handles petabyte-scale data with sub-second queries. Our product is a Kubernetes-based platform delivered as B2B SaaS or as a self-hosted on-prem solution, including air-gapped deployments. We enable customers across defense, law enforcement, and enterprise to turn mission-critical data into actionable intelligence. Our Platform team owns the infrastructure that powers every deployment, from the metal up.</p>
<h3>Role</h3>
<p>Kubernetes runs on something, and that something is yours. You’ll own the layer beneath our platform: bare-metal GPU servers, operating systems, networking, and storage across on-prem and fully air-gapped sites. You’ll design, build, and operate GPU server fleets and the NVIDIA software stack, then partner with our SRE and ML teams to deliver fast, reliable on-prem inference. Some of this work is hands-on at customer sites, where you’ll size, rack, and commission self-contained server environments that run with no internet uplink.</p>
<h3>What you'll do</h3>
<ul>
<li>Design, size, provision, and operate bare-metal GPU server fleets across on-prem and air-gapped environments (firmware/BIOS, BMC via Redfish/IPMI, OS, drivers) with zero-touch provisioning (PXE/iPXE, MAAS/Metal3/Tinkerbell) and automation (Ansible/Salt, Terraform/Pulumi).</li>
<li>Own the NVIDIA GPU stack end to end: drivers, CUDA, GPU Operator, Container Toolkit, MIG, and DCGM, tuned for inference throughput, latency, and utilization.</li>
<li>Build the bare-metal substrate Kubernetes runs on: node lifecycle, container runtime, GPU device plugins, node feature discovery, and kernel/NUMA tuning.</li>
<li>Engineer data-center networking and resilient storage (VLANs/switching, RDMA, Ceph/ZFS/NVMe) sized to scale without replacing the core, with encryption at rest.</li>
<li>Partner with ML and MLOps on on-prem inference serving (Triton, KServe, vLLM): model deployment, GPU scheduling and sharing, and performance tuning.</li>
<li>Plan and run on-site build-outs: rack integration, power/UPS and cooling sizing, commissioning, capacity planning, runbooks, and operator handover.</li>
</ul>
<h3>About You</h3>
<ul>
<li>5+ years in bare-metal, HPC/GPU, data-center, or systems infrastructure engineering, with hands-on ownership of physical and compute infrastructure.</li>
<li>Strong bare-metal Linux (RHEL/Rocky/Ubuntu): firmware, BMC, PXE, kernel and storage tuning, plus solid networking and storage fundamentals.</li>
<li>Real experience with the NVIDIA GPU stack (drivers, CUDA, GPU Operator, MIG, DCGM) and serving GPU models in production.</li>
<li>Comfortable operating in air-gapped or on-prem environments and traveling to customer sites for builds and deployments.</li>
<li>Documentation-focused, methodical, and calm during hardware incidents. Eligible to work in Germany.</li>
</ul>
<h3>Nice‑to‑haves</h3>
<ul>
<li>German language (B1+), NVIDIA DGX/HGX or Slurm experience, InfiniBand/RDMA fabrics, and inference optimization (TensorRT-LLM, vLLM, quantization).</li>
<li>Certifications such as NVIDIA NCP-AIO, Red Hat RHCSA/RHCE, or CKA/CKS.</li>
<li>Field-engineering experience and familiarity with secure or regulated deployment environments.</li>
</ul>
<h3>What We Offer</h3>
<ul>
<li>Modern architecture and stack.</li>
<li><strong>Remote‑first</strong> in Germany with occasional team events in Berlin.</li>
<li>Home office budget and great equipment.</li>
<li><strong>30 days vacation.</strong></li>
<li>Direct impact on critical missions across private and public‑sector customers.</li>
</ul>