
(Senior) Infrastructure Engineer

nscaleoperationsukltd

EMEA; London; UK · 3d ago
Seniority
Senior

About the role

<h2>About Nscale</h2> <p>Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.</p> <p>We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.</p> <h2>About the Role</h2> <p>We’re hiring an <strong><strong class="textBold">Infrastructure Engineer</strong></strong> to design, implement, operate, and continuously improve the infrastructure platforms that support both internal and customer-facing services at Nscale.</p> <p>This role sits within the <strong><strong class="textBold">Operational Engineering team</strong></strong> in <strong><strong class="textBold">Engineering</strong></strong>, where you’ll work across the infrastructure stack below the hypervisor with a focus on <strong>Linux,</strong> <strong><strong class="textBold">OpenStack, storage systems, Proxmox, DNS, DHCP, and infrastructure automation</strong></strong>. You’ll collaborate closely with internal teams to ensure infrastructure meets performance, availability, and security requirements, while also serving as a <strong><strong class="textBold">3rd/4th line escalation point</strong></strong> for complex issues.</p> <p>Your work will directly support the reliability, scalability, automation, and security of the platforms that power Nscale’s GPU cloud. 
This is a high-impact role for someone who wants to shape core infrastructure, improve operational excellence, and bring deep technical expertise to both delivery and ongoing evolution of critical systems.</p> <h2>What you'll be doing</h2> <p><strong><strong class="textBold">Infrastructure Design &amp; Operations</strong></strong></p> <ul> <li value="2"><strong><strong class="textBold">Implement</strong></strong> infrastructure components that underpin internal and customer-facing services.</li> <li value="3"><strong><strong class="textBold">Operate</strong></strong> critical infrastructure layers below the hypervisor with a focus on stability and performance.</li> <li value="4"><strong><strong class="textBold">Maintain</strong></strong> essential services such as <strong><strong class="textBold">DNS, DHCP</strong></strong>, and configuration management tooling.</li> <li><strong><strong class="textBold">Design</strong></strong> scalable and resilient infrastructure platforms across <strong><strong class="textBold">OpenStack, Proxmox, Ceph</strong></strong>, and core supporting services.</li> </ul> <p><strong><strong class="textBold">Automation &amp; Continuous Improvement</strong></strong></p> <ul> <li value="1"><strong><strong class="textBold">Improve</strong></strong> automation for provisioning, monitoring, patching, and recovery.</li> <li value="2"><strong><strong class="textBold">Use</strong></strong> infrastructure-as-code and configuration management tools to standardise operations.</li> <li value="3"><strong><strong class="textBold">Drive</strong></strong> continuous improvement across infrastructure reliability, scalability, and operational efficiency.</li> <li value="4"><strong><strong class="textBold">Support</strong></strong> repeatable and maintainable platform operations through automation-first approaches.</li> </ul> <p><strong><strong class="textBold">Incident Management &amp; Escalation</strong></strong></p> <ul> <li value="1"><strong><strong 
class="textBold">Act</strong></strong> as a <strong><strong class="textBold">3rd/4th line escalation point</strong></strong> for complex infrastructure issues.</li> <li value="2"><strong><strong class="textBold">Partner</strong></strong> with support teams to resolve incidents and restore services effectively.</li> <li value="3"><strong><strong class="textBold">Investigate</strong></strong> root causes of infrastructure problems and contribute to long-term fixes.</li> <li value="4"><strong><strong class="textBold">Participate</strong></strong> in <strong><strong class="textBold">on-call rotations</strong></strong> and incident response activities for critical infrastructure.</li> </ul> <p><strong><strong class="textBold">Cross-Functional Collaboration &amp; Technical Guidance</strong></strong></p> <ul> <li value="1"><strong><strong class="textBold">Collaborate</strong></strong> with internal teams to ensure solutions meet <strong><strong class="textBold">performance, availability, and security</strong></strong> requirements.</li> <li value="2"><strong><strong class="textBold">Contribute</strong></strong> to infrastructure roadmap planning, including <strong><strong class="textBold">capacity management</strong></strong> and <strong><strong class="textBold">performance tuning</strong></strong>.</li> <li value="3"><strong><strong class="textBold">Introduce</strong></strong> new technologies that strengthen the infrastructure stack over time.</li> <li value="4"><strong><strong class="textBold">Provide</strong></strong> technical expertise to <strong><strong class="textBold">pre-sales</strong></strong> and other groups on infrastructure capabilities and best practices.</li> </ul> <p><strong><strong class="textBold">Standards, Security &amp; Compliance</strong></strong></p> <ul> <li value="1"><strong><strong class="textBold">Ensure</strong></strong> infrastructure platforms adhere to compliance, security, and operational standards.</li> <li value="2"><strong><strong 
class="textBold">Apply</strong></strong> best practices to the operation and evolution of infrastructure services.</li> <li value="3"><strong><strong class="textBold">Support</strong></strong> secure and well-governed platform delivery across the environments you own.</li> </ul> <h2>KPIs</h2> <ul> <li value="1"><strong><strong class="textBold">Infrastructure availability and resilience</strong></strong></li> <li value="2"><strong><strong class="textBold">Automation coverage for provisioning, patching, monitoring, and recovery</strong></strong></li> <li value="3"><strong><strong class="textBold">Complex incident resolution and root cause remediation</strong></strong></li> <li value="4"><strong><strong class="textBold">Capacity management and performance tuning effectiveness</strong></strong></li> </ul> <h2>About You</h2> <ul> <li value="1"><strong><strong class="textBold">Strong Python and Bash</strong></strong> skills</li> <li value="4"><strong><strong class="textBold">Strong troubleshooting experience</strong></strong> with <strong><strong class="textBold">Linux</strong></strong> and services running on Linux</li> <li value="5">Experience working with <strong><strong class="textBold">Ceph</strong></strong> and core infrastructure services</li> <li><strong><strong class="textBold">Nice to have experience</strong></strong> deploying, managing, upgrading, and operating large <strong><strong class="textBold">OpenStack</strong></strong> clusters</li> <li><strong><strong class="textBold">Experience</strong></strong> deploying, managing, and automating <strong><strong class="textBold">Proxmox</strong></strong></li> <li value="6">Knowledge of <strong><strong class="textBold">DNS, DHCP,</strong></strong> and configuration management in production environments</li> <li value="7">Ability to operate and improve infrastructure with a focus on <strong><strong class="textBold">availability, scalability, automation, and security</strong></strong></li> <li value="8">Experience 
handling complex infrastructure issues in an escalation capacity</li> <li value="9">Ability to work effectively with internal teams and provide technical input across the organisation</li> <li value="10"><strong><strong class="textBold">Nice to have knowledge of Ironic</strong></strong> and <strong><strong class="textBold">Neutron/OVN/OVS</strong></strong></li> </ul> <h2>What we can offer you</h2> <p>At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.</p> <p>Highly competitive compensation package (base + bonus + equity), with performance reviews every 12 months. 🚀</p> <p>Join one of the fastest-growing AI infrastructure companies: your chance to directly shape how global AI capacity is planned and deployed. ✨</p> <p>Expect a dynamic progression plan tailored to your ambitions. Grow by leading critical cross-functional initiatives and shaping capital strategy, always with our full support.</p> <p>Human-First Flexibility: We treat you as humans first. 🫶🏽 Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.</p> <h2>Equal Opportunities Statement</h2> <p>We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.</p> <p>If there’s anything we can do to accommodate your specific situation, please let us know.</p> <p>The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. 
The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.</p><div class="content-conclusion"><p><em>For information on how Nscale handles candidate personal data, please see our Employee &amp; Candidate Privacy Notice:&nbsp;<a href="https://drive.google.com/file/d/1QK5Yg04WHD9K9IAtJgQWubJZC9oLvatK/view?usp=sharing" target="_blank" data-saferedirecturl="https://www.google.com/url?q=https://drive.google.com/file/d/1QK5Yg04WHD9K9IAtJgQWubJZC9oLvatK/view?usp%3Dsharing&amp;source=gmail&amp;ust=1765375172804000&amp;usg=AOvVaw2Ncte4rmlGl8OKuFuDgDtx">Here.</a></em></p></div>

Perks & benefits

  • Equity Compensation
