
Senior DevOps Engineer, AI & Applications

Firmus Technologies

Melbourne · 2d ago
Seniority
Senior

About the role

<p><strong>Firmus Technologies</strong></p>
<p>Firmus Technologies is a global leader pioneering the development and operation of efficient AI infrastructure across Asia Pacific.</p>
<p>Firmus was founded in Australia in 2019. Our mission is to create the most efficient AI infrastructure by combining cutting-edge technology with a steadfast commitment to sustainability.</p>
<p>At Firmus, we are unique in our approach. We design, build, and operate a new class of digital infrastructure: the AI Factory. Through our model-to-grid technology approach, we have pushed the boundaries of multi-generational liquid cooling systems, energy management, AI software orchestration, and construction. For our customers, this approach allows us to make every watt count and deliver low-cost AI tokens globally.</p>
<p><strong>Firmus AI Cloud</strong></p>
<p>Our large-scale GPU cloud platform, Firmus AI Cloud, is purpose-built to deliver energy-efficient AI compute at scale to customers.</p>
<p>It empowers developers, enterprises, educational institutions, and government users to train and deploy AI models with unmatched efficiency and cost savings. With an ever-growing suite of services and applications, we are committed to delivering a cloud experience that is market-leading, proprietary, and built to scale.</p>
<p><strong>Role Summary</strong></p>
<p>Every AI feature we ship touches thousands of GPUs. The Senior DevOps Engineer will build the release engineering backbone (CI/CD pipelines, automated testing gates, and one-click deployments with instant rollback) that lets Firmus scale fast and responsibly.</p>
<p>You're the bridge between engineering and operations: setting Firmus standards for how code gets to production, mentoring the team on deployment safety, and driving a blameless culture when things go wrong. Ship safely. Ship often. Ship at scale.</p>
<p><strong>Key Responsibilities</strong></p>
<ul>
<li>Design and maintain team-wide CI/CD pipelines (Jenkins, GitHub Actions, ArgoCD, or equivalent) with automated testing gates, artifact management, and deployments aligned with GPU cluster standards.</li>
<li>Implement release engineering best practices: repeatable releases, GitOps workflows, automated rollback, and change management procedures.</li>
<li>Build and manage test infrastructure: environment provisioning, data seeding, and long-running job validation (especially for distributed training templates and multi-node job submissions).</li>
<li>Establish engineering protocols and standards: repo organization, PR templates, code quality gates, dependency scanning, and static analysis.</li>
<li>Partner with infra teams to ensure that deployment practices for AI product features meet compliance and security standards for massive GPU clusters.</li>
<li>Mentor the team on testing strategies, deployment safety, and incident response procedures.</li>
</ul>
<p><strong>Skills &amp; Experience</strong></p>
<ul>
<li>5–7 years of CI/CD engineering, release engineering, or DevOps experience.</li>
<li>Deep expertise in GitHub Actions, GitLab CI, ArgoCD, or Jenkins, including multi-stage pipeline design and testing gate implementation.</li>
<li>Strong automation scripting (Python, Go, or Bash) for build orchestration and environment templating.</li>
<li>Strong Kubernetes fundamentals (hands-on): deep understanding of Pod lifecycle and failure modes (Pending/Running/CrashLoopBackOff/Evicted), Deployments/ReplicaSets, Jobs/CronJobs, Services/Ingress, and how these primitives behave under load and during rollouts.</li>
<li>Config &amp; secret management: practical experience designing and operating ConfigMaps and Secrets (including secret rotation patterns), with strong hygiene around least privilege, auditability, and preventing credential leakage into logs/artifacts.</li>
<li>Safe rollout patterns: proven experience implementing and operating safe rollout strategies (rolling updates, canary, blue/green), readiness/liveness/startup probes, PodDisruptionBudgets, and rollback procedures, ensuring zero- or low-downtime deployments for customer-facing services.</li>
<li>Deployment safety &amp; debugging: ability to debug common Kubernetes rollout issues end-to-end (bad probes, misconfigured resources/limits, image pull failures, secret/config drift, node pressure/evictions) and convert learnings into automated CI/CD gates and runbooks.</li>
<li>Familiarity with artifact management, versioning strategies, and rollback procedures.</li>
<li>Experience integrating testing frameworks into CI pipelines (unit, integration, end-to-end).</li>
</ul>
<p><strong>Success Metrics</strong></p>
<ul>
<li>Engineering Velocity &amp; Time-to-Release improves quarter-over-quarter while release standards remain consistent (gates, tests, approvals, auditability).</li>
<li>Platform Reliability &amp; Customer Trust remains strong: release-related incidents are rare and recovery is fast; reliability targets are met without "surprise outages."</li>
<li>Developer Productivity &amp; Team Scale improves: engineers spend less time fighting CI/CD and more time shipping as the team grows.</li>
<li>Cost Efficiency &amp; Resource Optimization improves: CI/CD and test infrastructure costs stay controlled (or decrease per unit of output) as usage scales.</li>
<li>Knowledge &amp; Culture Multiplier effect is visible: release/reliability practices become the default across the org, and repeat incident classes decline.</li>
</ul>
<p><strong>Location &amp; Reporting</strong></p>
<ul>
<li>Melbourne, Australia</li>
<li>Reporting to the Head of AI &amp; Applications</li>
</ul>
<p><strong>Employment Basis</strong></p>
<p>Full-time</p>
<p><strong>Diversity</strong></p>
<p>At Firmus, we are committed to building a diverse and inclusive workplace. We encourage applications from candidates of all backgrounds who are passionate about creating a more sustainable future through innovative engineering solutions.</p>
<p>Join us in our mission to revolutionize the AI industry through sustainable practices and cutting-edge engineering. Apply now to be part of shaping the future of sustainable AI infrastructure.</p>
