Senior Software Engineer, Network Platform

moonlite

United States · Remote · 4w ago
Seniority
Senior

About the role

<p>Moonlite delivers high-performance AI infrastructure for organizations running intensive computational research, large-scale model training, and demanding data processing workloads. We provide infrastructure deployed in our facilities or co-located in yours, delivering flexible on-demand or reserved compute that feels like an extension of your existing data center. Our team of AI infrastructure specialists combines bare-metal performance with cloud-native operational simplicity, enabling research teams and enterprises to deploy demanding AI workloads with enterprise-grade reliability and compliance.</p> <h2>Your Role:</h2> <p>You will play a foundational role in building our software-defined networking (SDN) platform that enables high-performance, isolated networking for distributed computing, model training, inference, and data-intensive workloads. Working closely with our network, infrastructure, and product teams, you’ll design and implement the network orchestration and provisioning systems that manage DPU-accelerated networking, tenant isolation, and network lifecycle management – enabling researchers and engineers to access enterprise-grade networking with cloud-like simplicity.</p> <h2>Job Responsibilities:</h2> <ul> <li><strong>Software-Defined Networking Architecture:</strong> Collaborate with the infrastructure team to design and build scalable SDN orchestration systems leveraging NVIDIA BlueField-3 DPUs to deliver programmable, high-performance networking for AI workloads with hardware-accelerated forwarding and isolation.</li> <li><strong>Research Cluster Networking:</strong> Design and implement networking systems for research computing environments including Kubernetes and SLURM clusters, enabling high-performance connectivity, optimized network topology for distributed workloads, and seamless integration with cluster orchestration systems.</li> <li><strong>Network Provisioning &amp; Lifecycle Management:</strong> Implement automated SDN provisioning systems that 
handle VPC creation, subnet allocation, routing configuration, and network resource lifecycle from deployment through decommissioning.</li> <li><strong>DPU Platform Engineering:</strong> Develop platform capabilities for managing BlueField-3 DPUs including SR-IOV virtual function management, OVS offload configuration, network function deployment, and integration with compute orchestration systems.</li> <li><strong>Multi-Tenancy &amp; Network Isolation:</strong> Build enterprise-grade network isolation using VPCs, VXLAN, and hardware-accelerated forwarding to ensure complete tenant separation while maintaining high-performance connectivity for GPU clusters and distributed workloads.</li> <li><strong>High-Performance Networking:</strong> Collaborate with the infrastructure team to optimize network paths for RDMA, RoCE, and GPU-to-GPU communication, ensuring minimal latency and maximum throughput for distributed training and large-scale computational workloads.</li> <li><strong>Network APIs &amp; Integration:</strong> Develop robust APIs and SDKs for network resource management that integrate seamlessly with compute and storage platforms, enabling programmatic network provisioning and configuration.</li> <li><strong>Network Observability:</strong> Implement comprehensive network monitoring, telemetry, and troubleshooting systems that provide visibility into network performance, utilization, and tenant traffic patterns.</li> <li><strong>Security &amp; Policy Management:</strong> Build platform network security features including security groups, firewall rules, and policy enforcement that protect tenant workloads while enabling flexible network configuration.</li> </ul> <h2>Requirements:</h2> <ul> <li><strong>Experience:</strong> 5+ years in software engineering with proven experience building network platforms, SDN systems, or network automation for production environments.</li> <li class="whitespace-normal break-words"><strong>Kubernetes Networking &amp; Container 
Orchestration:</strong> Strong familiarity with Kubernetes networking architecture, CNI plugins, service networking, and network policies. Understanding of pod networking, services, ingress, and how Kubernetes manages network resources.</li> <li><strong>Networking Expertise:</strong> Deep understanding of networking fundamentals including TCP/IP, VLANs, VXLAN, BGP, OSPF, routing protocols, and data center network architectures.</li> <li><strong>Software-Defined Networking:</strong> Background in SDN concepts, network virtualization, overlay networks, and programmable networking technologies.</li> <li><strong>Programming Skills:</strong> Experience with Go and Python for performance-critical networking components and services is highly valued.</li> <li><strong>Linux Networking:</strong> Strong experience with the Linux networking stack, including network namespaces, iptables/nftables, Open vSwitch, and kernel networking systems.</li> <li><strong>DPU &amp; SmartNIC Experience:</strong> Familiarity with DPU/SmartNIC architectures (BlueField or similar), SR-IOV, hardware offload capabilities, and programmable networking hardware – or strong ability to learn quickly.</li> <li><strong>High-Performance Networking:</strong> Understanding of RDMA, RoCE, InfiniBand, and low-latency networking requirements for distributed computing and GPU workloads.</li> <li><strong>Problem-Solving &amp; Architecture:</strong> Demonstrated ability to solve complex networking performance and scalability challenges while balancing pragmatic shipping with good long-term architecture.</li> <li><strong>Autonomy &amp; Communication:</strong> Comfortable navigating ambiguity, defining requirements collaboratively, and communicating technical decisions through clear documentation.</li> <li><strong>Commitment to Growth:</strong> Growth mindset with continuous focus on learning and professional development.</li> </ul> <h2>Preferred Qualifications</h2> <ul> <li>Background in provisioning or managing networking for research 
computing environments (Kubernetes, SLURM, or HPC clusters)</li> <li>Experience with NVIDIA BlueField DPU programming and the DOCA framework</li> <li>Background with network function virtualization (NFV) and service function chaining</li> <li>Knowledge of Kubernetes networking (CNI plugins, network policies, service mesh)</li> <li>Experience building network control planes or SDN controllers</li> <li>Familiarity with network automation frameworks and infrastructure-as-code for networking</li> <li>Understanding of data center fabric architectures (spine-leaf, Clos topologies)</li> <li>Experience with network security and compliance requirements in regulated industries</li> <li>Background building networking for research institutions, HPC environments, or cloud providers</li> </ul> <h2>Key Technologies</h2> <ul> <li>Go, Python, NVIDIA BlueField DPUs, Open vSwitch, VXLAN, SR-IOV, RDMA, RoCE, InfiniBand, BGP, Linux networking, Terraform, FastAPI, gRPC</li> </ul> <h2>Why Moonlite</h2> <ul> <li><strong>Build Next-Generation Infrastructure:</strong> Your work will create the platform foundation that enables financial institutions to harness AI capabilities previously impossible with traditional infrastructure.</li> <li><strong>Hands-On Ownership:</strong> As an early engineer, you’ll have end-to-end ownership of projects and the autonomy to influence our product and technology direction.</li> <li><strong>Shape Industry Standards:</strong> Contribute to defining how enterprise AI infrastructure should work for the most demanding regulated environments.</li> <li><strong>Collaborate with Experts:</strong> Work alongside seasoned engineers and industry professionals passionate about high-performance computing, innovation, and problem-solving.</li> <li><strong>Start-Up Agility with Industry Impact:</strong> Enjoy the dynamic, fast-paced environment of a startup while making an immediate impact in an evolving and critical technology space.</li> </ul> <p>We offer a competitive total 
compensation package combining base salary, startup equity, and industry-leading benefits. The total compensation range for this role is $165,000 – $225,000, which includes both base salary and equity. Actual compensation will be determined based on experience, skills, and market alignment. We provide generous benefits, including a 6% 401(k) match, fully covered health insurance premiums, and other comprehensive offerings to support your well-being and success as we grow together.</p>

Perks & benefits

  • 401k
  • Medical Insurance
  • Pension Matching
  • Equity Compensation
