Staff Platform Engineer, Manufacturing AI
Lutra
Worldwide · Remote · 4mo ago
- Employment: Full-time
- Seniority: Staff
About the role
- Infrastructure & application ownership: Design and implement scalable infrastructure architectures across on-premise (edge) and cloud environments; evolve core infrastructure platforms that support production and pre-production workflows
- Pre-production environments & validation: Build and maintain sandbox, staging, and shadow-run environments that mirror production behavior; own how systems are provisioned, isolated, tested, and validated before rollout
- Replay-based testing & safe version rollouts: Design infrastructure to support A/B playback testing of models and software versions, offline and replay-based workload testing, and shadow-mode execution prior to version switching
- Reliability engineering, fault isolation & performance determinism: Define infrastructure standards that ensure reliable, isolated systems with predictable performance under real-world workloads
- Operability & cross-team collaboration: Partner with DevOps to ensure infrastructure designs are deployable, observable, and operable; collaborate with Edge and AI teams to enable safe experimentation
Tech stack
- Operating system: Linux
- Backend: Python (Flask, FastAPI), TypeScript/Node.js
- Orchestration & compute: Kubernetes, on-prem bare metal, VMs
- Containers: Docker
- Monitoring, observability & logging: Prometheus, Grafana, ELK
- Cloud providers: AWS, Azure, GCP
- Databases & storage: SQL, InfluxDB, MongoDB
- Messaging & IoT: MQTT, HTTP/REST, RabbitMQ, Apache Kafka
- Edge platforms: NVIDIA Jetson, Raspberry Pi (ARM)
- GPU/acceleration: CUDA, TensorRT, ONNX, OpenVINO
- ML/DL frameworks: PyTorch, TensorFlow, Keras, scikit-learn
- Scientific computing: NumPy, Pandas
- Computer vision: OpenCV
- Cameras & vision I/O: GenICam, GigE Vision, USB3 Vision
- Industrial automation: PLC integration; protocols: EtherNet/IP, Modbus, PROFINET, OPC UA
Qualifications
- You have significant experience supporting the design and implementation of scaled production environments in hybrid (edge-cloud) or on-prem environments
- You have strong Linux systems knowledge and experience building and operating underlying compute platforms
- You have significant experience with infrastructure orchestration platforms (Kubernetes/K8s preferred) and/or virtualization platforms
- You are experienced with monitoring, observability and alerting stacks and best practices
- You are comfortable with distributed systems and understand their failure modes
- You have enough software engineering skills to be dangerous, and specific command of Python for infrastructure automation and validation tooling
- You have experience collaborating effectively within and across cross-functional delivery teams
- You are a contagiously curious person with entrenched learning habits
- You have experience designing and operating scaled production environments for manufacturing, robotics, IoT and/or industrial automation applications
- You have deep expertise in computer vision, robotics, or manufacturing automation
- You have experience supporting GPU-based or real-time workloads
- You are predisposed to mentorship and crafting a culture of continuous improvement
- You have experience scaling an AI and/or B2B SaaS venture