- Employment: Full-time
- Seniority: Staff
About the role
What you will be doing:
- Act as technical lead for DevOps/Platform/Release engineering: set direction, standards, and best practices
- Architect and govern end-to-end delivery: infrastructure provisioning, configuration management, CI/CD, release processes, and operations
- Design and support Windows-based high availability solutions, with deep ownership of Windows clustering (failover/HA patterns, maintenance, upgrades, troubleshooting)
- Lead Linux automation and platform standardization (configuration, patching, hardening, performance tuning)
- Own Infrastructure as Code strategy with Terraform (modules, environments, state, governance)
- Own automation strategy with Ansible (reusable roles, inventories, secure secrets handling, idempotency)
- Build and standardize deployments using Octopus Deploy, GitHub, and Ansible (templates, shared steps, release promotion, rollback)
- Design and mature CI/CD pipelines (artifact versioning, approvals, promotion strategy, policy-as-code where applicable)
- Establish observability standards using VictoriaMetrics/Prometheus (metrics strategy, alerting, SLO/SLA monitoring, dashboards)
- Provide production leadership: incident response, RCA/postmortems, reliability improvements, capacity planning
- Mentor engineers, review designs/code, and raise overall engineering quality across teams
- Produce and maintain architecture docs, runbooks, and platform roadmaps
What you will bring to the role:
- Bachelor's degree in Computer Science or a related field
- 7+ years (or equivalent) in DevOps / SRE / Infrastructure Engineering, including leadership in complex environments
- Expert-level experience designing and operating Windows Server HA and clustering (Failover Clustering and related components)
- Strong Linux administration and automation experience (systemd, networking, storage, performance)
- Advanced skills with Terraform and Ansible (architecture, reusable components, secure operations)
- Strong deployment/release engineering experience with Octopus Deploy and GitHub (release governance, environment promotion, rollback)
- Monitoring/observability expertise with VictoriaMetrics and/or Prometheus (alerting strategy, metrics design, operational readiness)
- Production experience running Redis, RabbitMQ, Nginx (HA, tuning, troubleshooting)
- Strong understanding of networking and security fundamentals (TLS, DNS, load balancing, firewalling, least privilege)
- Proven ability to lead cross-team initiatives, make architectural decisions, and communicate clearly
- Kubernetes and container ecosystems (Docker, Helm)
- CI/CD platforms beyond GitHub (GitLab CI, Jenkins)
- Logging platforms (ELK/EFK, Loki)
- DR/BCP design, backup automation, zero-downtime upgrade strategies
- PowerShell and advanced scripting, configuration governance, secrets tooling (Vault/SOPS)
- Experience with virtualization platforms such as VMware or Hyper-V
- Experience in building, configuring, and tuning highly available MS SQL Server environments
- Experience in managing VoIP components and protocols (SIP, FreeSWITCH, OpenSIPS, session border controllers)
- Experience with load balancing components (F5 LTM, F5 GTM)
- Experience with administering AWS or Azure tenants
Diversity, Inclusion, and Equal Opportunity