Principal SRE

Intermedia Intelligent Communications

Worldwide · Remote · 1w ago

Employment: Full-time
Seniority: Staff

About the role

What you will be doing:

Partner with Engineering teams to design resilient services, architectures, and deployment patterns.
Define and promote SRE practices including SLIs, SLOs, error budgets, capacity planning, incident response, and post-incident learning.
Identify systemic reliability risks and work with teams to address root causes.
Help reduce operational toil through automation, tooling, and better engineering practices.

Work actively with Engineering teams during design, development, and production-readiness reviews.
Advise and challenge teams on service architecture, fault tolerance, scalability, observability, deployment safety, and operational readiness, helping them to make pragmatic trade-offs.
Support teams in diagnosing complex performance, latency, throughput, and resource-utilisation issues.
Help establish engineering standards and reusable patterns for reliable, maintainable services.

Lead investigations into performance bottlenecks across applications, infrastructure, databases, queues, networks, and third-party dependencies.
Improve observability through metrics, logs, traces, dashboards, alerting, and service-level indicators.
Help teams design meaningful alerts that identify user-impacting issues while reducing noise.
Drive capacity planning and load-testing practices for critical systems.

Build and improve automation, deployment tooling, infrastructure-as-code, monitoring, and reliability platforms.
Contribute to CI/CD improvements, release safety, rollback strategies, and progressive delivery practices.
Develop tools that help Engineering teams self-serve reliability, diagnostics, and operational insights.
Improve cloud, container, and orchestration environments with a focus on security, reliability, and scalability.

Participate in incident response for high-priority production issues.
Lead or contribute to blameless post-incident reviews.
Ensure actions from incidents result in improvements to architecture, tooling, monitoring, or process.
Mentor engineers on production ownership and operational best practices.

What you will bring to the role:

Experience in Site Reliability Engineering or senior backend/software engineering roles.
Software engineering background, with the ability to write clean, maintainable production code.
Experience working with Engineering teams to influence architecture and improve production readiness.
Understanding of distributed systems, scalability, resiliency patterns, failure modes, and performance engineering.
Experience diagnosing complex production issues across application and infrastructure layers.
Hands-on experience with cloud platforms such as AWS, Azure, or GCP.
Hands-on experience with on-premise environments and virtualization.
Experience with containers and orchestration technologies; Kubernetes is a must.
Knowledge of observability tooling, including metrics, logging, tracing, dashboards, and alerting.
Experience with infrastructure-as-code tools such as Terraform.
Experience with CI/CD pipelines and safe deployment practices.
Strong scripting or programming skills in languages such as Python, Go, Java, C#, JavaScript/TypeScript, or similar.
Clear, structured communication skills, with the ability to explain complex technical issues to engineering and leadership audiences.

Diversity, Inclusion, and Equal Opportunity
