Manager of Monitoring Operations

Ensono
Chennai · 22h ago

About the role

Job Description: Manager – Monitoring Operations

Role Summary

The Manager – Monitoring Operations will lead and manage the enterprise monitoring operations team responsible for the availability, performance, and reliability of IT infrastructure and applications. This role oversees the day-to-day operations of the BMC Helix On-Premises Monitoring tool deployed on RedHat OCP (OpenShift Container Platform), network and device monitoring using ParkPlace Entuity, and OS monitoring using Prometheus-Grafana, ensuring high service quality, operational excellence, and continuous improvement.

The role requires strong people management skills, deep technical expertise in systems monitoring platforms, and experience operating monitoring solutions in containerized environments.

Key Responsibilities

· Lead, mentor, and manage a team of monitoring engineers/analysts, defining goals, KPIs, shift coverage, and on-call rotations.
· Drive skill development through performance reviews, training initiatives, and continuous learning plans.
· Act as the escalation point for major monitoring incidents and outages, guiding quick workarounds to prevent monitoring gaps and loss of metrics.
· Ensure operational excellence aligned with ITIL practices (Incident, Problem, Change) and adherence to security, compliance, and operational standards.
· Manage upgrades, patches, capacity planning, and health checks across the monitoring estate to maintain high availability and performance.
· Oversee server (Windows/Linux/AIX), network, database, and synthetic URL monitoring for the enterprise and for global clients' private cloud.
· Collaborate with Container Platform, Core Infrastructure, and Network teams on platform stability, scaling, resilience, and resource allocation.
· Optimize alert quality, reduce alert fatigue, standardize dashboards/alerting frameworks, and deliver actionable insights.
· Maintain SOPs, runbooks, and operational documentation; provide regular reports on platform health, incidents, and SLA compliance.
· Serve as the primary stakeholder contact for all monitoring services.
· Conduct annual disaster-recovery (DR) tests for the monitoring estate to validate resilience, recovery procedures, and business continuity readiness.

Required Experience & Qualifications

Experience

· 10+ years of overall IT industry experience, including 5+ years in monitoring operations in medium-to-large organizations.
· Hands-on operational expertise with at least two of the following monitoring platforms/tools:
  o BMC Helix Monitoring (SaaS or On-Prem)
  o RedHat OpenShift Container Platform (OCP) or Kubernetes Cluster Management
  o Prometheus, Exporters, OTEL Collectors, and Grafana
  o ParkPlace Entuity Network and Hardware Monitoring
· Proven experience in monitoring architecture design, capacity planning, performance tuning, and integration with ITSM tools for automated ticketing workflows.
· Strong knowledge of ITIL processes and operational best practices.

Leadership & Soft Skills

· Strong people-management and leadership capabilities
· Excellent communication and stakeholder-management skills
· Ability to handle high-pressure situations and lead incident response
· Strategic mindset with a focus on operational maturity and optimization

Education & Certifications

· Bachelor's degree in Computer Science, Information Technology, or equivalent
· Relevant certifications (preferred, not mandatory):
  o RedHat OpenShift / Kubernetes
  o BMC Helix
  o Foundation certifications in ITIL and/or AI

Nice-to-Have

· Exposure to hybrid or multi-cloud environments
· Experience in automation, scripting, APIs, and AI-driven service improvements
· Application Performance Monitoring (APM) experience
