Application Operations Problem Manager

Reward Gateway
Plovdiv | €28k–33k | Hybrid | Posted 1 month ago
Employment: Full-time

About the role

  • Annual Wellness Bonus 
  • Monthly Edenred Electronic Food Voucher
  • Udemy: Access for your professional development  
  • Flexible Holiday plan & other leave benefits
  • Book Benefit: Professional development books and an additional annual budget for fiction books of your choice 
  • Subsidised sports card and many other benefits!    

What You’ll be Doing:

  • Own the end-to-end Problem Management lifecycle in line with ITIL best practice: problem detection, logging, categorisation, prioritisation, investigation, resolution, and closure 
  • Maintain and govern the Problem Record backlog in Jira Service Management, ensuring all records are accurate, prioritised, and progressing toward resolution 
  • Define and enforce the standards for problem identification, including criteria for reactive problem management (post-incident) and proactive problem management (trend analysis and risk identification) 
  • Manage the Known Error Database (KEDB), ensuring it is current, accurate, and actively used by L1/L2 support teams to improve first-contact resolution 
  • Lead and facilitate structured RCA sessions following major and recurring incidents, using recognised methodologies (e.g. 5 Whys, Fishbone/Ishikawa, fault tree analysis) 
  • Produce high-quality Problem Records and RCA reports that clearly articulate the root cause, contributing factors, timeline, and recommended corrective/preventative actions 
  • Ensure RCA outputs translate into tracked, accountable action plans with clear owners, timelines, and success criteria 
  • Challenge superficial root cause findings and push for systemic, durable fixes rather than symptomatic workarounds 
  • Analyse incident, change, and event data to proactively identify trends, recurring issues, and systemic risks before they become major incidents 
  • Collaborate with Observability and Platform teams to use monitoring signals, error budgets, and SLO breach data as early-warning inputs to the problem management process 
  • Contribute to the shift-left support agenda by feeding problem findings into runbooks, playbooks, and operability improvements 
  • Communicate problem status, known errors, and risk exposure clearly to technical and non-technical stakeholders, including engineering leads and senior management 
  • Produce regular problem management reporting, including metrics such as: number of open problems by age/severity, incident recurrence rate, time to root cause, and percentage of problems with preventative actions closed on time 
  • Present insights and trends to the Director of Application Operations and wider PETO leadership to inform prioritisation decisions and continuous improvement initiatives 
  • Work closely with Incident Management to ensure seamless handoff from major incidents into the problem management process 
  • Partner with L2.5/L3 engineering teams to coordinate investigation effort, agree timelines, and remove blockers to root cause resolution 
  • Integrate problem management activity into the Service Catalogue and Jira Service Management workflows, ensuring service ownership and escalation paths are respected 
  • Contribute to Change Management processes by ensuring known problems and risks are visible to change approvers, reducing the risk of change-induced incidents 
  • Continuously assess and improve the Problem Management process itself, maturing capability over time and aligning with evolving ITIL and organisational standards 
  • Build and maintain problem management documentation, templates, and guidance to enable consistent, high-quality practice across the PETO organisation 
  • Support the development of L2 team capability in recognising and logging potential problems, contributing to the team's progression toward greater autonomy 

Experience and Skills You Need in this Role:

  • Solid, demonstrable experience in an ITIL-aligned Problem Management role, ideally within a fast-paced, product-led technology organisation 
  • Strong working knowledge of ITIL Problem Management practices (ITIL 4 Foundation certification or above preferred), including the distinction between reactive and proactive problem management and the role of the KEDB 
  • Hands-on experience facilitating RCA sessions using structured methodologies (5 Whys, Fishbone, fault tree analysis, etc.) and translating findings into actionable improvement plans 
  • Experience working with Jira Service Management or a comparable ITSM platform to manage problem records, workflows, and reporting 
  • Ability to analyse incident and operational data to identify trends and systemic issues, with experience using dashboards or reporting tools to communicate findings 
  • Strong written and verbal communication skills, with the ability to produce clear RCA reports and updates for both technical audiences and senior non-technical stakeholders 
  • Collaborative working style with experience engaging engineering, infrastructure, and operations teams in problem investigation and resolution 
  • Familiarity with Agile ways of working and the ability to integrate ITIL practices within a modern, product-centric engineering environment 
  • Experience with observability and monitoring tooling (e.g. Datadog, Grafana, PagerDuty) as inputs to proactive problem management 
  • Understanding of SLOs, error budgets, and their relationship to operational risk and problem prioritisation 
  • Experience contributing to or maintaining a knowledge base (e.g. Confluence), including runbooks and known error documentation 
  • Exposure to cloud-native application architectures and API-first platforms 
  • ITIL 4 Specialist or Practitioner certification in relevant practices (e.g. Problem Management, Incident Management) 
  • Experience with operational metrics and reporting frameworks, including DORA metrics or similar 

The Interview Process:

  • Screening call with Talent Acquisition Partner
  • First Stage Interview with the Director of Application Operations & the VP Platform Engineering 
