ITIL problem management is an IT service management (ITSM) process that can take your overall service levels from good to great. Why? Because it gets to the root cause of business-affecting IT issues, identifies solutions, and works with other teams – as required – to ensure those solutions are delivered quickly and safely.
In today’s dynamic, technology-dependent business environment, a problem management process is a must-have if you want your organization to consistently deliver high levels of both IT service availability and performance.
To help get you started, this blog offers the first five of ten simple, practical tips to help you run and maintain an effective problem management capability within your organization.
Tip 1: Ensure That Your Initial Scope is Realistic
Start with a limited scope for problem management, and look for some “quick wins”: the one or two high-pain, high-visibility IT issues that have had an adverse business impact. Use problem management to address those first.
Monitor the progress of these quick wins in your early adoption days, and use the results to recognize and demonstrate the success and value of problem management to key business stakeholders.
When setting your scope, also consider who can raise/request problem records. Obviously, you don’t want to create bottlenecks, but having too many people creating problem records could lead to both duplication of effort and rework. As with your scope, start small and expand over time until you, and those around you, are comfortable with the process (for raising/requesting problem records).
Tip 2: Design an Easy-to-Use Problem Form
Problem management records focus on establishing the root cause and the actions needed to prevent recurrence, so design your problem form to make it easy to capture the right information quickly, particularly when under pressure.
Potential things to include are:
- Description of the issue (both a high-level summary for senior management plus the details for support teams)
- The IT/business service affected
- Impact – and consider both the business and technical implications
- Related incident descriptions
- Related changes
- User profile
- Equipment details including category – hardware, software, networks, etc.
- Priority – preferably based on a similar impact and urgency matrix as the one used to drive your incident management process
- Details of all diagnostic or attempted recovery actions taken – what has been tried so far? Has anything worked, even partially? Has anything made the situation worse?
- An attachment field capability – to capture any further information such as meeting minutes, service improvement plans, or device logs.
By structuring your form around root cause analysis, you’ll help to drive the right behaviors into your support teams – such that nothing is lost or forgotten, and investigations follow a structured and logical approach.
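As a rough illustration, the fields listed above could be modeled as a simple record type, with priority derived from impact and urgency. This is only a sketch: the field names, the three-level scale, and the values in `PRIORITY_MATRIX` are all assumptions made for the example, not something prescribed by ITIL or by any particular ITSM tool. Your own matrix should mirror the one used by your incident management process.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical 3x3 impact/urgency matrix (1 = highest priority).
# The levels and resulting priorities are illustrative assumptions.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("high", "low"): 3,
    ("medium", "high"): 2,
    ("medium", "medium"): 3,
    ("medium", "low"): 4,
    ("low", "high"): 3,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


@dataclass
class ProblemRecord:
    """One problem form; field names are assumptions for this sketch."""
    summary: str            # high-level summary for senior management
    details: str            # full detail for support teams
    service_affected: str   # the IT/business service affected
    business_impact: str
    technical_impact: str
    impact: str             # "high", "medium", or "low"
    urgency: str            # "high", "medium", or "low"
    related_incidents: List[str] = field(default_factory=list)
    related_changes: List[str] = field(default_factory=list)
    user_profile: str = ""
    equipment_category: str = ""  # hardware, software, network, etc.
    actions_taken: List[str] = field(default_factory=list)
    attachments: List[str] = field(default_factory=list)

    @property
    def priority(self) -> int:
        # Look up priority from the impact/urgency pair.
        return PRIORITY_MATRIX[(self.impact, self.urgency)]


# Made-up example record, loosely based on the month-end printer scenario.
example = ProblemRecord(
    summary="Finance printer fails during month end",
    details="Print queue stalls when large batch jobs are submitted.",
    service_affected="Finance printing",
    business_impact="Month-end reports delayed",
    technical_impact="Print spooler stops responding",
    impact="high",
    urgency="medium",
    related_incidents=["INC-1001", "INC-1017"],
    actions_taken=["Restarted print spooler (worked temporarily)"],
)
print(example.priority)
```

Deriving priority from the matrix, rather than letting people type it in, keeps prioritization consistent between incident and problem records.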
Tip 3: Know Your Environments
When running your problem management process/capability, a thorough understanding of your IT and business environments is a must.
In particular, knowing what makes up your IT infrastructure means that you’re better positioned to understand potential causes of incidents. Your environment will be somewhat unique to your company, but a typical IT landscape will include the following:
● Hardware components
● Software components
● Network and voice components
● In-house services and applications
● Third-party supported services and applications
● Policies, procedures, and governance
● Security controls
And by getting a handle on what your business-as-usual (BAU) operations look like, you can get a head start on identifying the potential root causes of incidents and the associated problems.
Tip 4: Don’t Panic If You Don’t Have a Problem Management Toolset
Not having a problem-management-enabling ITSM toolset makes things more challenging, but not impossible.
Start out with a personal productivity tool. For instance, create a simple spreadsheet to keep track of all your problems and where each one stands in terms of investigation and resolution.
Problem management spreadsheets should contain the following information:
● Unique reference
● Logged date and time
● Resolution date and time
● High-level description
● Service affected
● Support team currently investigating
● Related ticket details (typically major incidents, incidents, and changes)
● Root cause
● Permanent fix
This can be useful for both operational management and management reporting.
It’s not the most polished of problem-management-enabling solutions, but it will give you a good start and is something that can be transitioned into a fit-for-purpose ITSM toolset at a later date.
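If a spreadsheet feels too manual, the same columns can be kept in a plain CSV file that any spreadsheet tool can open. The sketch below uses Python's standard `csv` module. The column names, the `PRB-` reference format, and the sample rows are assumptions made for illustration. It writes to an in-memory buffer to stay self-contained; in practice you would pass an open file instead.

```python
import csv
import io

# Column names mirror the list above; the exact spellings are assumptions.
COLUMNS = [
    "reference", "logged_at", "resolved_at", "description",
    "service_affected", "investigating_team", "related_tickets",
    "root_cause", "permanent_fix",
]


def write_problem_log(rows, stream):
    """Write problem rows (a list of dicts) to a text stream as CSV."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


def open_problems(stream):
    """Return the rows that have no resolution date yet."""
    return [row for row in csv.DictReader(stream) if not row["resolved_at"]]


# Hypothetical sample data, for illustration only.
rows = [
    {
        "reference": "PRB-001", "logged_at": "2024-01-08 09:15",
        "resolved_at": "", "description": "Server needs weekly reboot",
        "service_affected": "Order processing",
        "investigating_team": "Server team",
        "related_tickets": "INC-1001; INC-1017",
        "root_cause": "", "permanent_fix": "",
    },
    {
        "reference": "PRB-002", "logged_at": "2024-01-10 14:30",
        "resolved_at": "2024-01-19 11:00",
        "description": "Finance printer stalls at month end",
        "service_affected": "Finance printing",
        "investigating_team": "Desktop support",
        "related_tickets": "INC-1042",
        "root_cause": "Undersized print spooler",
        "permanent_fix": "Spooler capacity increased",
    },
]

buffer = io.StringIO()          # stands in for a real file on disk
write_problem_log(rows, buffer)
buffer.seek(0)                  # rewind before reading back
still_open = open_problems(buffer)
print([row["reference"] for row in still_open])
```

Filtering on an empty resolution date gives a quick view of open problems for operational management, and the same file can feed management reporting.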
Tip 5: Employ Workarounds When Appropriate
In the spirit of the quick wins mentioned earlier, use problem management to find temporary solutions – ITIL calls these “workarounds.”
It’s important to recognize that not every problem can be fixed quickly and/or permanently. Maybe the cost is too high, or the benefits don’t justify the effort needed.
So, can any of your problems be addressed by a temporary solution, something that won’t fix things forever but will get the service and/or end users back up and running again? Common examples of workarounds include weekly reboots for that flaky server that falls over at the worst possible moment, directing the Finance department to a different printer during month end, or re-routing network traffic for a particular service or application.
So that’s our first five problem management tips. Please come back for the next five, when we’ll cover known errors, permanent resolutions, and getting proactive. What do you think so far? Please let us know in the comments.