Last week, we published part one of our incident management tips blog, looking at the following steps in the incident management process: identification; logging; categorization; prioritization; and initial diagnosis (incident matching). Now in the second part of the blog we look at tips for incident: escalation; investigation and diagnosis; resolution and recovery; closure; and ownership, monitoring, tracking, and communications.
If an incident can’t be resolved at the first point of contact, then it needs to be escalated in order to restore service. There are two types of escalation in a typical service desk setup: functional and hierarchical.
Functional escalation is where the next level of technical support and expertise is needed to resolve the incident. So, for example, a functional escalation could be from the service desk to second-line support, or from second-line support to application support or network services.
Ensure that you have the priority-based roles and responsibilities for all support teams documented in your operational level agreements (OLAs) so that there’s no potential for confusion. Some examples of what to include are:
- Timescales for escalation
- Timescales for responding to an escalation
- Roles and responsibilities – who does what
- How to respond in the event of a major incident
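To make the first two items concrete, here is a minimal sketch of how priority-based timescales might be written down and checked. Everything in it is an assumption for illustration: the priority labels, the `OlaTarget` and `escalation_due` names, and especially the time values, which are placeholders rather than recommendations. Your own OLAs should supply the real figures.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class OlaTarget:
    """Hypothetical OLA targets for one priority level."""
    escalate_after: timedelta   # how long a team may hold an unresolved incident
    respond_within: timedelta   # how quickly the receiving team must acknowledge it


# Placeholder values only; replace with the figures agreed in your own OLAs.
OLA_TARGETS = {
    "P1": OlaTarget(timedelta(minutes=15), timedelta(minutes=15)),
    "P2": OlaTarget(timedelta(hours=1), timedelta(minutes=30)),
    "P3": OlaTarget(timedelta(hours=4), timedelta(hours=2)),
}


def escalation_due(priority: str, assigned_at: datetime, now: datetime) -> bool:
    """Return True once the incident has been held for the OLA escalation time."""
    target = OLA_TARGETS[priority]
    return now - assigned_at >= target.escalate_after
```

A check like this could run on a schedule in whatever tool you use, flagging incidents for functional escalation before the timescale is breached rather than after.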
Hierarchical escalations are about seniority and are typically invoked in the event of an end user or service owner complaint, or if the priority of the incident needs to be raised. Examples of hierarchical escalations include: from service desk analyst to team leader, from team leader to manager, or from manager to head of department, and so on.
Hierarchical escalations are needed if a manager with more authority needs to be consulted in order to take decisions that are beyond the competencies assigned to a certain level of staff, for example, to assign more resources in order to resolve a specific incident quickly, or to raise a purchase order for additional equipment. A predefined escalation hierarchy can save time in the event of a management escalation, for example from:
- Service Desk Analyst to
- Service Desk Team Leader to
- Service Desk Manager to
- Head of IT to
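A predefined hierarchy like this can be captured as a simple ordered list so that nobody has to work out who comes next in the middle of an incident. The sketch below is illustrative only: `ESCALATION_CHAIN` and `next_escalation_point` are hypothetical names, and the chain stops at Head of IT because whatever sits above that role depends on your organization.

```python
from typing import Optional

# Ordered from the first point of contact upwards; extend to suit your organization.
ESCALATION_CHAIN = [
    "Service Desk Analyst",
    "Service Desk Team Leader",
    "Service Desk Manager",
    "Head of IT",
]


def next_escalation_point(current_role: str) -> Optional[str]:
    """Return the next role up the chain, or None at the top of the known chain."""
    index = ESCALATION_CHAIN.index(current_role)  # raises ValueError for unknown roles
    if index + 1 < len(ESCALATION_CHAIN):
        return ESCALATION_CHAIN[index + 1]
    return None
```

Keeping the chain as data rather than tribal knowledge also makes it easy to update when personnel change.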
Key tip: escalations will hopefully be a rarity in your IT organization, but that doesn’t mean there’s no need to plan for them. Spend time understanding the possible escalation scenarios and who would need to do what, and when. These can, and should, be tweaked over time as needs, and personnel, change.
Investigation and Diagnosis
The reality is, investigation and diagnosis occur during every stage of the incident lifecycle, along with monitoring, updates, and communication. As soon as the incident is logged, the service desk analyst begins triaging the call and collecting information. This may result in a first-time fix, or the call may be escalated to second-line support and beyond, where investigation and diagnosis will continue until the issue has been fixed and normal service has been restored.
Documentation and support in the form of wikis, knowledge bases, or training sites can make a real difference to this stage of the incident. By better sharing knowledge, and top tips, you will improve the incident resolution rates from good to great.
Key tip: a key part of effective knowledge management is being able to see how a previous, similar, incident was resolved – the steps undertaken that worked and also those that didn’t. So, when resolving an issue for which there’s no formal knowledge article, ensure that you capture everything you tried for the benefit of the next person.
Resolution and Recovery
AKA happy days, it’s all fixed! But before you start doing your victory dance, test and test again.
Key tip: it’s great that the incident is fixed from your perspective but you also need to check that things have been resolved for the end user or service owner. It’s why best practice suggests a second element of incident closure – customer confirmation.
Incident closure really comes down to two steps: customer confirmation and closing the incident record in your ITSM or service desk tool.
Where possible, contact the end user/customer to confirm that everything is now okay and that they’re happy for their incident to be closed. The second stage is to update the incident record with what happened and what you did to fix it before closing it off.
Remember what we said about knowledge sharing earlier? This is part of it. Simply writing “fixed” on an incident isn’t going to help the next person that comes along if the same issue returns. It doesn’t have to be War and Peace, just a quick overview of what the issue was and how you fixed it. Then you can consider it job done!
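If your tool allows custom validation, the two closure steps can be enforced rather than left to memory. The following is a rough sketch under stated assumptions: `IncidentRecord`, `close_incident`, and the five-word minimum are all made up for illustration and don't correspond to any particular ITSM product. It is also stricter than "where possible", since it always requires confirmation, so you may want an override for end users who can't be reached.

```python
from dataclasses import dataclass
from typing import List

MIN_NOTE_WORDS = 5  # arbitrary threshold; "fixed" on its own should not pass


@dataclass
class IncidentRecord:
    """Hypothetical, stripped-down incident record."""
    summary: str
    resolution_notes: str = ""
    customer_confirmed: bool = False
    status: str = "resolved"


def close_incident(record: IncidentRecord) -> List[str]:
    """Close the record if both closure steps are done; return any blockers."""
    problems = []
    if not record.customer_confirmed:
        problems.append("customer has not confirmed the fix")
    if len(record.resolution_notes.split()) < MIN_NOTE_WORDS:
        problems.append("resolution notes are too brief to help the next analyst")
    if not problems:
        record.status = "closed"
    return problems
```

A word count is a crude measure of a useful note, of course, but even a crude check stops one-word resolutions from slipping through.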
Key tip: ensure that service desk agents understand the importance of incident records throughout the incident lifecycle – from capturing the right end user and technology details, through being up-to-date with progress, to fully documenting the cause(s), fixes tried, and the ultimate resolution. Not doing so will cost the team dearly as incidents reappear over time.
In summary, incident management is one of the most important ITSM processes you’ll use because it affects everyone. And let’s face it, nigh on every company needs some form of IT support, whether in-house or outsourced. IT support and incident management are also a highly visible part of IT – you could argue that they are the public face of IT – so if your incident management process isn’t effective, you’ll adversely affect the business perceptions of IT as a whole. Don’t believe me? Industry analysts have proved it with their research. So, ensure that your incident management process works well, with issues captured consistently, service restored as soon as possible, and nothing lost, ignored, or forgotten about.
What are your top tips for incident management? Please let us know in the comments.