It is very common for outsiders to the world of IT (or at least people who aren't familiar with the ITIL framework) to think of "incidents" and "problems" as synonyms. You can’t really blame them, as people tend to use these two terms interchangeably in everyday life. However, in the world of IT service management (ITSM), these two terms are inextricably linked but not the same.
Understanding and correctly conceptualizing these two ideas is therefore crucial for devising support plans, deploying solutions, and assigning the right teams to each kind of occurrence. In this comparative piece, we’ll look not only at the key differences between incidents and problems but also at how each should be managed to ensure the best possible outcome.
What are Incidents?
According to ITIL, an incident is: “an unplanned interruption to an IT service or reduction in the quality of an IT service. Failure of a configuration item that has not yet affected service is also an incident – for example, failure of one disk from a mirror set.”
A good way of visualizing incidents is to step away from IT examples and pick something everyone has experienced: a power outage in your household. This is an incident because, although the root cause might be unknown to you, the episode has produced one specific outcome: you’re in the dark.
In simple terms, an incident is a specific occurrence that causes issues in an IT infrastructure. These are unplanned and tend to have a limited effect on a single user or service. Incidents are also the kind of situations that service desk support agents are often tasked to solve. They are disruptive by nature and ticketing systems such as InvGate Service Desk are usually built around fixing these types of events.
However, because incidents are so specific, they often have an underlying cause. If that cause is not known, they tend to be dealt with through workarounds rather than proper solutions.
What are Problems?
Again, let's turn to ITIL: “A cause of one or more incidents. The cause is not usually known at the time a problem record is created, and the problem management process is responsible for further investigation.”
Going back to our power supply outage analogy, while the incident would be that you’re in the dark due to a lack of electricity, the problem would be that one of the generators at your neighborhood’s power plant has overheated and caused the aforementioned outage.
This is to say: problems are the root causes that incidents stem from. They can arise for a myriad of reasons but, unlike incidents, they require other areas of IT to collaborate and coordinate tasks in order to solve them effectively. Problems are also more far-reaching than incidents, and because of that, the two require very different approaches.
What are these approaches, you ask? And what is an incident ticket vs. problem ticket in terms of what they cover? Let's take a look.
Incident Management has a very clear goal from the get-go: restore service operations as soon as possible and minimize the impact the incident has on the affected service. IT help desk agents troubleshoot individual tickets, and they also record, investigate, and classify each incident. Other practices that support good incident management include continually developing problem and error control and maintaining a tiered support structure that allows escalation.
The great thing about recording and classifying these incidents is that it gives support teams enough information to better address potentially recurrent incidents in the future. Regardless of how many employees are assigned to an incident, it’s always good to keep the relevant service desk metrics and key performance indicators (KPIs) in mind:
- The swiftness of the incident’s resolution
- Adequate prioritization of the incident
- Being conscious of the resolution’s cost
- Assessment of the users’ level of satisfaction during the process
- Measurement and tabulation of results with metrics such as First Contact Resolution, Cost Per Contact, and Customer Satisfaction
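As a rough illustration of the metrics named in the last item, the sketch below computes First Contact Resolution, Cost Per Contact, and Customer Satisfaction from a handful of resolved tickets. The `Ticket` fields and the formulas are simplifying assumptions made for this example, not definitions taken from ITIL or from any particular service desk tool.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    """Hypothetical resolved incident ticket (fields are assumptions)."""
    contacts: int    # number of user contacts needed to resolve the incident
    cost: float      # total handling cost, in any single currency
    satisfied: bool  # whether the user rated the resolution as satisfactory


def first_contact_resolution(tickets):
    """Share of tickets resolved in a single contact."""
    return sum(t.contacts == 1 for t in tickets) / len(tickets)


def cost_per_contact(tickets):
    """Total handling cost divided by total number of contacts."""
    return sum(t.cost for t in tickets) / sum(t.contacts for t in tickets)


def customer_satisfaction(tickets):
    """Share of tickets whose user reported being satisfied."""
    return sum(t.satisfied for t in tickets) / len(tickets)


# Made-up sample data, for illustration only.
tickets = [
    Ticket(contacts=1, cost=10.0, satisfied=True),
    Ticket(contacts=3, cost=45.0, satisfied=False),
    Ticket(contacts=1, cost=5.0, satisfied=True),
    Ticket(contacts=1, cost=12.0, satisfied=True),
]

print(first_contact_resolution(tickets))  # 0.75
print(cost_per_contact(tickets))          # 12.0
print(customer_satisfaction(tickets))     # 0.75
```

Real service desks usually define these metrics over a reporting period and may weight or segment them differently, so the formulas should be treated as a starting point.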
Speaking of performance, user lcaputo on the ITIL Community Forums shared a very interesting perspective and some advice from the point of view of a Problem Management team member:
“From my experience in an Italian public entity, I can tell that Problem Management really is something that needs a certain grade of maturity from the IT department, and a solid commitment from the business. Often problem tickets are forgotten because the Service Desk / Incident teams tend to focus on solving incidents that are felt more like pressing issues than a problem. There (at the workplace) the internal appointee did a weekly ping on the Problem Analysis team (inside the ITSM platform) and asked them to update the ticket with any new info they had, otherwise he asked them to do further research on the topic.
A concrete piece of advice would be to first consolidate the Incident and Service Request practices and then follow it up with defining a Problem policy, a solving team with some defined work time dedicated to doing the job, and an internal appointee that will monitor each ticket on a weekly/monthly basis.”
In addition, it’s recommended that teams have solid Incident Management software. Companies use InvGate Service Desk to help teams address and document incidents within the company’s IT infrastructure. It also allows companies to easily build an IT knowledge base that helps spread, scale, and standardize how symptoms are described. This increases both consistency and speed when dealing with incidents.
For more information, check out our Definitive Guide to Incident Management.
The goal of Problem Management is to minimize the adverse impact of incidents and problems caused by errors in the infrastructure and to prevent the recurrence of incidents related to those errors. Its activities primarily deal with identifying why an incident occurred in the first place, and with identifying and documenting known errors.
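One way to picture this relationship is as a small data model in which a problem record collects the incidents it is believed to cause. The sketch below groups incidents that share a service and symptom into candidate problem records, and treats a problem as a known error once both a root cause and a workaround are documented, which is how ITIL v3 commonly describes known errors. All class names, fields, and the grouping rule are assumptions made for illustration, not the data model of any specific tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    """Hypothetical incident ticket."""
    id: str
    service: str
    symptom: str


@dataclass
class Problem:
    """Hypothetical problem record linking related incidents."""
    service: str
    symptom: str
    incident_ids: List[str] = field(default_factory=list)
    root_cause: Optional[str] = None  # usually unknown when the record is created
    workaround: Optional[str] = None

    @property
    def is_known_error(self) -> bool:
        # Assumption: a known error has a documented root cause and workaround.
        return self.root_cause is not None and self.workaround is not None


def candidate_problems(incidents, threshold=2):
    """Open a problem record for each (service, symptom) pair seen
    in at least `threshold` incidents."""
    groups = defaultdict(list)
    for inc in incidents:
        groups[(inc.service, inc.symptom)].append(inc.id)
    return [
        Problem(service=service, symptom=symptom, incident_ids=ids)
        for (service, symptom), ids in groups.items()
        if len(ids) >= threshold
    ]
```

Grouping by an exact service and symptom match is deliberately naive. In practice, a problem analysis team would decide which incidents belong together.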
There certainly is some overlap between Incident Management and Problem Management. However, it is advised that they be handled by separate teams because of how different problems and incidents are. Problem Management is more investigative, while Incident Management requires a fast reaction from support teams.
Incident Management deals with an individual incident as quickly as possible, while Problem Management deals with why the incident (or multiple similar incidents) happened, seeking to either eliminate the root cause or build an effective, easily deployable workaround. Understanding the difference between Incident and Problem Management is merely the first step towards a more solid IT infrastructure, and solid teamwork is the pillar such an endeavor rests on.
Frequently asked questions
What’s the difference between Incidents and Problems?
The difference between Incidents and Problems is that Incidents are specific and usually single-time issues that require quick attention, while Problems are root causes that bring forth Incidents if left unattended.
How can one manage Problems vs. Incidents?
Incident management is better suited for help desk support teams who can act quickly to provide either a quick fix or a workaround, while Problem management requires a more in-depth analysis of the root cause so as to prevent it from causing more incidents in the future.