Trouble with troubleshooting: Network-management tools are letting IT pros down


(Root-cause analysis is one of the features least supported by network troubleshooting tools, according to Enterprise Management Associates’ recent research, Network Management Megatrends 2020, which surveyed 350 network-management professionals about these tools. This article by EMA Vice President of Research Networking Shamus McGillicuddy discusses this finding and recommends how management pros should evaluate this feature. A recording of a webinar covering the research is here.)

Troubleshooting is perhaps the most vital responsibility of a network operations team. When IT services are interrupted or degraded, engineers and admins race to diagnose and remediate the problem. Every minute counts, because transactions, employee productivity, and customer satisfaction all suffer while the network team is doing this work.

Given the stakes, network management tools must have well-defined workflows and technical functionality to support the troubleshooting process. Unfortunately, many tools are letting network managers down.

Root-cause analysis (RCA) is the critical aspect of network troubleshooting. Network engineers must form a theory of the problem and test that theory. Only after they have confirmed their theory of the problem can they move forward confidently with a solution.

Over the years, network managers have told EMA that RCA is one of the most time-consuming aspects of their job. Given that network-management tools are clearly failing to support this task, engineers and admins must perform complex calculations themselves. The tools often present dashboards with vast arrays of alerts and time-series graphs that show patterns and indicators of a possible problem, but no clear definition of the nature of the problem. As a result, IT pros have to infer the root cause by looking for patterns of cause and effect. This is no easy task, especially given that network managers said that 42.7% of the alerts produced by their tools are false alarms, not indicative of an actionable problem.

Problem isolation and identification is the other least supported troubleshooting task. Before network managers can theorize a root cause, they need to find the problem, so they spend their days looking at their tools, which display red and yellow alerts and charts that reveal mysterious spikes and dips in traffic and device metrics. Engineers have to sift through this information and figure out which data are tied to an actual problem. Trouble tickets may offer clues, but isolating the source of a problem is not easy.
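
To make the sifting described above concrete, here is a minimal sketch of one simple heuristic an engineer might script by hand: group each device's alerts into bursts that occur close together in time, and set aside alerts that stand alone. This is an illustration only, not a method from the EMA research or a feature of any particular tool. The alert records, device names, five-minute window, and two-alert threshold are all hypothetical assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical alert records: (timestamp, device, metric). Not from any real tool.
ALERTS = [
    (datetime(2020, 6, 1, 9, 0, 5), "core-sw1", "interface_errors"),
    (datetime(2020, 6, 1, 9, 0, 40), "core-sw1", "packet_loss"),
    (datetime(2020, 6, 1, 9, 1, 10), "core-sw1", "latency"),
    (datetime(2020, 6, 1, 9, 30, 0), "edge-rtr2", "cpu_high"),
    (datetime(2020, 6, 1, 11, 15, 0), "access-sw7", "latency"),
]


def cluster_alerts(alerts, window=timedelta(minutes=5)):
    """Group each device's alerts into bursts whose neighbors are within `window`."""
    by_device = defaultdict(list)
    for ts, device, metric in sorted(alerts):
        by_device[device].append((ts, metric))

    clusters = []
    for device, events in by_device.items():
        current = [events[0]]
        for event in events[1:]:
            if event[0] - current[-1][0] <= window:
                current.append(event)  # still part of the same burst
            else:
                clusters.append((device, current))
                current = [event]  # gap too large: start a new burst
        clusters.append((device, current))
    return clusters


def likely_problems(alerts, min_alerts=2):
    """Keep bursts with at least `min_alerts` alerts; isolated alerts are set aside."""
    return [
        (device, [metric for _, metric in events])
        for device, events in cluster_alerts(alerts)
        if len(events) >= min_alerts
    ]


print(likely_problems(ALERTS))
```

With the sample data, only the three alerts on the hypothetical core-sw1 device cluster together, so that device is the one flagged for a closer look. A heuristic like this narrows where to look; it does not identify a root cause, and a lone alert it sets aside may still be real.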
