Reduce the signal to noise ratio for issues up to 75%

Originally published at: Reduce the signal to noise ratio for issues up to 75% | Indeni

IT operations teams often experience alert fatigue because the influx of issues can be overwhelming. To alleviate this problem, we are introducing a new feature in 6.4.3 that reduces the number of issues generated.

How does Indeni work today without the cool down feature?

Indeni knows what to look for out of the box. The metrics to analyze, and the thresholds that determine whether an issue is taking place, are written by our community of IT experts. When Indeni knowledge identifies an issue, the automation platform immediately notifies the user via the dashboard and via email. If the issue is flapping, the system resolves the issue and immediately opens a new issue for the same problem. This happens repeatedly, and although Indeni auto-resolves issues without the administrator having to acknowledge them, the overall experience can still be very noisy.

How does Indeni work with the new cool down feature?

When Indeni identifies an issue, we still notify you immediately. When the underlying problem clears, we move the issue to a “cool down” phase instead of marking it resolved right away.

During this cool down period, if the problem recurs and clears again, we do not open a new issue, which reduces the number of issues generated. Effectively, Indeni now has a patience mechanism built in: it does not resolve an issue as quickly when the same problem keeps recurring. The benefit for the user is fewer emails and fewer notifications inside the Indeni user interface.
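The behavior described above can be sketched as a small state machine. This is an illustrative model only, not Indeni's implementation: the function name count_issues, the state names, and the sample timings are all assumptions made for the example. Setting the cool down to zero approximates the old behavior, where every recurrence opens a new issue.

```python
def count_issues(samples, cooldown_seconds):
    """Count how many issues get opened for a series of check results.

    samples: list of (timestamp_seconds, problem_present) tuples in time order.
    cooldown_seconds: hypothetical cool down length; 0 mimics the old behavior.
    """
    state = "idle"            # "idle", "active", or "cooldown" (assumed names)
    cooldown_started = 0.0
    opened = 0

    for now, problem_present in samples:
        # A cool down that has run its full length resolves the issue.
        if state == "cooldown" and now - cooldown_started >= cooldown_seconds:
            state = "idle"

        if problem_present:
            if state == "idle":
                opened += 1   # new issue, new notification
            # A recurrence during cool down reuses the existing issue.
            state = "active"
        elif state == "active":
            state = "cooldown"
            cooldown_started = now

    return opened


# A problem that flaps every 10 minutes for a little over an hour.
flapping = [
    (0, True), (600, False), (1200, True), (1800, False),
    (2400, True), (3000, False), (3600, True), (4200, False),
]

print(count_issues(flapping, cooldown_seconds=0))
print(count_issues(flapping, cooldown_seconds=2 * 60 * 60))
```

With these made-up samples, the zero-length cool down opens four issues while the two-hour cool down opens one, which is the kind of reduction the feature is aiming for.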

How patient is Indeni?

The Indeni team analyzed historical data collected by Indeni Insight across our installed base. We plotted many data points and created a histogram of cool-off intervals. From this exercise, we came up with a fixed cool down period. Based on our initial estimate, this can reduce these noisy issues by up to 75%.

What’s next?

The Indeni team will continue to analyze issue data generated by customers in Indeni Insight to tune the cool down period for different types of rules. Customers that haven’t enabled Indeni Insight should enable it to ensure their feedback is incorporated. Also feel free to provide feedback in Indeni Community.

What can you expect with this new feature?

When Indeni decides to wait before resolving the issue, you will be able to view these steps in the Notes section of the dashboard.

In this example, Indeni waited a few hours before resolving the issue. During that time, Indeni did not create new issues for the same problem.

How does it work for issues with issue items?

The cool down period will apply to the issue as well as issue items.

For example:
Suppose an issue has two issue items because cpu-0 and cpu-1 utilization are both high, and assume the cool down period is two hours. When cpu-1 utilization drops, the cpu-1 issue item moves to cool down. Two hours later, the cpu-1 issue item moves to the resolved state. At this point, the issue is still active, with one issue item resolved and the other still active. Now cpu-0 utilization drops, and the cpu-0 issue item moves to cool down. Two hours later, that issue item also moves to the resolved state. In this example, the two cool down periods together lasted four hours.
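The arithmetic in this example can be written out as a short sketch. It rests on several assumptions that are not stated by Indeni: the helper names are hypothetical, no issue item recurs during its cool down, cpu-0 utilization drops at the moment cpu-1 is resolved, and the issue itself is resolved once its last issue item is resolved.

```python
COOLDOWN_HOURS = 2  # value used in the example above; the real period is set by Indeni


def item_resolved_at(cleared_at_hour, cooldown_hours=COOLDOWN_HOURS):
    """Hour at which an issue item leaves cool down, assuming no recurrence."""
    return cleared_at_hour + cooldown_hours


def issue_resolved_at(cleared_at_by_item, cooldown_hours=COOLDOWN_HOURS):
    """Hour at which the last issue item is resolved (assumed to resolve the issue)."""
    return max(
        item_resolved_at(cleared, cooldown_hours)
        for cleared in cleared_at_by_item.values()
    )


# Hours, measured from the moment cpu-1 utilization drops.
cleared_at = {"cpu-1": 0, "cpu-0": 2}

for name, cleared in cleared_at.items():
    print(name, "enters cool down at hour", cleared,
          "and is resolved at hour", item_resolved_at(cleared))

print("last issue item resolved at hour", issue_resolved_at(cleared_at))
```

Under these assumptions the last issue item is resolved at hour 4, matching the four hours in the example. If cpu-0 had dropped later, the total would be correspondingly longer.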

Are you an existing customer? This new feature is enabled by default so there is nothing you need to enable. New to Indeni? Download Indeni to see cool down in action.