Understanding Alerts — Triage & Resolution

Best Practices · Guide · Monitoring · OpenFrame

Phase 4 — Monitoring & Policies · Step 5

Published June 19, 2026

Vladislav Marchenko, Head of Marketing

Read this first. OpenFrame doesn't yet have a single unified "alerts inbox" with formal open → acknowledged → closed states; a dedicated notification/alert center is on the roadmap. Today, alerts surface in three places: the Monitoring compliance views, the Logs stream, and per-device Alert Configuration. This guide shows where to look and how to work a problem from trigger to resolution with what's in the product now.

The goal of monitoring is simple: know something's wrong before the client does. Here's how that signal reaches you and what to do with it.


What triggers an alert

  • A policy fails. A device stops matching a compliance check (encryption turned off, OS out of date) and flips to FAILING; the policy becomes non-compliant.
  • A device goes offline or overdue. This is judged against the thresholds in that device's Alert Configuration (see below).
  • A tool reports an event. Fleet, the agent, and other tools write events to the Logs stream.

Where alerts surface today

1. The Monitoring dashboard (your first stop).
On Monitoring → Policies, the Failed Policies tile tells you if anything is non-compliant right now. Open a failing policy and its Devices table shows exactly which machines are FAILING — that's your triage list.

2. The Logs stream.
Logs (left nav) is the running event feed, with columns for Log ID, Status (e.g. INFO), Tool (e.g. Fleet), Source, and Log Details. Use the column filters and Search for Logs to narrow the feed to a specific tool or status, and Refresh to pull the latest. This is where tool- and system-level events land.
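If you work with log rows outside the console, the same narrowing logic is easy to reproduce. The sketch below is illustrative only: the `LogEntry` fields mirror the column names listed above, but the class, the `filter_logs` helper, the "ERROR" status, and the sample rows are all assumptions, not an OpenFrame API or export format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical row shaped like the Logs columns (not an OpenFrame type)."""
    log_id: str
    status: str   # e.g. "INFO"
    tool: str     # e.g. "Fleet"
    source: str
    details: str


def filter_logs(
    entries: Iterable[LogEntry],
    tool: Optional[str] = None,
    status: Optional[str] = None,
    search: Optional[str] = None,
) -> List[LogEntry]:
    """Keep rows matching every filter that was supplied (case-insensitive)."""
    needle = search.lower() if search else None
    kept = []
    for entry in entries:
        if tool and entry.tool.lower() != tool.lower():
            continue
        if status and entry.status.lower() != status.lower():
            continue
        if needle and needle not in entry.details.lower():
            continue
        kept.append(entry)
    return kept


# Made-up sample rows, for illustration only.
SAMPLE = [
    LogEntry("1", "INFO", "Fleet", "host-a", "Policy check completed"),
    LogEntry("2", "ERROR", "Fleet", "host-b", "Agent enrollment failed"),
    LogEntry("3", "INFO", "Agent", "host-a", "Heartbeat received"),
]

if __name__ == "__main__":
    for row in filter_logs(SAMPLE, tool="Fleet", search="enrollment"):
        print(row.log_id, row.status, row.details)
```

Filters combine with AND, which matches how stacked column filters usually behave; omit an argument to leave that column unfiltered.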

3. Per-device Alert Configuration.
On a device's detail page, the Security tab has an Alert Configuration section: Email / Text / Dashboard alert toggles, an Alert Template, and offline/overdue thresholds. This is where you decide what counts as alert-worthy for that device and how you want to be told.
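To make the threshold idea concrete, here is a minimal sketch of how offline/overdue thresholds could classify a device by how long it has been silent. Everything in it is an assumption for illustration: the `AlertConfig` class, its field names, and the rule that "overdue" is the longer of the two thresholds are not taken from OpenFrame, so check the console for the product's actual semantics.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AlertConfig:
    """Hypothetical stand-in for a device's Alert Configuration section."""
    offline_after: timedelta   # silence before the device counts as offline
    overdue_after: timedelta   # longer silence before it counts as overdue
    email: bool = False
    text: bool = False
    dashboard: bool = True


def classify_device(last_seen: datetime, now: datetime, config: AlertConfig) -> str:
    """Return "ok", "offline", or "overdue" based on time since last check-in."""
    silence = now - last_seen
    # Check the longer threshold first so a long silence is not reported
    # as merely "offline".
    if silence >= config.overdue_after:
        return "overdue"
    if silence >= config.offline_after:
        return "offline"
    return "ok"


if __name__ == "__main__":
    config = AlertConfig(
        offline_after=timedelta(minutes=10),
        overdue_after=timedelta(minutes=30),
        email=True,
    )
    now = datetime.now(timezone.utc)
    print(classify_device(now - timedelta(minutes=15), now, config))
```

Tighter thresholds suit the machines that matter most (servers, executive laptops); looser ones keep laptops that sleep overnight from generating noise.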


A triage workflow

When Failed Policies is non-zero, or a device looks unhealthy:

  1. Identify. Open the failing policy and note which devices are FAILING. Or start from the device's detail page if a specific machine is in question.
  2. Understand. Look at the policy's Query to see exactly what it checks, and the device's Security / Compliance tabs for context (encryption, patches, agent health).
  3. Check the agent. A surprising number of "failures" are really an offline or missing Fleet agent — confirm on the device's Agents tab before chasing the check itself.
  4. Fix. Remediate on the device — often the fastest path is Run Script from the device's "…" menu (Phase 3) for a repeatable fix, or Remote Control/Shell for hands-on work.
  5. Verify (resolution). After the agent's next check-in, the device should flip back to PASSING and the policy should return to COMPLIANT. Refresh to confirm; that's your "closed."
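The ordering in the steps above (rule out the agent before chasing the check) can be sketched as a small decision function. This is a sketch under stated assumptions: `DeviceCheck`, `next_action`, `triage_plan`, and the action labels are hypothetical names, and the data would have to be copied from the console by hand, since no OpenFrame API is assumed here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class DeviceCheck:
    """Hypothetical snapshot of one device's result for one policy."""
    hostname: str
    policy: str
    status: str         # "PASSING" or "FAILING", as shown in the Devices table
    agent_online: bool  # what the device's Agents tab would tell you


def next_action(check: DeviceCheck) -> str:
    """Pick the next triage step for a single device."""
    if check.status == "PASSING":
        return "verified"      # step 5: nothing left to do
    if not check.agent_online:
        return "check-agent"   # step 3: the failure may just be a silent agent
    return "remediate"         # step 4: Run Script or hands-on remote work


def triage_plan(checks: Iterable[DeviceCheck]) -> Dict[str, List[str]]:
    """Group FAILING devices by the step they need next."""
    plan: Dict[str, List[str]] = {"check-agent": [], "remediate": []}
    for check in checks:
        action = next_action(check)
        if action != "verified":
            plan[action].append(check.hostname)
    return plan


if __name__ == "__main__":
    checks = [
        DeviceCheck("laptop-a", "Disk encryption enabled", "FAILING", True),
        DeviceCheck("laptop-b", "Disk encryption enabled", "FAILING", False),
        DeviceCheck("laptop-c", "Disk encryption enabled", "PASSING", True),
    ]
    print(triage_plan(checks))
```

Running the same function again after remediation doubles as the verify step: a device that has returned to PASSING simply drops out of the plan.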

Getting alerts in front of your team

Since there's no unified alert center yet, route signal to where your team already works:

  • Slack is the cleanest real-time path — connect it (Phase 8) and send alerts to a dedicated channel.
  • Per-device Alert Configuration lets you enable Email / Text / Dashboard notifications and set offline/overdue thresholds for the machines that matter most.
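If you want to post your own triage summary to that channel, Slack's incoming webhooks accept a JSON body with a `text` field. The sketch below is not OpenFrame's built-in Slack connection from Phase 8; it is a separate, hand-rolled path, and the function names, message wording, and webhook URL placeholder are assumptions.

```python
import json
import urllib.request
from typing import Dict, Iterable


def build_slack_message(policy: str, failing_hosts: Iterable[str]) -> Dict[str, str]:
    """Build an incoming-webhook payload summarizing one non-compliant policy."""
    hosts = ", ".join(sorted(failing_hosts))
    return {"text": f"Policy '{policy}' is non-compliant. FAILING devices: {hosts}"}


def post_to_slack(webhook_url: str, message: Dict[str, str], timeout: float = 10.0) -> int:
    """POST the payload to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    message = build_slack_message("Disk encryption enabled", ["laptop-b", "laptop-a"])
    print(json.dumps(message))
    # To actually send, supply the webhook URL Slack generated for your
    # channel (kept out of source control):
    # post_to_slack(your_webhook_url, message)
```

Keeping message construction separate from sending makes the wording easy to review and test without touching the network.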

What's missing today (and coming)

So you know the edges: there's no single queue that tracks each alert through acknowledged/assigned/closed states, and no global severity-routing rules engine. Those are part of the planned notification/alert center. If a unified alerts workflow is important to you, it's worth a vote in the OpenMSP Slack community.


Quick checklist

  • Check Failed Policies on the Monitoring dashboard regularly
  • Open failing policies to find the specific FAILING devices
  • Rule out an offline/missing Fleet agent before deep-diving a failure
  • Remediate (Run Script / Remote) and verify the device returns to PASSING
  • Set per-device Alert Configuration thresholds and route alerts to Slack

What's next

That completes Phase 4 — Monitoring & Policies. Next is Phase 5 — Scripts & Automation, where you'll run and schedule the scripts you reach for during exactly this kind of remediation.


Based on OpenFrame v0.9.19. This area is actively evolving: a unified notification/alert center is on the roadmap, so re-check the console to see whether any of the "missing" pieces have since shipped.

