The capstone observation across every L3 article in this series: most disasters are detected late because nobody set up monitoring. The domain expired silently because nothing was watching. The integration drifted for months because nothing was checking. The cloud bill crept up unnoticed because nothing was alerting. The backup quietly stopped working because nothing tested it.
The frustrating thing is that monitoring is the cheapest part of L3 work. The tools below are free or cost less than $50/month combined. Setup takes half a day. They catch maybe 80% of the L3 disasters that affect typical SMBs, before those disasters become billable crises.
This article is the minimum monitoring stack every SMB without an IT team should have. It's organized by which disaster each tool prevents, with specific setup instructions.
Tier 1: monitoring you should have today (free or near-free)
These four are non-negotiable. They're free or under $5/month each, and they prevent the highest-frequency disasters.
1. Uptime monitoring (catches: site down)
Tool: UptimeRobot (free tier covers 50 monitors with 5-minute checks).
What it does: pings your website every 5 minutes from multiple locations. If your site doesn't respond within 30 seconds, it emails you (SMS alerts require the paid tier).
What it prevents: discovering your site is down via a customer email three hours later.
Setup time: 15 minutes. Add monitors for:
- Your main website (homepage)
- Your application's login page
- Your most-visited subdomain
- Any APIs you depend on
- Your status page if you have one
The free tier is enough for most SMBs. The paid tier ($7/month) gives you 1-minute checks and SMS alerts, which is worth it if downtime costs you real money.
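To make the check concrete, here is a minimal sketch of what an uptime monitor does on each run, using only the Python standard library. The function names and the "any 2xx or 3xx counts as up" rule are assumptions for illustration; this is not how UptimeRobot is implemented.

```python
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 30  # the 30-second threshold mentioned above


def is_healthy(status):
    """Assumed rule: any 2xx or 3xx status counts as 'up'."""
    return 200 <= status < 400


def check_url(url, timeout=TIMEOUT_SECONDS):
    """Return (is_up, detail) for one HTTP check of `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # DNS failure, refused connection, or no answer within the timeout.
        return False, f"unreachable: {exc}"
    return is_healthy(status), f"HTTP {status}"
```

Calling `check_url("https://example.com")` on a schedule and emailing yourself when the first value is False is the whole idea. A hosted tool adds checks from several locations and handles the alerting for you.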
2. Domain and SSL expiry monitoring (catches: certificate expired, domain expired)
Tool: Let's Monitor (free), or built-in registrar alerts.
What it does: alerts you 30, 14, 7, and 1 days before any domain or SSL certificate expires.
What it prevents: the certificate expiring on a Friday night and customers seeing browser warnings until Monday morning.
Setup time: 10 minutes. Add monitoring for:
- Your main domain and any business-critical subdomains
- Your wildcard certificates if you have them
- Domains that are technically yours but rarely accessed (legacy redirects, parked domains)
Most registrars also have built-in expiry alerts, but they often go to the email of whoever originally registered the domain — which may no longer be in use. Set up Let's Monitor as a backup that goes to a current business email.
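If you want to see what an expiry monitor is doing under the hood, the sketch below reads a certificate's expiry date and applies the 30/14/7/1-day schedule described above. It uses only the Python standard library. The function names are hypothetical, and alerting every day once a certificate has expired is an added assumption.

```python
import socket
import ssl
from datetime import datetime, timezone

ALERT_DAYS = (30, 14, 7, 1)  # the alert schedule described above


def days_until(expiry, now):
    """Whole days from `now` until `expiry` (negative once expired)."""
    return (expiry - now).days


def should_alert(days_left, alert_days=ALERT_DAYS):
    """Alert on each scheduled day, and (assumption) daily once expired."""
    return days_left in alert_days or days_left <= 0


def fetch_cert_expiry(hostname, port=443, timeout=10):
    """Read the certificate's notAfter date over a live TLS connection.

    An already-expired certificate fails verification and raises
    ssl.SSLCertVerificationError, which is itself worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            not_after = tls.getpeercert()["notAfter"]
    seconds = ssl.cert_time_to_seconds(not_after)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

A daily job would call `fetch_cert_expiry("example.com")`, pass the result to `days_until` with the current UTC time, and send an email when `should_alert` returns True. Domain registration expiry needs a separate lookup and is not covered by this sketch.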
3. Email deliverability monitoring (catches: emails going to spam, domain reputation damage)
Tools: MXToolbox Email Health (free spot-checks), and Postmark's DMARC Digests (free).
What it does: checks your domain's email authentication setup, blacklist status, DMARC reports.
What it prevents: silently degrading email deliverability from configuration drift or reputation damage.
Setup time: 30 minutes. Tasks:
- Run MXToolbox's "Email Health" report monthly. Bookmark it.
- If you have DMARC configured (you should; see the Email Deliverability article), set the rua field to a Postmark DMARC Digest address. They'll send you weekly readable reports.
If you rely on email for revenue (B2B services, e-commerce), upgrade to a paid DMARC reporting tool like dmarcian or Valimail. They run $20–100/month and give much better visibility.
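The rua step above is easy to get wrong, so it helps to confirm what your published record actually says. Below is a minimal sketch that parses a DMARC TXT record and lists the report addresses in its rua tag. The record and addresses are made-up examples; your real reporting address is the one the reporting service gives you.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a dict of tag -> value."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def rua_addresses(record):
    """Return the aggregate-report addresses listed in the rua tag."""
    rua = parse_dmarc(record).get("rua", "")
    addresses = []
    for uri in rua.split(","):
        uri = uri.strip()
        if uri.lower().startswith("mailto:"):
            addresses.append(uri[len("mailto:"):])
    return addresses


# Hypothetical record, as it would appear in the TXT entry for _dmarc.example.com
EXAMPLE_RECORD = "v=DMARC1; p=quarantine; rua=mailto:reports@example.com"
```

If `rua_addresses` returns an empty list for your record, nobody is receiving your DMARC reports. This sketch ignores optional size limits that can be appended to a rua address.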
4. Cloud cost monitoring (catches: unexpected bill spikes)
Tools: AWS Budgets / Google Cloud Billing Alerts (built-in, free).
What it does: emails you when your cloud spending for the month exceeds a threshold you set.
What it prevents: the surprise $4,000 bill from a misconfigured service running unchecked for three weeks.
Setup time: 15 minutes per cloud account. Tasks:
- Set a budget alert at 110% of your normal monthly spend (alerts when something's changed)
- Set a second alert at 200% (alerts when something's badly wrong). This is still only an alert; it does not stop spending on its own
- Configure to email both you and a co-founder/partner so it's not on one person
If you're on AWS, also enable Cost Anomaly Detection (free, built-in). It uses ML to detect unusual spending patterns and alerts you proactively.
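The two thresholds above are simple arithmetic. Here is a short sketch of the logic, with hypothetical function and label names, in case you want to apply the same rule to a bill that has no built-in budget alerts.

```python
def budget_alert(spend, normal_spend, warn_ratio=1.10, critical_ratio=2.00):
    """Classify month-to-date spend against your normal monthly spend.

    Returns "critical" at 200% or more, "warning" at 110% or more,
    and None when spend is within the normal range.
    """
    if normal_spend <= 0:
        raise ValueError("normal_spend must be positive")
    ratio = spend / normal_spend
    if ratio >= critical_ratio:
        return "critical"
    if ratio >= warn_ratio:
        return "warning"
    return None
```

With a normal bill of $1,000, a month at $1,200 returns "warning" and the $4,000 surprise returns "critical". In practice the built-in budget tools evaluate this for you; the sketch only shows which numbers to enter.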
Tier 2: monitoring worth setting up this month ($0-$50/month)
These add visibility into more sophisticated failure modes. Most take 15 to 30 minutes to set up (application monitoring takes longer because it needs code changes), and all of them run in the background from then on.
5. Application performance monitoring (catches: slow responses, error rates)
Tools: Sentry (free tier generous), BetterStack (free tier limited).
What they do: monitor your application's error rate and performance from the inside. When errors spike, you get notified before customers complain.
What they prevent: silently broken features that affect a percentage of users.
Setup time: 1–2 hours (requires adding code to your application). Most modern frameworks have official integrations.
Sentry's free tier covers most SMB use cases — 5,000 errors/month, 10,000 transactions/month. Upgrade to paid ($26/month) if you exceed these.
6. Backup verification monitoring (catches: backups silently failing)
Tools: depends on your backup vendor; most have built-in monitoring.
What it does: alerts you when backups fail OR when backups haven't run in expected windows.
What it prevents: discovering during a disaster that your backups have been broken for three months.
Setup time: 30 minutes. Tasks:
- Enable backup failure alerts in every backup tool you use (database, file storage, SaaS data backups)
- Schedule a quarterly calendar reminder to do a test restore (covered in The Backup You Have But Probably Can't Restore)
The monitoring catches the failure mode where backups stop working. The quarterly test catches the failure mode where backups complete but aren't actually restorable. You need both.
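The "haven't run in expected windows" check is the one people skip, because a backup that never starts produces no failure message. A minimal sketch of that check follows. The 24-hour schedule and 2-hour grace period are assumptions; substitute your own.

```python
from datetime import datetime, timedelta, timezone


def backup_is_stale(last_success, now,
                    expected_every=timedelta(hours=24),
                    grace=timedelta(hours=2)):
    """True when no successful backup exists inside the expected window.

    `last_success` is the timestamp of the most recent successful
    backup, or None if there has never been one.
    """
    if last_success is None:
        return True
    return now - last_success > expected_every + grace
```

Where the timestamp comes from depends on your backup tool (a log line, an API, or the modified time of the newest backup file). Note that this only proves a backup ran, not that it restores, which is why the quarterly test restore still matters.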
7. DNS change monitoring (catches: unauthorized DNS changes, accidental misconfigurations)
Tool: Cloudflare's audit log (free if you use Cloudflare DNS), or DNSchecker periodic spot-checks.
What it does: tracks changes to your DNS records and alerts on changes you didn't authorize.
What it prevents: someone accidentally (or maliciously) changing critical DNS records and your business going offline.
Setup time: 15 minutes (Cloudflare specifically). Cloudflare's audit log records every change to your DNS settings with timestamp and user. Set up a weekly review or use a third-party tool to alert on changes.
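If you don't use Cloudflare, a rough substitute is to save a snapshot of your DNS records and compare it with a fresh one on a schedule. The sketch below shows only the comparison step. The snapshot format, a dict keyed by (name, record type), is an assumption, and how you export the records depends on your DNS provider.

```python
def diff_dns(previous, current):
    """Compare two snapshots shaped like {(name, record_type): set_of_values}.

    Returns a list of (key, values_before, values_after) for every
    record that was added, removed, or changed.
    """
    changes = []
    for key in sorted(previous.keys() | current.keys()):
        before = previous.get(key, set())
        after = current.get(key, set())
        if before != after:
            changes.append((key, sorted(before), sorted(after)))
    return changes


# Hypothetical snapshots taken a week apart
last_week = {("example.com", "A"): {"203.0.113.10"},
             ("example.com", "MX"): {"10 mail.example.com"}}
this_week = {("example.com", "A"): {"203.0.113.99"},
             ("example.com", "MX"): {"10 mail.example.com"}}
```

Here `diff_dns(last_week, this_week)` reports that the A record changed while the MX record did not. Any change you can't account for is the thing to investigate.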
8. Integration health monitoring (catches: integrations silently drifting)
Tools: custom dashboards, or vendor-specific tools, or just calendar reminders.
What it does: tracks success rate of your important integrations (Shopify→Quickbooks, CRM→email, etc.).
What it prevents: the almost-working integration drifting for months unnoticed.
Setup time: varies. Some vendors have built-in monitoring; others require custom work. The minimum: set monthly calendar reminders to check key integration metrics manually.
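For the manual monthly check, the most useful single number is usually how many records made it from one system to the other. Here is a sketch of that comparison. The function names and the 2% tolerance are assumptions; pick a tolerance that matches how your integration normally behaves.

```python
def sync_gap(source_count, destination_count):
    """Fraction of source records that are missing from the destination."""
    if source_count == 0:
        return 0.0
    missing = max(source_count - destination_count, 0)
    return missing / source_count


def needs_attention(source_count, destination_count, tolerance=0.02):
    """True when more than `tolerance` of records failed to sync."""
    return sync_gap(source_count, destination_count) > tolerance
```

For example, 200 orders in the shop and 190 invoices in the accounting system is a 5% gap, which `needs_attention(200, 190)` flags. Tracking this number month over month is what exposes slow drift.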
Tier 3: monitoring for businesses with serious data ($50-$200/month)
These are for SMBs whose business genuinely depends on technical reliability — e-commerce, SaaS, high-trust B2B services.
9. Synthetic transaction monitoring
Tools: BetterStack, Checkly, Pingdom.
What it does: runs scripted user journeys (login, search, add to cart, checkout) every few minutes from multiple locations. Alerts you if the journey fails.
What it prevents: a critical flow being broken for hours before someone notices.
Setup time: 2–4 hours per scripted journey. Cost: $20–100/month depending on number of monitors and frequency.
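Structurally, a scripted journey is a list of steps run in order, with the alert naming the first step that breaks. The sketch below shows that skeleton in plain Python. It is an illustration, not the scripting API of BetterStack, Checkly, or Pingdom, and the step functions are placeholders you would replace with real HTTP or browser actions.

```python
def run_journey(steps):
    """Run (name, check) steps in order and stop at the first failure.

    Each `check` is a callable returning True on success. Returns
    (passed, detail), where detail names the failing step.
    """
    for name, check in steps:
        try:
            ok = bool(check())
        except Exception as exc:
            return False, f"{name} raised {type(exc).__name__}"
        if not ok:
            return False, f"{name} failed"
    return True, "all steps passed"


# Placeholder steps; real ones would drive your site.
example_steps = [
    ("login", lambda: True),
    ("search", lambda: True),
    ("add to cart", lambda: False),  # simulate a broken cart
    ("checkout", lambda: True),
]
```

Running `run_journey(example_steps)` stops at the cart and never attempts checkout, which mirrors how a real customer would be blocked. Knowing which step failed is most of the value over a plain uptime check.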
10. Real user monitoring (RUM)
Tools: Sentry Performance, Cloudflare Web Analytics, SpeedCurve.
What it does: measures actual page load times for real customers, segmented by location, device, browser.
What it prevents: discovering your site is slow for users in specific regions or on specific devices.
Setup time: 1 hour to install, ongoing review of dashboards. Cost: $0–$200/month depending on traffic.
11. Security monitoring
Tools: Cloudflare (free tier), Have I Been Pwned for credential monitoring, vendor-specific (Sentry alerts on security exceptions).
What it does: catches DDoS attempts, credential exposure, application security exceptions.
What it prevents: security incidents that compound while undetected.
Setup time: varies. Most modern hosting platforms include basic security monitoring. The Cloudflare free tier is exceptional value — if you're not on Cloudflare, get on Cloudflare.
How to actually use all this monitoring
The tools above will overwhelm you if you set them all up at once and try to read every alert. The discipline:
Tier 1 alerts go to email and SMS. They're business-critical. You drop everything to handle them.
Tier 2 alerts go to email only. Read them in your normal email rhythm. Most are informational; some require action.
Tier 3 alerts go to a dedicated Slack channel or email folder. Reviewed weekly. Pattern recognition over individual alerts.
Quarterly review of all monitoring: are the alerts still relevant? Have new failure modes emerged that aren't covered? Update accordingly.
Annual audit: kill monitoring that hasn't fired or been useful, add monitoring for new systems, and document what's monitored and why.
The cost-benefit of monitoring
Specific numbers for a typical SMB:
| Tier | Setup time | Monthly cost | Disasters caught |
|---|---|---|---|
| 1 (essentials) | 1.5 hours | $0–$15 | 70% of common L3 disasters |
| 2 (ops visibility) | 4 hours | $0–$50 | additional 15% |
| 3 (serious reliability) | 8 hours | $50–$200 | additional 10% |
Tier 1 alone is the highest-leverage use of an SMB owner's time. The time investment is once; the protection is continuous; the cost is essentially zero.
Tier 2 is appropriate once your business depends on the technology being reliable.
Tier 3 is appropriate for SMBs whose customers expect enterprise-grade reliability.
What about the failure modes monitoring doesn't catch?
Some failures still slip through:
- Brand-new failure types that haven't happened before, so no monitor was set up for them
- Slow-moving problems like data corruption that accumulates over months
- Vendor-side problems that affect you indirectly (your CDN provider has a bad day)
- Compliance and regulatory issues that require audit, not monitoring
For these, the answer is quarterly reviews — not monitoring. Once a quarter, do a comprehensive review of: what happened that we didn't catch, what should we have known about earlier, what should we monitor going forward. The review catches the failure modes monitoring missed.
When to hire help
Monitoring setup is mostly self-service for SMBs willing to spend a half-day on it. When to bring in help:
- You have a complex environment (multiple services, multiple cloud providers, custom integrations) and you don't know what's important to monitor
- You've set up monitoring but it's noisy and you can't tell signal from noise
- You want monitoring that integrates with your specific tech stack (PHP, Laravel, Node, Rails, Python, etc.)
- You're handling regulated data and need monitoring that meets compliance requirements
A few hours of L3 help can dial in a monitoring stack that you'll then maintain yourself. The Lead Steer monthly retainer covers this kind of setup work.
What to do next
This is the last article in the L3 Tech series. The other articles cover specific recurring problems:
- Email Deliverability for SMBs
- Why Your AWS Bill Keeps Going Up
- The Domain and DNS Disasters Nobody Plans For
- The Backup You Have But Probably Can't Restore
- The Integration That's "Almost Working"
- Server Suddenly Slow? The Diagnostic Tree
- Migrating Without Downtime
If you've read the full series and want ongoing help with the L3 layer of your business, the Lead Steer monthly retainer is the most common arrangement — $500/month for 10 hours of mixed L3 / dev / EA work.
---
Part of the Level 3 Tech Support pillar guide.