What is multi-region uptime monitoring?
SaaS monitoring tools differ significantly in how they work. Multi-region checks, in particular, generally follow one of two distinct approaches:
1. Using a Second Region Only to Confirm Website Status Changes
This approach is popular because it is resource-efficient. The monitoring tool checks your website from a single location and, only when it detects downtime, confirms the result from a second location. The problem arises when the primary server itself fails, or when the second server, meant to confirm downtime, isn't available.
At Uptime Monitor, we wanted a more robust solution, which led us to the second approach.
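The confirm-on-failure flow described here can be sketched in a few lines of Python. This is only an illustration, not any vendor's actual implementation: the function name `confirmed_status`, the `ProbeUnavailable` exception, and the three status strings are all hypothetical names chosen for the example.

```python
from typing import Callable


class ProbeUnavailable(Exception):
    """Hypothetical error raised when a checking region cannot run its probe."""


def confirmed_status(
    primary_check: Callable[[], bool],
    secondary_check: Callable[[], bool],
) -> str:
    """Return 'up', 'down', or 'unconfirmed' for one check cycle.

    Each callable returns True when the site responds correctly
    from that region (an assumption made for this sketch).
    """
    if primary_check():
        # The second region is never consulted while the primary sees the site up.
        return "up"
    try:
        second_ok = secondary_check()
    except ProbeUnavailable:
        # The weak spot: downtime was seen but cannot be confirmed.
        return "unconfirmed"
    # Only a failure seen from both regions counts as real downtime.
    return "up" if second_ok else "down"
```

Note that the sketch makes the weakness visible: if the primary region itself stops running checks, `confirmed_status` is never called at all, and nothing is monitored.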
2. Simultaneously Checking a Website from Different Locations
This method requires more resources, as it involves 3-5 checks per minute instead of just one. Some SaaS monitoring tools charge extra to cover this load. Others compromise on check frequency, sending only one request per minute and rotating the location each time. Some tools also suffer from delayed notifications and uneven intervals between checks.
The fairest implementation we observed was Google's. However, their service comes at a high cost: setting up just five monitors with default settings can cost about $88 per month.
Smart Check Distribution with Uptime Monitor
Suppose six regions check a website every minute and two checks fail. You wouldn't want to wait for the remaining four checks before receiving an alert. The optimal distribution is to space the checks evenly, running one every 10 seconds. Given that detection and notification delivery together take approximately 2 seconds, you would receive your downtime alert within 2 to 22 seconds. If you set the fail threshold to 2 and select 3 regions in Uptime Monitor, the mean time to detection is roughly 10-20 seconds. This is the algorithm Uptime Monitor employs.
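To make the even spacing concrete, here is a minimal Python sketch of staggered scheduling. It is not Uptime Monitor's actual code. The function names, the region labels, and the simplifying assumption that every check made after the outage begins fails are all ours, and the 2-second overhead is taken from the estimate above.

```python
import math


def staggered_offsets(regions, interval_s=60.0):
    """Spread one check per region evenly across the interval.

    Returns (region, offset_in_seconds) pairs; with six regions and a
    60-second interval, consecutive checks are 10 seconds apart.
    """
    step = interval_s / len(regions)
    return [(region, i * step) for i, region in enumerate(regions)]


def alert_delay(outage_start_s, n_regions, interval_s=60.0,
                fail_threshold=2, overhead_s=2.0):
    """Seconds from outage start to alert under the assumptions above.

    Assumes checks fire at multiples of interval_s / n_regions and that
    every check at or after outage_start_s fails.
    """
    step = interval_s / n_regions
    # Time of the first check at or after the outage begins.
    first_failure = math.ceil(outage_start_s / step) * step
    # The alert fires once fail_threshold consecutive checks have failed,
    # plus the detection and notification overhead.
    alert_at = first_failure + (fail_threshold - 1) * step + overhead_s
    return alert_at - outage_start_s


if __name__ == "__main__":
    # Hypothetical region labels, used only for illustration.
    for region, offset in staggered_offsets(["r1", "r2", "r3", "r4", "r5", "r6"]):
        print(f"{region}: check at +{offset:.0f}s")
```

Under these assumptions, an outage that begins just after a check yields a delay approaching 22 seconds with six regions and a fail threshold of 2, since the next two checks arrive roughly 10 and 20 seconds later.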
Real-World Test Results
Here are the results of a real-world test using free accounts and default settings with the minimum checking intervals (except for Google Cloud, where only 3 locations were used). The numbers indicate how many requests my web server received:
- UptimeMonitor.io: 42 checks.
- Google Cloud Uptime Checks: 21 checks.
- Freshping.io: 6 checks.
- StatusCake.com: 2 checks.
- UptimeRobot.com: 1 check.
- Cronitor.io (User-Agent: Monibot): 1 check.
- MonitorUptime.io (User-Agent: unirest-php): 1 check.
The raw logs can be found at https://gist.github.com/lazureykis/8bed02a462f42c7a97e23db4f04fd7f7