
How I Monitor 77 Web Scrapers Without Going Crazy (My Exact Setup)
When you have 1 scraper, monitoring is easy. Check if it runs. Done.
When you have 77 scrapers running on different schedules, extracting data from sites that change their layout every Tuesday at 3am, monitoring becomes a full-time job.
Here is the system I built to stay sane.
The Problem
Web scrapers fail silently. They do not crash with an error. They just return empty data, or stale data, or wrong data. And you only notice when a client says: "Hey, the data looks off."
I needed a monitoring system that catches:
- Scrapers that return 0 results (site changed layout)
- Scrapers that return the same data twice (caching issue)
- Scrapers that take 10x longer than usual (being rate-limited)
- Scrapers that return data in the wrong format (schema changed)
Layer 1: Health Checks (5 minutes to set up)
Every scraper writes a heartbeat file after a successful run:
```python
import json
from datetime import datetime
from pathlib import Path

def write_heartbeat(scraper_name, result_count, duration_seconds):
    heartbeat = {
        "scraper": scraper_name,
        "timestamp": datetime.now().isoformat(),
        "result_count": result_count,
        "duration_seconds": round(duration_seconds, 2),
        "status": "ok" if result_count > 0 else "empty",
    }
    Path("heartbeats").mkdir(exist_ok=True)
    path = Path(f"heartbeats/{scraper_name}.json")
    path.write_text(json.dumps(heartbeat, indent=2))
    return heartbeat
```
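Wiring this into a scraper is one wrapper around the run. Here is a sketch — `run_with_heartbeat` and `fake_scraper` are illustrative names, not part of the article's code, and in real use you would pass `write_heartbeat` from above as the `report` callable:

```python
import time

def run_with_heartbeat(scraper_name, scrape_fn, report):
    """Run a scraper, time it, and report (name, result_count, duration)."""
    start = time.perf_counter()
    results = scrape_fn()
    duration = time.perf_counter() - start
    # In production, report would be write_heartbeat from above
    report(scraper_name, len(results), duration)
    return results

# Stand-in for a real scraper, just to show the call shape
def fake_scraper():
    return [{"title": "Widget", "price": 9.99}]

run_with_heartbeat("demo", fake_scraper, print)
```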
Then a simple checker runs every hour:
```python
import json
from datetime import datetime, timedelta
from pathlib import Path

def check_all_scrapers(max_age_hours=24):
    issues = []
    cutoff = datetime.now() - timedelta(hours=max_age_hours)
    for hb_file in Path("heartbeats").glob("*.json"):
        data = json.loads(hb_file.read_text())
        last_run = datetime.fromisoformat(data["timestamp"])
        if last_run < cutoff:
            issues.append(f"STALE: {data['scraper']} last ran {last_run}")
        elif data["status"] == "empty":
            issues.append(f"EMPTY: {data['scraper']} returned 0 results")
        elif data["duration_seconds"] > 300:
            issues.append(f"SLOW: {data['scraper']} took {data['duration_seconds']}s")
    return issues
```
This alone catches 80% of problems.
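One failure mode from the list above that heartbeats do not catch is the "same data twice" caching issue. A content hash comparison can. This is a sketch, not the article's code — where you store the previous hash (a file, the heartbeat itself, the metrics database) is up to you:

```python
import hashlib
import json

def content_hash(results):
    # Serialize with sorted keys so dict ordering does not change the hash
    payload = json.dumps(results, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def is_repeat(results, previous_hash):
    """True if this run returned byte-identical data to the last one."""
    return content_hash(results) == previous_hash
```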
Layer 2: Data Quality Checks
Empty results are obvious. But what about wrong results?
```python
def validate_scraper_output(data, schema):
    errors = []
    for item in data:
        for field, expected_type in schema.items():
            if field not in item:
                errors.append(f"Missing field: {field}")
            elif not isinstance(item[field], expected_type):
                # expected_type can be a tuple like (int, float),
                # which has no __name__, so fall back to repr
                expected = getattr(expected_type, "__name__", expected_type)
                errors.append(
                    f"Wrong type: {field} = "
                    f"{type(item[field]).__name__}, expected {expected}"
                )
    return errors

# Schema for a product scraper
product_schema = {
    "title": str,
    "price": (int, float),  # isinstance accepts a tuple of types
    "url": str,
    "in_stock": bool,
}

# scraped_products and send_alert come from the surrounding pipeline
errors = validate_scraper_output(scraped_products, product_schema)
if errors:
    send_alert(f"Schema validation failed: {errors[:3]}")
```
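Two `isinstance` subtleties are worth knowing when writing schemas like this. First, passing a tuple of types matches any of them, which is what makes the `(int, float)` entry for `price` work. Second, `bool` is a subclass of `int` in Python, so `True` quietly passes an `(int, float)` check:

```python
# A tuple of types matches any listed type
assert isinstance(9.99, (int, float))
assert isinstance(10, (int, float))
assert not isinstance("9.99", (int, float))

# bool subclasses int, so booleans pass numeric checks —
# a scraper that returns True where a price belongs will not be flagged
assert isinstance(True, (int, float))
```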
Layer 3: Trend Detection
The sneakiest failures are gradual. A scraper that normally returns 1000 results and suddenly returns 500 is suspicious.
```python
import sqlite3

def log_run(scraper_name, result_count):
    conn = sqlite3.connect("scraper_metrics.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS runs (
            scraper TEXT, count INTEGER,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.execute(
        "INSERT INTO runs (scraper, count) VALUES (?, ?)",
        (scraper_name, result_count)
    )
    conn.commit()

    # Check for anomalies against the 7-day average (includes this run)
    avg = conn.execute("""
        SELECT AVG(count) FROM runs
        WHERE scraper = ?
        AND timestamp > datetime('now', '-7 days')
    """, (scraper_name,)).fetchone()[0]
    conn.close()

    if avg and result_count < avg * 0.5:
        return f"WARNING: {scraper_name} returned {result_count}, avg is {avg:.0f}"
    return None
```
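The 50%-of-average rule is easy to sanity-check against an in-memory database — the numbers below are made up for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (scraper TEXT, count INTEGER, "
             "timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)")

# A week of normal runs
for c in [1000, 980, 1010, 995, 1005]:
    conn.execute("INSERT INTO runs (scraper, count) VALUES (?, ?)", ("demo", c))
conn.commit()

avg = conn.execute(
    "SELECT AVG(count) FROM runs WHERE scraper = ?", ("demo",)
).fetchone()[0]

# Today's run came back suspiciously light
todays_count = 480
print(avg, todays_count < avg * 0.5)  # → 998.0 True
```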
Layer 4: Alerts That Do Not Annoy
The biggest mistake is alerting on everything. After 2 days you ignore all alerts.
My rules:
- Critical (immediate): scraper returns 0 results for 2 runs in a row
- Warning (daily digest): result count dropped 50%+
- Info (weekly report): performance trends, slow scrapers
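Those tiers can be encoded as a tiny classifier over a scraper's recent run counts. A sketch — the two-zero-runs rule and the 50% threshold come from the rules above; the function shape and names are my own:

```python
def classify(recent_counts, weekly_avg):
    """Map a scraper's recent result counts to an alert tier.

    recent_counts: newest-first list of result counts.
    weekly_avg: average count over the last 7 days.
    """
    if len(recent_counts) >= 2 and recent_counts[0] == 0 and recent_counts[1] == 0:
        return "critical"   # 0 results for 2 runs in a row -> alert immediately
    if weekly_avg and recent_counts and recent_counts[0] < weekly_avg * 0.5:
        return "warning"    # result count dropped 50%+ -> daily digest
    return "info"           # everything else -> weekly report
```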
```python
import smtplib
from email.mime.text import MIMEText

def send_daily_digest(issues):
    if not issues:
        return  # No news is good news

    critical = [i for i in issues if i.startswith("CRITICAL")]
    warnings = [i for i in issues if i.startswith("WARNING")]

    body = f"""Scraper Monitor - Daily Digest

Critical ({len(critical)}):
{chr(10).join(critical) or 'None'}

Warnings ({len(warnings)}):
{chr(10).join(warnings) or 'None'}

Total scrapers: 77 | Healthy: {77 - len(issues)} | Issues: {len(issues)}
"""
    # send_email is a thin wrapper around smtplib + MIMEText (not shown).
    # Critical issues get an urgent subject line; warnings ride the digest.
    if critical:
        send_email("CRITICAL: Scraper failures", body)
    elif warnings:
        send_email("Scraper digest", body)
```
The Dashboard (Optional but Satisfying)
I built a simple HTML dashboard that shows:
- Green: ran in last 24h, results > 0
- Yellow: ran but results dropped
- Red: not run or 0 results
It is just a cron job that reads heartbeat files and generates a static HTML page. No framework needed.
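A sketch of that generator, reading the heartbeat files from Layer 1. The red and green rules mirror the thresholds above; the yellow tier needs run history that a single heartbeat does not carry, so this version keys off a hypothetical `degraded` flag, and the HTML is exactly as bare as advertised:

```python
import json
from datetime import datetime, timedelta
from pathlib import Path

def render_dashboard(heartbeat_dir="heartbeats", out_file="dashboard.html"):
    cutoff = datetime.now() - timedelta(hours=24)
    rows = []
    for hb_file in sorted(Path(heartbeat_dir).glob("*.json")):
        hb = json.loads(hb_file.read_text())
        fresh = datetime.fromisoformat(hb["timestamp"]) > cutoff
        if not fresh or hb["result_count"] == 0:
            color = "red"       # not run in 24h, or 0 results
        elif hb.get("degraded"):
            color = "yellow"    # hypothetical flag: ran, but results dropped
        else:
            color = "green"     # ran in last 24h with results
        rows.append(
            f'<tr style="background:{color}">'
            f'<td>{hb["scraper"]}</td><td>{hb["result_count"]}</td></tr>'
        )
    html = "<table>\n" + "\n".join(rows) + "\n</table>\n"
    Path(out_file).write_text(html)
    return html
```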
Results
| Before | After |
| --- | --- |
| Found failures when clients complained | Found failures in minutes |
| 3-4 scraper fires per week | 0-1 per week |
| Manual checks every morning | Automated daily digest |
| No idea which scrapers were degrading | Trend graphs show everything |
The entire system is ~200 lines of Python. No Grafana, no Prometheus, no Datadog. Just files, SQLite, and email.
What is your scraper monitoring setup?
Are you monitoring your scrapers at all? Or do you find out when things break? I am curious what approaches others use — especially for larger fleets.
I write about web scraping, Python automation, and data engineering. Follow for practical tutorials from someone running 77 scrapers in production.
Related: 130+ web scraping tools | Scraper starter template
Source: Dev.to


