How I Monitor 77 Web Scrapers Without Going Crazy (My Exact Setup)


April 24, 2026 · 5 min read

When you have 1 scraper, monitoring is easy. Check if it runs. Done.

When you have 77 scrapers running on different schedules, extracting data from sites that change their layout every Tuesday at 3am, monitoring becomes a full-time job.

Here is the system I built to stay sane.


The Problem

Web scrapers fail silently. They do not crash with an error. They just return empty data, or stale data, or wrong data. And you only notice when a client says: "Hey, the data looks off."

I needed a monitoring system that catches:

  • Scrapers that return 0 results (site changed layout)
  • Scrapers that return the same data twice (caching issue)
  • Scrapers that take 10x longer than usual (being rate-limited)
  • Scrapers that return data in the wrong format (schema changed)

Layer 1: Health Checks (5 minutes to set up)

Every scraper writes a heartbeat file after a successful run:

import json
from datetime import datetime
from pathlib import Path

def write_heartbeat(scraper_name, result_count, duration_seconds):
    heartbeat = {
        "scraper": scraper_name,
        "timestamp": datetime.now().isoformat(),
        "result_count": result_count,
        "duration_seconds": round(duration_seconds, 2),
        "status": "ok" if result_count > 0 else "empty"
    }

    Path("heartbeats").mkdir(exist_ok=True)
    path = Path(f"heartbeats/{scraper_name}.json")
    path.write_text(json.dumps(heartbeat, indent=2))
    return heartbeat


Then a simple checker runs every hour:

import json
from datetime import datetime, timedelta
from pathlib import Path

def check_all_scrapers(max_age_hours=24):
    issues = []
    cutoff = datetime.now() - timedelta(hours=max_age_hours)

    for hb_file in Path("heartbeats").glob("*.json"):
        data = json.loads(hb_file.read_text())
        last_run = datetime.fromisoformat(data["timestamp"])

        if last_run < cutoff:
            issues.append(f"STALE: {data['scraper']} last ran {last_run}")
        elif data["status"] == "empty":
            issues.append(f"EMPTY: {data['scraper']} returned 0 results")
        elif data["duration_seconds"] > 300:
            issues.append(f"SLOW: {data['scraper']} took {data['duration_seconds']}s")

    return issues


This alone catches 80% of problems.


Layer 2: Data Quality Checks

Empty results are obvious. But what about wrong results?

def validate_scraper_output(data, schema):
    errors = []

    for item in data:
        for field, expected_type in schema.items():
            if field not in item:
                errors.append(f"Missing field: {field}")
            elif not isinstance(item[field], expected_type):
                # expected_type may be a tuple like (int, float), which has no __name__
                allowed = (expected_type.__name__ if isinstance(expected_type, type)
                           else "/".join(t.__name__ for t in expected_type))
                errors.append(f"Wrong type: {field} = {type(item[field]).__name__}, expected {allowed}")

    return errors

# Schema for a product scraper
product_schema = {
    "title": str,
    "price": (int, float),
    "url": str,
    "in_stock": bool
}

errors = validate_scraper_output(scraped_products, product_schema)
if errors:
    send_alert(f"Schema validation failed: {errors[:3]}")



Layer 3: Trend Detection

The sneakiest failures are gradual. A scraper that normally returns 1000 results and suddenly returns 500 is suspicious.

import sqlite3

def log_run(scraper_name, result_count):
    conn = sqlite3.connect("scraper_metrics.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS runs (
            scraper TEXT, count INTEGER, 
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.execute(
        "INSERT INTO runs (scraper, count) VALUES (?, ?)",
        (scraper_name, result_count)
    )
    conn.commit()

    # Check for anomalies against the 7-day average
    avg = conn.execute("""
        SELECT AVG(count) FROM runs 
        WHERE scraper = ? 
        AND timestamp > datetime('now', '-7 days')
    """, (scraper_name,)).fetchone()[0]
    conn.close()

    if avg and result_count < avg * 0.5:
        return f"WARNING: {scraper_name} returned {result_count}, avg is {avg:.0f}"
    return None

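The same table can feed the weekly info report. One possible query (a sketch — `weekly_trends` and the two-window comparison are my naming, not part of the original setup):

```python
import sqlite3

def weekly_trends(db_path="scraper_metrics.db"):
    """Per-scraper average for the last 7 days vs. everything before that."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute("""
        SELECT scraper,
               AVG(CASE WHEN timestamp > datetime('now', '-7 days')
                        THEN count END) AS this_week,
               AVG(CASE WHEN timestamp <= datetime('now', '-7 days')
                        THEN count END) AS earlier
        FROM runs
        GROUP BY scraper
    """).fetchall()
    conn.close()
    return rows
```

`AVG` ignores the NULLs the `CASE` produces, so each column averages only its own window.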


Layer 4: Alerts That Do Not Annoy

The biggest mistake is alerting on everything. After 2 days you ignore all alerts.

My rules:

  1. Critical (immediate): scraper returns 0 results for 2 runs in a row
  2. Warning (daily digest): result count dropped 50%+
  3. Info (weekly report): performance trends, slow scrapers
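Rule 1 needs memory between runs. A small state file is enough; here is a sketch (the `empty_streaks.json` file and `check_critical` helper are illustrative names, not from my production code):

```python
import json
from pathlib import Path

STATE_FILE = Path("empty_streaks.json")

def check_critical(scraper_name, result_count, threshold=2):
    """Return an alert string once a scraper comes back empty `threshold` runs in a row."""
    streaks = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    if result_count == 0:
        streaks[scraper_name] = streaks.get(scraper_name, 0) + 1
    else:
        streaks.pop(scraper_name, None)  # any non-empty run resets the streak
    STATE_FILE.write_text(json.dumps(streaks))
    if streaks.get(scraper_name, 0) >= threshold:
        return f"CRITICAL: {scraper_name} empty {streaks[scraper_name]} runs in a row"
    return None
```

The first empty run stays quiet; the second fires the critical alert.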

import smtplib
from email.mime.text import MIMEText

def send_daily_digest(issues):
    if not issues:
        return  # No news is good news

    # Layer 1's EMPTY counts as critical; STALE and SLOW are warnings
    critical = [i for i in issues if i.startswith(("CRITICAL", "EMPTY"))]
    warnings = [i for i in issues if i.startswith(("WARNING", "STALE", "SLOW"))]

    body = f"""Scraper Monitor - Daily Digest

Critical ({len(critical)}):
{chr(10).join(critical) or 'None'}

Warnings ({len(warnings)}):
{chr(10).join(warnings) or 'None'}

Total scrapers: 77 | Healthy: {77 - len(issues)} | Issues: {len(issues)}
"""

    # Send only if critical issues exist
    if critical:
        send_email("CRITICAL: Scraper failures", body)
    elif warnings:
        send_email("Scraper digest", body)

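The `send_email` helper is left out above; a minimal version might look like this (the addresses and the localhost relay are placeholders — composing the message is split out so it can be tested without a mail server):

```python
import smtplib
from email.mime.text import MIMEText

def build_digest_email(subject, body, sender, to):
    # Compose the message separately from sending it
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = to
    return msg

def send_email(subject, body):
    msg = build_digest_email(subject, body, "monitor@example.com", "me@example.com")
    # "localhost" assumes a local mail relay; swap in your SMTP host
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```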


The Dashboard (Optional but Satisfying)

I built a simple HTML dashboard that shows:

  • Green: ran in last 24h, results > 0
  • Yellow: ran but results dropped
  • Red: not run or 0 results

It is just a cron job that reads heartbeat files and generates a static HTML page. No framework needed.
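A sketch of that generator (simplified: yellow here keys off slow runs rather than dropped counts, since drops need the SQLite history, and all names are illustrative):

```python
import json
from datetime import datetime, timedelta
from pathlib import Path

def render_dashboard(heartbeat_dir="heartbeats", out_file="dashboard.html"):
    cutoff = datetime.now() - timedelta(hours=24)
    rows = []
    for hb_file in sorted(Path(heartbeat_dir).glob("*.json")):
        hb = json.loads(hb_file.read_text())
        last_run = datetime.fromisoformat(hb["timestamp"])
        if last_run < cutoff or hb["result_count"] == 0:
            color = "red"      # not run in 24h, or ran empty
        elif hb.get("duration_seconds", 0) > 300:
            color = "yellow"   # ran, but suspiciously slow
        else:
            color = "green"    # ran recently with results
        rows.append(
            f'<tr style="background:{color}"><td>{hb["scraper"]}</td>'
            f'<td>{hb["result_count"]}</td><td>{hb["timestamp"]}</td></tr>'
        )
    html = f"<html><body><table>{''.join(rows)}</table></body></html>"
    Path(out_file).write_text(html)
    return html
```

Point cron at it hourly and serve the output file from anywhere.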


Results

Before → After:

  • Found failures when clients complained → found failures in minutes
  • 3-4 scraper fires per week → 0-1 per week
  • Manual checks every morning → automated daily digest
  • No idea which scrapers were degrading → trend graphs show everything

The entire system is ~200 lines of Python. No Grafana, no Prometheus, no Datadog. Just files, SQLite, and email.


What is your scraper monitoring setup?

Are you monitoring your scrapers at all? Or do you find out when things break? I am curious what approaches others use — especially for larger fleets.


I write about web scraping, Python automation, and data engineering. Follow for practical tutorials from someone running 77 scrapers in production.

Related: 130+ web scraping tools | Scraper starter template


Source: Dev.to
