
Automated backup with cron and rsync in 2026: complete Linux guide

Automate Linux backups with cron and rsync in 2026: full cron syntax, robust shell scripts with logging and alerts, rsnapshot, BorgBackup, and monitoring with Healthchecks.io. Seven years of field experience across 4 personal servers.

By Eric Gerard · Publisher · Save My Disk · 13 min read

In 2017, I lost two weeks of server logs because a poorly configured cron rotation was overwriting archives instead of incrementing them. Since then, I have administered 4 personal Linux servers — two Hetzner VPS, a home QNAP NAS, and a Raspberry Pi 4 as local backup — with a cron + rsync automation system that has run without interruption for seven years. This guide documents exactly what I run in 2026, mistakes included.

The premise is simple: automation is not optional for backups. Humans forget. Cron does not. On my 4 servers, backups run at 02:00 every night. I have not thought about them since 2018, yet I successfully restored data twice in 2024: once after a PostgreSQL database corruption, and once after an accidental deletion of 180 GB of photo archives.

Why automate backups

The short answer: because human memory is a poor scheduler for critical recurring tasks.

Humans forget, scripts do not. Backblaze's 2024 annual survey on backup habits reveals that 67% of users who claim to "back up regularly" actually do so irregularly — with gaps of 2 to 6 weeks between backups. Perceived regularity is systematically higher than actual regularity. A daily cron job at 02:00 runs without exception: during vacations, weekends, nights with power fluctuations (if the server has a UPS).

Incremental backup makes automation practical. Without rsync and its delta transfers, backing up 500 GB of data every night would be prohibitive (50 to 100 GB of network transfer for a standard photo directory). With rsync, only the bytes changed since the last backup are transferred. On my nightly backups to the LAN NAS (180 GB total data), the daily transfer oscillates between 200 MB and 2 GB depending on activity. The initial full backup took 4 hours. Each subsequent backup takes 3 to 25 minutes.

Scaling to multiple machines. Moving from 1 to 4 servers makes manual backup a 45-minute procedure that never gets done correctly. A centralized cron script pulling backups from all machines to a single destination takes zero human time.

Automatic monitoring detects failures. A backup that has been silently failing for 3 weeks is more dangerous than a backup never configured — at least then you know it does not exist. With a Healthchecks.io ping at the end of the script, I receive an email the moment any backup fails or does not run on schedule.

rsync in 2026: syntax and essential options

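rsync is an incremental file synchronization tool first released by Andrew Tridgell and Paul Mackerras in 1996. Its delta-transfer algorithm sends only the modified blocks of a file rather than the full file, which is the foundation of its efficiency for daily backups over a network. When source and destination are both local paths, rsync copies each changed file whole by default, so the saving there comes from skipping unchanged files.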

Basic syntax:

rsync -avz --delete SOURCE DESTINATION

Options explained:

  • -a (archive): preserves permissions, timestamps, symlinks, owner, group. Equivalent to -rlptgoD.
  • -v (verbose): displays transferred files.
  • -z (compress): enables compression during transfer. Useful on WAN, useless on Gigabit LAN.
  • --delete: removes files at the destination that no longer exist at the source.

Local backup to NAS:

rsync -av --delete /home/eric/ /mnt/nas/backup/eric-home/

Backup to remote server via SSH:

rsync -avz --delete -e "ssh -i /home/eric/.ssh/backup_key -p 22" \
  /var/www/ \
  backup@192.168.1.100:/data/backups/www/

Advanced options in 2026:

# --mkpath: create destination directories if absent (recent flag, rsync 3.2.3+)
rsync -av --mkpath /source/ user@host:/path/that/does/not/exist/

# --exclude: skip specific patterns
rsync -av --exclude='*.log' --exclude='tmp/' --exclude='.git/' /source/ /dest/

# --bwlimit: cap bandwidth (in KB/s)
rsync -av --bwlimit=20000 /source/ user@host:/dest/

# --checksum (-c): compare files by checksum instead of size and modification time
# (slower, but catches changes that leave both untouched)
rsync -av --checksum /source/ /dest/

# Dry run: simulate without changing anything
rsync -avn --delete /source/ /dest/
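
Because --delete is destructive, a dry run can double as a safety check before the real transfer. The sketch below is a hypothetical guard, not part of my original setup: count_deletions and deletions_within_limit are illustrative names, and the limit of 100 is an arbitrary assumption. It relies on rsync's itemized output (-i), where each file that would be removed is printed on a line starting with *deleting.

```shell
#!/bin/bash
# Hypothetical safety guard: function names and threshold are assumptions.

# Count the "*deleting" lines in rsync's itemized (-i) output read from stdin.
count_deletions() {
    grep -c '^\*deleting' || true
}

# Succeed only when the number of pending deletions is at or below the limit.
deletions_within_limit() {
    local count="$1" limit="$2"
    [ "$count" -le "$limit" ]
}

# Usage sketch (not executed here):
#   pending=$(rsync -ain --delete /source/ /dest/ | count_deletions)
#   if deletions_within_limit "$pending" 100; then
#       rsync -a --delete /source/ /dest/
#   else
#       echo "Refusing to sync: $pending deletions pending" >&2
#   fi
```

A guard like this mainly protects against an empty or unmounted source directory, which would otherwise make --delete wipe the backup.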

Full example: VPS backup to Hetzner Storage Box:

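# No trailing slash on the sources: each directory is recreated by name under
# the destination (etc/, home/, www/, backups/) instead of having its contents
# merged into a single directory. Excluding /proc, /sys, /dev or /run is not
# needed here because none of the sources contain them.
rsync -avz --delete \
  --exclude='*.sock' \
  --bwlimit=30000 \
  -e "ssh -i /root/.ssh/hetzner_backup -p 23" \
  /etc /home /var/www /var/backups \
  u123456@u123456.your-storagebox.de:/backups/vps-main/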

This script has run on my primary VPS since 2021. It transfers an average of 800 MB per night to the Hetzner Storage Box (2026 price: €3.81/month for 100 GB).

Cron: complete syntax and practical examples

Cron is the standard Unix task scheduler, available on virtually every Linux distribution (some minimal images do not install it by default). The crontab -e command edits the current user's cron table.

The five-field syntax:

# ┌───── minute (0 - 59)
# │ ┌───── hour (0 - 23)
# │ │ ┌───── day of month (1 - 31)
# │ │ │ ┌───── month (1 - 12)
# │ │ │ │ ┌───── day of week (0 - 7, 0 and 7 = Sunday)
# │ │ │ │ │
# * * * * * command

Common schedule examples:

# Every minute (testing / monitoring)
* * * * * /usr/local/bin/monitor.sh

# Every hour at H:00
0 * * * * /usr/local/bin/hourly-backup.sh

# Every day at 02:00
0 2 * * * /usr/local/bin/daily-backup.sh

# Every Sunday at 03:00
0 3 * * 0 /usr/local/bin/weekly-backup.sh

# 1st of every month at 04:00
0 4 1 * * /usr/local/bin/monthly-backup.sh

# Every 6 hours
0 */6 * * * /usr/local/bin/incremental.sh

# Weekdays (Mon-Fri) at 08:30
30 8 * * 1-5 /usr/local/bin/workday-sync.sh
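
To make the field syntax concrete, here is a small sketch that checks whether a single crontab field matches a given number. cron_field_matches is a hypothetical helper written for this guide, not something cron ships, and it covers only the forms used above: *, */n, a-b, a-b/n, plain numbers, and comma-separated lists. It does not handle names (mon, jan) or the 7 = Sunday alias.

```shell
#!/bin/bash
# Illustrative helper (hypothetical): does one crontab field match a value?
# Values must be plain decimal numbers without leading zeros.
cron_field_matches() {
    local field="$1" value="$2" part range step lo hi
    local -a parts
    IFS=',' read -r -a parts <<< "$field"
    for part in "${parts[@]}"; do
        step=1
        range="$part"
        case "$part" in
            */*) step="${part##*/}"; range="${part%%/*}" ;;
        esac
        case "$range" in
            # Simplification: "*" is treated as 0-59, which is exact for
            # minutes and harmless for hours, but not for day or month steps.
            '*') lo=0; hi=59 ;;
            *-*) lo="${range%-*}"; hi="${range#*-}" ;;
            *)   lo="$range"; hi="$range" ;;
        esac
        if (( value >= lo && value <= hi && (value - lo) % step == 0 )); then
            return 0
        fi
    done
    return 1
}
```

For the weekday entry 30 8 * * 1-5, a Wednesday at 08:30 matches because cron_field_matches 30 30, cron_field_matches 8 8 and cron_field_matches 1-5 3 all succeed, while a Sunday (0) fails the last check.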

Important environment variables in crontab:

# cron does not inherit user PATH — always define it explicitly
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=admin@mydomain.com

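# Explicit timezone (avoids scheduling surprises).
# CRON_TZ is honored by cronie (RHEL, Fedora, Arch). The classic cron shipped
# with Debian and Ubuntu documents that it runs every job in the system timezone.
CRON_TZ=America/New_York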

0 2 * * * /usr/local/bin/daily-backup.sh >> /var/log/backup.log 2>&1
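
A script that works in an interactive shell but fails under cron is usually tripping over this reduced environment. One way to reproduce the problem before scheduling anything is to run the command with an almost empty environment. The sketch below is an approximation: run_like_cron is a hypothetical name, and the exact defaults (PATH in particular) differ between cron implementations.

```shell
#!/bin/bash
# Sketch: run a command line with a stripped-down, cron-like environment.
# The variable set below is an assumption modeled on a typical minimal cron.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        SHELL=/bin/sh \
        PATH=/usr/bin:/bin \
        /bin/sh -c "$1"
}

# Usage sketch (not executed here):
#   run_like_cron '/usr/local/bin/daily-backup.sh'
```

If the script fails here with "command not found", it will fail the same way at 02:00, and the fix is the explicit PATH line shown above.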

systemd-timers: the modern alternative

On systems with systemd (Ubuntu 16.04+, Debian 9+, CentOS 7+), systemd timers offer better traceability:

# /etc/systemd/system/backup-daily.timer
[Unit]
Description=Daily rsync backup

[Timer]
OnCalendar=*-*-* 02:00:00
# Catch up on a missed run if the server was off at 02:00.
# (Unit files do not support trailing comments on the same line as a setting.)
Persistent=true
Unit=backup-daily.service

[Install]
WantedBy=timers.target

# /etc/systemd/system/backup-daily.service
[Unit]
Description=Daily rsync backup
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
User=root
ExecStart=/usr/local/bin/daily-backup.sh
Nice=19
IOSchedulingClass=idle

# Activation
systemctl daemon-reload
systemctl enable --now backup-daily.timer
systemctl list-timers  # Check status
journalctl -u backup-daily.service  # View logs

The key advantage of Persistent=true: if the server was off at 02:00, the job runs on the next boot. Standard cron misses jobs if the machine is offline.

Complete backup script: logging, error handling, alerts

Here is the script I have used on my servers since 2019, iteratively improved. It covers structured logging, error handling, and notifications.

#!/bin/bash
# /usr/local/bin/daily-backup.sh
# Daily rsync backup with logging and alerts

set -euo pipefail

# ── Configuration ──────────────────────────────────────────────────────────────
BACKUP_SOURCE="/home /etc /var/www /var/backups"
BACKUP_DEST="/mnt/nas/backups/vps-main"
LOG_FILE="/var/log/backup-daily.log"
MAX_LOG_SIZE_MB=50
LOCK_FILE="/tmp/backup-daily.lock"
HEALTHCHECK_URL="https://hc-ping.com/YOUR-UUID-HERE"  # Healthchecks.io
NOTIFY_EMAIL="admin@mydomain.com"
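# No quotes inside the string: $RSYNC_OPTIONS is expanded unquoted further down,
# and quote characters stored in a variable would reach rsync as literal text,
# so the exclude patterns would never match.
RSYNC_OPTIONS="-avz --delete --delete-after --exclude=*.sock --exclude=*.pid"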
SSH_KEY="/root/.ssh/backup_key"
REMOTE_HOST="backup@192.168.1.100"
REMOTE_PATH="/data/backups"
BWLIMIT=30000  # KB/s — 30 MB/s cap

# ── Functions ──────────────────────────────────────────────────────────────────
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

send_alert() {
    local subject="$1"
    local body="$2"
    echo "$body" | mail -s "$subject" "$NOTIFY_EMAIL" 2>/dev/null || true
    # Healthchecks.io: ping /fail to signal failure
    curl -fsS --retry 3 --max-time 10 "${HEALTHCHECK_URL}/fail" \
        --data-raw "$body" > /dev/null 2>&1 || true
}

rotate_log() {
    local size_mb
    size_mb=$(du -sm "$LOG_FILE" 2>/dev/null | cut -f1 || echo 0)
    if [ "$size_mb" -gt "$MAX_LOG_SIZE_MB" ]; then
        mv "$LOG_FILE" "${LOG_FILE}.$(date +%Y%m%d)"
        gzip "${LOG_FILE}.$(date +%Y%m%d)" 2>/dev/null || true
        log "Log rotated (size threshold exceeded)"
    fi
}

cleanup() {
    rm -f "$LOCK_FILE"
}

# ── Preliminary checks ─────────────────────────────────────────────────────────
# Lock: prevent parallel executions
if [ -f "$LOCK_FILE" ]; then
    log "ERROR: backup already running (lockfile present). Aborting."
    send_alert "[BACKUP] Lock conflict on $(hostname)" \
        "Backup was already running. Check PID in $LOCK_FILE."
    exit 1
fi

trap cleanup EXIT
echo $$ > "$LOCK_FILE"

rotate_log
log "═══ Starting daily backup ═══"

# Check destination connectivity
if ! ssh -i "$SSH_KEY" -o ConnectTimeout=10 -o BatchMode=yes \
    "$REMOTE_HOST" "echo OK" > /dev/null 2>&1; then
    log "ERROR: cannot reach $REMOTE_HOST"
    send_alert "[BACKUP] Destination unreachable on $(hostname)" \
        "SSH to $REMOTE_HOST failed. Check network/key."
    exit 2
fi

# ── rsync execution ────────────────────────────────────────────────────────────
ERRORS=0
START_TIME=$(date +%s)

for SOURCE_DIR in $BACKUP_SOURCE; do
    if [ ! -d "$SOURCE_DIR" ]; then
        log "WARNING: source directory absent: $SOURCE_DIR"
        continue
    fi

    DEST_DIR="${REMOTE_PATH}/$(basename "$SOURCE_DIR")"
    log "Syncing: $SOURCE_DIR → $REMOTE_HOST:$DEST_DIR"

    if rsync $RSYNC_OPTIONS \
        --bwlimit="$BWLIMIT" \
        -e "ssh -i $SSH_KEY -o BatchMode=yes" \
        "$SOURCE_DIR/" \
        "$REMOTE_HOST:$DEST_DIR/" \
        >> "$LOG_FILE" 2>&1; then
        log "OK: $SOURCE_DIR synced"
    else
        log "ERROR: rsync failed for $SOURCE_DIR (exit code: $?)"
        ERRORS=$((ERRORS + 1))
    fi
done

# ── Final result ───────────────────────────────────────────────────────────────
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
DURATION_MIN=$((DURATION / 60))

if [ "$ERRORS" -eq 0 ]; then
    log "Backup completed successfully in ${DURATION_MIN} min"
    # Healthchecks.io success ping
    curl -fsS --retry 3 --max-time 10 "$HEALTHCHECK_URL" > /dev/null 2>&1 || true
else
    log "Backup completed with $ERRORS error(s) in ${DURATION_MIN} min"
    send_alert "[BACKUP] $ERRORS error(s) on $(hostname)" \
        "Backup finished with $ERRORS error(s). Duration: ${DURATION_MIN} min. See $LOG_FILE."
fi

log "═══ Backup complete ═══"
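
The script above counts every non-zero rsync status as an error. On a busy server, files can disappear between the moment rsync builds its file list and the moment it transfers them, which rsync reports as exit code 24. A possible refinement, sketched here with a hypothetical helper name, is to map exit codes to a severity before deciding whether to alert. Treating 24 as a warning is a policy choice on my part, not an rsync rule.

```shell
#!/bin/bash
# Hypothetical helper: map an rsync exit status to a severity label.
# Codes 0, 23 and 24 are documented in the rsync man page.
classify_rsync_exit() {
    case "$1" in
        0)  echo "ok" ;;
        24) echo "warning" ;;   # some source files vanished during the transfer
        *)  echo "error" ;;     # includes 23: partial transfer due to error
    esac
}

# Usage sketch (not executed here):
#   rsync -a /source/ /dest/; rc=$?
#   [ "$(classify_rsync_exit "$rc")" = "error" ] && ERRORS=$((ERRORS + 1))
```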

Add to crontab:

SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
0 2 * * * /usr/local/bin/daily-backup.sh

This script runs on 3 of my 4 servers. Average duration is 8 minutes for 180 GB total data (roughly 600 MB transferred per night).
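
One weak spot in the script is the lock: it tests for the lock file and creates it in two separate steps, so two instances starting at the same instant could both pass the check, and a lock left behind by a killed process blocks every later run until someone deletes it. A common alternative, sketched here as an assumption rather than what my script does, is a lock directory. mkdir either creates the directory or fails in a single step, and the stored PID lets a later run detect a stale lock. The function names and paths are hypothetical.

```shell
#!/bin/bash
# Hypothetical lock helpers based on an atomic mkdir.

# Try to take the lock. Returns 0 on success, 1 if a live process holds it.
acquire_lock() {
    local lock_dir="$1" owner
    if mkdir "$lock_dir" 2>/dev/null; then
        echo "$$" > "$lock_dir/pid"
        return 0
    fi
    owner=$(cat "$lock_dir/pid" 2>/dev/null || true)
    if [ -n "$owner" ] && kill -0 "$owner" 2>/dev/null; then
        return 1    # the recorded owner is still running
    fi
    # Stale lock: the recorded process is gone, so take the lock over.
    # A small race remains if two instances clean up at the same instant.
    rm -rf "$lock_dir"
    mkdir "$lock_dir" 2>/dev/null || return 1
    echo "$$" > "$lock_dir/pid"
}

release_lock() {
    rm -rf "$1"
}

# Usage sketch (not executed here):
#   acquire_lock /var/lock/backup-daily.lock || exit 1
#   trap 'release_lock /var/lock/backup-daily.lock' EXIT
```

On systems with util-linux, flock(1) achieves the same result with less code; the mkdir variant has the advantage of needing nothing beyond a POSIX shell.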

rsnapshot: automatic snapshot rotation

rsnapshot is an rsync wrapper that automatically implements snapshot rotation with hard links — each snapshot looks like a full copy but only stores actually new or modified files.

Installation:

apt install rsnapshot  # Debian/Ubuntu
yum install rsnapshot  # CentOS/RHEL (package comes from the EPEL repository)

/etc/rsnapshot.conf configuration (excerpt):

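# IMPORTANT: tabs required between fields (not spaces)
config_version  1.2

# Snapshot root directory
snapshot_root   /mnt/nas/rsnapshot/

# rsync command
cmd_rsync       /usr/bin/rsync

# ssh command (must be enabled for the remote backup points below)
cmd_ssh         /usr/bin/ssh

# Rotation intervals: 6 hourly, 7 daily, 4 weekly, 12 monthly snapshots
# (comments are kept on their own lines to stay clear of the field parser)
retain  hourly  6
retain  daily   7
retain  weekly  4
retain  monthly 12

# Global rsync options
# Keep --relative when overriding rsync_long_args: it is part of rsnapshot's
# default and preserves each source path under its backup point. Without it,
# the three localhost sources would be merged into one directory.
rsync_short_args    -az
rsync_long_args     --delete --delete-excluded --numeric-ids --relative

# Sources to back up
backup  /home/eric/          localhost/
backup  /etc/                localhost/
backup  /var/www/            localhost/
backup  root@192.168.1.10:/home/  web-server/
backup  root@192.168.1.10:/etc/   web-server/

# Exclusions
exclude *.log
exclude tmp/
exclude .cache/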

rsnapshot crontab:

# Hourly snapshots (6am - 10pm)
0 6-22 * * *    /usr/bin/rsnapshot hourly

# Daily at 02:30
30 2 * * *      /usr/bin/rsnapshot daily

# Weekly on Monday at 03:00
0 3 * * 1       /usr/bin/rsnapshot weekly

# Monthly on 1st at 04:00
0 4 1 * *       /usr/bin/rsnapshot monthly

Resulting directory structure:

/mnt/nas/rsnapshot/
├── daily.0/       ← yesterday
│   ├── localhost/
│   │   ├── home/eric/
│   │   ├── etc/
│   │   └── var/www/
│   └── web-server/
├── daily.1/       ← two days ago
├── daily.2/
...
├── weekly.0/      ← last week
├── monthly.0/     ← last month

Restoration is trivial: cp -a /mnt/nas/rsnapshot/daily.2/localhost/home/eric/file.txt /home/eric/. No special tool, just a copy from a snapshot directory.

Disk space: with 180 GB of source data and a rotation of 7 daily + 4 weekly + 12 monthly snapshots (23 in total, not counting the 6 hourly ones), I use approximately 320 GB on the NAS. That is about 140 GB on top of one full copy rather than 23 full copies, because unchanged files share hard links between snapshots.
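
The hard-link mechanism can be reproduced by hand in a scratch directory. This is a simplified sketch of the idea, not rsnapshot's actual code: the directory names imitate its layout, make_snapshot is a hypothetical helper that handles only a flat directory, and it compares file contents where rsync would normally compare size and modification time.

```shell
#!/bin/bash
# Simplified illustration of hard-link snapshots (flat directories only).
# make_snapshot PREVIOUS SOURCE NEW: unchanged files become hard links to the
# previous snapshot; new or modified files are stored as fresh copies.
make_snapshot() {
    local prev="$1" src="$2" new="$3" f name
    mkdir -p "$new"
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ -f "$prev/$name" ] && cmp -s "$f" "$prev/$name"; then
            ln "$prev/$name" "$new/$name"    # same content: share the inode
        else
            cp -p "$f" "$new/$name"          # changed or new: store a copy
        fi
    done
}

demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/daily.1"
echo "unchanged" > "$demo/source/a.txt"
echo "version 1" > "$demo/source/b.txt"
cp -p "$demo/source/"*.txt "$demo/daily.1/"

echo "version 2" > "$demo/source/b.txt"      # only b.txt changes
make_snapshot "$demo/daily.1" "$demo/source" "$demo/daily.0"

# a.txt is one inode visible in both snapshots; b.txt exists twice.
[ "$demo/daily.0/a.txt" -ef "$demo/daily.1/a.txt" ] && echo "a.txt: shared"
[ "$demo/daily.0/b.txt" -ef "$demo/daily.1/b.txt" ] || echo "b.txt: separate copy"
```

Both snapshot directories list a.txt, but it occupies disk space once. Deleting daily.1 later does not affect daily.0, because the data is freed only when the last link to it disappears.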

BorgBackup: deduplication and encryption for sensitive backups

rsync excels at file synchronization. BorgBackup is superior when you need block-level deduplication, native encryption, and variable compression. It is my tool for offsite backups to Hetzner Storage Box and backups containing personal data.

rsync vs BorgBackup comparison:

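Criterion           | rsync                         | BorgBackup
--------------------|-------------------------------|------------------------------------------------------
Deduplication       | No (hard links via rsnapshot) | Yes (variable-size chunks, ~2 MiB target by default)
At-rest encryption  | No                            | AES-256 native
Compression         | During transfer (-z)          | LZ4/ZSTD/ZLIB integrated
Single-file restore | Simple (cp from snapshot)     | borg extract
Disk space          | Higher (no real dedup)        | 40-60% lower on mixed data
Setup complexity    | Low                           | Moderate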

Installation and initialization:

apt install borgbackup  # Ubuntu 22.04: version 1.2.x

# Initialize an encrypted repository (recommended mode)
borg init --encryption=repokey-blake2 user@nas:/data/borg-repo/

# Store the passphrase in a password manager
# AND in a secure file outside the machine being backed up

Borg backup script with prune policy:

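#!/bin/bash
# Hard-coding the passphrase is the simplest option, but it leaves the secret
# readable in the script: keep this file root-only (chmod 700), or set
# BORG_PASSCOMMAND to read it from a protected file instead.
export BORG_PASSPHRASE="YOUR_LONG_AND_COMPLEX_PASSPHRASE"
export BORG_REPO="user@nas:/data/borg-repo"

# Create snapshot with timestamp
borg create \
    --verbose \
    --compression lz4 \
    --exclude-caches \
    --exclude '/home/*/.cache' \
    --exclude '/home/*/.local/share/Trash' \
    "${BORG_REPO}::$(hostname)-$(date +%Y%m%d-%H%M%S)" \
    /home /etc /var/www

# Retention policy: 7 daily + 4 weekly + 12 monthly
# --glob-archives limits pruning to this host's archives
borg prune \
    --verbose \
    --list \
    --glob-archives "$(hostname)-*" \
    --keep-daily=7 \
    --keep-weekly=4 \
    --keep-monthly=12 \
    "${BORG_REPO}"

# Borg 1.2+: prune only marks data as deleted; compact frees the disk space
borg compact "${BORG_REPO}"

# Integrity verification (recommended weekly)
# borg check "${BORG_REPO}"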

Restoration:

# List archives
borg list "${BORG_REPO}"

# Restore a specific file
borg extract "${BORG_REPO}::server-20260608-020000" home/eric/documents/important.pdf

# Full restoration (borg extract writes into the current directory,
# so cd to the restore target first)
borg extract "${BORG_REPO}::server-20260608-020000"

When to prefer Borg over rsync:

  • Sensitive data (personal documents, databases containing PII)
  • Backups to cloud storage or untrusted hosts
  • Data volumes with high redundancy (photos, source code with git history)
  • Need for a compressed snapshot history on limited storage

On my Hetzner Storage Box (100 GB), I store 4 months of Borg snapshots of 85 GB source data, compressed to 52 GB — a 0.61 ratio. With plain rsync, I would need 4× more space for the same history depth.

Monitoring and alerting: never assume the backup is running

A backup that has been silently failing for 3 weeks is a disaster waiting to happen. Monitoring is not optional.

Healthchecks.io: the simplest approach

Healthchecks.io is a cron monitoring service based on pings. Create a check with the expected interval (daily 24h + 1h grace), add the ping at the end of your script. If the ping does not arrive, you receive an email.

# At the end of the backup script, on success:
curl -fsS --retry 3 --max-time 10 \
    "https://hc-ping.com/YOUR-UUID" > /dev/null 2>&1 || true

# On failure, ping the /fail endpoint:
curl -fsS --retry 3 --max-time 10 \
    "https://hc-ping.com/YOUR-UUID/fail" > /dev/null 2>&1 || true

Free plan: 20 checks, sufficient for 4 servers with daily + weekly monitoring. Business plan at $20/month for teams (check the current pricing page before budgeting).
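
Healthchecks.io also accepts a /start ping and a ping that ends in the command's exit status, which lets it measure run time and tell success from failure with a single final call. The wrapper below is a sketch under assumptions: hc_ping and run_with_healthcheck are hypothetical names, and YOUR-UUID is the same placeholder as above.

```shell
#!/bin/bash
# Hypothetical wrapper: ping Healthchecks.io before and after a command.

hc_ping() {
    # A monitoring hiccup must never break the backup itself.
    curl -fsS --retry 3 --max-time 10 "$1" > /dev/null 2>&1 || true
}

run_with_healthcheck() {
    local base_url="$1" status=0
    shift
    hc_ping "${base_url}/start"        # marks the beginning of the run
    "$@" || status=$?
    hc_ping "${base_url}/${status}"    # 0 = success, non-zero = failure
    return "$status"
}

# Usage sketch (not executed here):
#   run_with_healthcheck "https://hc-ping.com/YOUR-UUID" /usr/local/bin/daily-backup.sh
```

Wrapping the whole script this way also covers the case where the script dies before reaching its own ping lines.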

Log monitoring with simple grep:

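# Crontab: check backup logs for errors every hour.
# A crontab entry must fit on one line (cron has no backslash continuation),
# and mail is sent only when grep actually finds something.
0 * * * * out=$(grep -iE "error|fail" /var/log/backup-daily.log | tail -n 5); [ -n "$out" ] && echo "$out" | mail -s "[$(hostname)] Backup errors" admin@mydomain.com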

Backup freshness verification script:

#!/bin/bash
# Verify that the most recent backup is less than 26 hours old
BACKUP_DIR="/mnt/nas/backups"
MAX_AGE_HOURS=26

# -mmin -N matches files modified less than N minutes ago.
# (find -newer expects a reference file, not a date, so it cannot be used here.)
recent=$(find "$BACKUP_DIR" -name "*.log" \
    -mmin "-$((MAX_AGE_HOURS * 60))" 2>/dev/null | head -n 1)

if [ -z "$recent" ]; then
    echo "ALERT: No recent backup on $(hostname)" \
    | mail -s "[BACKUP] Backup too old!" admin@mydomain.com
fi

Prometheus + Grafana for advanced homelabs:

For environments with multiple servers, Prometheus can alert on backup freshness. node_exporter has no built-in metric for the age of a backup directory, so the rule below assumes the backup script publishes its own gauge, here called backup_last_success_timestamp_seconds (a hypothetical name), through node_exporter's textfile collector. When the rule fires, Alertmanager routes the notification, for example to Slack:

# prometheus/rules/backup.yml
groups:
  - name: backup_freshness
    rules:
      - alert: BackupStaleness
        # 90000 s = 25 h since the last successful backup
        expr: time() - backup_last_success_timestamp_seconds > 90000
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Stale backup on {{ $labels.instance }}"
          description: "Last backup {{ $value | humanizeDuration }} ago"
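
One way to give Prometheus a reliable freshness signal is to have the backup script write a timestamp that node_exporter's textfile collector picks up. The sketch below is an assumption on my part: the metric name backup_last_success_timestamp_seconds, the helper write_backup_metric and the directory path are all hypothetical, and node_exporter must be started with --collector.textfile.directory pointing at that directory.

```shell
#!/bin/bash
# Hypothetical helper: publish a "last successful backup" timestamp for
# node_exporter's textfile collector. Metric name and paths are assumptions.
write_backup_metric() {
    local dir="$1" tmp
    # The temp name does not end in .prom, so the collector ignores it.
    tmp=$(mktemp "$dir/backup.prom.XXXXXX")
    {
        echo "# HELP backup_last_success_timestamp_seconds Unix time of the last successful backup."
        echo "# TYPE backup_last_success_timestamp_seconds gauge"
        echo "backup_last_success_timestamp_seconds $(date +%s)"
    } > "$tmp"
    chmod 644 "$tmp"
    # Rename within the same directory so a scrape never sees a partial file.
    mv "$tmp" "$dir/backup.prom"
}

# Usage sketch, at the end of a successful backup (not executed here):
#   write_backup_metric /var/lib/node_exporter/textfile
```

An alert expression such as time() - backup_last_success_timestamp_seconds > 90000 then fires when no successful backup has been recorded for 25 hours.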

Prometheus and Grafana run on my Raspberry Pi 4 (8 GB RAM), which serves as central monitoring for my 4 servers. Total monitoring infrastructure cost: $0 (both are open source and hosted on the Pi).


Cron + rsync automation is the technical layer that makes the 3-2-1 backup strategy practical at scale. To understand how this fits into a complete backup architecture, see the 3-2-1 backup strategy guide. For Windows and Mac users who manage Linux servers alongside desktop machines, the automatic backup Windows and Mac guide covers the GUI equivalents.

When prevention fails and data recovery becomes necessary, the hard drive data recovery guide 2026 covers DIY and professional options, and the best data recovery software 2026 comparison benchmarks available tools. For professional recovery cost estimates, see our data recovery cost guide 2026.
