Self-Hosting for Academics: A Complete Guide to Building Your Own Digital Infrastructure v1.0
N.B.: If you come across any errors, or have any suggestions on how to improve this, please don’t hesitate to contact me.
What This Is
This guide walks through the construction of a personal digital infrastructure — step by step, concept by concept — for an academic reader who has no background in systems administration. It assumes you know how to use a computer, install software, and navigate a web browser, but nothing beyond that.
What started as a website on a rented server grew into something larger: a private knowledge infrastructure that handles encrypted browsing, ad blocking, DNS privacy, file synchronisation, scholarly RSS reading, automated research digests, an offline library of 60,000+ public domain books, a search engine for that library, a PDF toolkit, encrypted backups, and health monitoring — all for roughly nine to ten dollars a month.
Each section is modular — you can stop at any layer and still have something useful.
Who This Is For
This guide is written for one person — the person who built it — as a reference for maintaining and extending the infrastructure. It is detailed enough to be reproducible from scratch on a fresh Ubuntu VPS, but it assumes basic comfort with the command line, SSH, and Docker. It does not assume systems administration expertise; the entire thing was built by a qualitative sociologist with the assistance of AI coding tools, not by a professional sysadmin.
If you are an academic, researcher, journalist, or anyone else who wants to reduce platform dependency while maintaining a functional digital workflow, the architecture described here may be useful as a model. The specific tools can be swapped — Miniflux for FreshRSS, Syncthing for Nextcloud, Hugo for Jekyll — but the underlying pattern is consistent: rent a cheap Linux server, run open-source services in Docker containers, connect your devices via a private mesh network, and document everything so you can rebuild it if the server disappears.
How to Use This Guide
The guide is organized chronologically — each part builds on the one before it, roughly in the order things were actually set up. Part 0 covers the website that motivated renting the server in the first place. Parts 1 through 5 add layers of infrastructure on top of it. The Backups section applies to everything. The Appendix describes optional services that can be added independently.
You don’t need to build all of this. Each part is self-contained enough to be useful on its own. A VPS with just a website and a VPN (Parts 0–1) is already a significant improvement over the default platform arrangement. Add services as the need arises, not all at once.
Before You Start: Vocabulary
If you have never administered a server, the terminology can be a barrier. Here is what the key terms mean, in plain language.
Server: A computer that is always on, always connected to the internet, and runs software that responds to requests from other devices. In this guide, “server” means a computer you rent from a company in a data centre — not a machine in your office.
VPS (Virtual Private Server): A slice of a physical server in a data centre that behaves, for your purposes, as if it were your own dedicated computer. You connect to it remotely, install software on it, and it runs 24/7. Think of it as renting a small apartment inside a large building: you have your own keys and walls, but you share the building’s plumbing and electricity.
SSH (Secure Shell): The way you connect to your server. You type commands into a text-based interface (a “terminal”) on your laptop, and those commands execute on the remote server. It looks like a black screen with white text. It is less intimidating than it appears.
Docker: A tool that lets you run software in isolated packages called “containers.” Each container holds one application and everything it needs to function. This means you can run a dozen different services on the same server without them interfering with each other, and if one crashes, the others keep working. Docker is the single most important tool in this guide — it turns complex software installations into one-line commands.
Docker Compose: A file (written in a simple format called YAML) that describes which containers to run and how they should talk to each other. Instead of installing and configuring each service manually, you write one file and Docker sets everything up. The Compose file is, in effect, a blueprint for your entire infrastructure.
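As a purely illustrative sketch (this hypothetical file is not one used later in the guide), a Compose file for a single service looks like this:

```yaml
# docker-compose.yml — a minimal, hypothetical example
services:
  web:
    image: nginx:latest        # which container image to run
    ports:
      - "127.0.0.1:8080:80"    # expose container port 80 on localhost:8080 only
    restart: unless-stopped    # restart automatically if it crashes
```

Running docker compose up -d in the directory containing this file would start the container; docker compose down would stop it.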
DNS (Domain Name System): The system that translates human-readable web addresses (like google.com) into numerical addresses that computers use to find each other. Every time you visit a website, your device first asks a DNS server “what is the numerical address of this domain?” This happens invisibly, dozens or hundreds of times per hour. Whoever handles your DNS queries can see every website you visit.
VPN (Virtual Private Network): A tool that creates an encrypted tunnel between your device and a server somewhere else. All your internet traffic travels through this tunnel, which means anyone watching your local connection (your internet service provider, your university network, a café’s Wi-Fi) sees only encrypted data going to one destination. They cannot directly see which websites you visit or what data you send.
Cron job: A scheduled task that runs automatically at a set time — like an alarm clock for your server. For example, “run the backup script at 3 AM every night.” You set it once and it runs without intervention.
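A crontab entry is one line: five schedule fields, then the command. The script path below is a placeholder for illustration:

```
# Edit your schedule with: crontab -e
# Fields: minute  hour  day-of-month  month  day-of-week  command
0 3 * * * /home/YOUR_USER/backup.sh    # run backup.sh at 3:00 AM every day
```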
Port: A numbered channel on a server through which a specific service communicates. Think of it like apartment numbers in a building: the server is the building, and each service answers at a different port number.
A Note on Operating Systems
This guide assumes you are working from a Mac. All local commands use macOS tools (brew, Terminal.app, macOS file paths). If you use Linux, the commands are largely identical. If you use Windows, you will need WSL (Windows Subsystem for Linux) or PuTTY for SSH, and some local commands will differ — the guide does not cover these differences.
The VPS itself runs Ubuntu Linux. All commands that begin with ssh are executed on the remote server, regardless of your local operating system.
Terminal Commands You Will Use Repeatedly
If you have never used a terminal, here is what the recurring commands do:
cd ~/directory — change into a directory. ~ means your home folder. cd ~/vpn means “go to the vpn folder in my home directory.”
nano filename — open a file for editing in a simple text editor. Save with Ctrl+O, exit with Ctrl+X.
cat filename — print the contents of a file to the screen. Useful for checking what’s inside a config file before changing it.
ls — list the files in the current directory. ls -la shows hidden files and permissions.
mkdir -p ~/directory — create a directory (and any parent directories that don’t exist yet).
cp source destination — copy a file. mv source destination — move or rename a file.
sudo command — run a command as the system administrator. Required for installing software, editing system files, and managing services. You will be prompted for your password.
docker compose up -d — start all containers defined in the current directory’s docker-compose.yml. The -d flag runs them in the background.
docker compose down — stop all containers in the current stack.
docker ps — list all running containers. docker logs containername — show a container’s recent output.
ssh user@ip — connect to a remote server. This is how you access your VPS from your laptop.
scp file user@ip:path — copy a file from your laptop to the server (or vice versa).
chmod 600 file — restrict a file’s permissions so only you can read it. Used for secrets and keys.
These commands account for roughly 90% of what this guide asks you to do. Everything else is explained in context.
Where Am I Running This?
This guide constantly switches between two machines: your laptop (the local machine) and the VPS (the remote server). If you lose track of which one you’re on, things will break or fail silently. Here is how to tell.
On your VPS (after running ssh YOUR_USER@YOUR_VPS_IP):
- Your terminal prompt will show the VPS hostname (e.g., YOUR_USER@vps:~$)
- This is where you create Docker Compose files, launch containers, edit configs, run health checks, and manage backups
- Almost everything in Parts 1–5 happens here
- Type exit to disconnect and return to your laptop
On your laptop (your local terminal, no SSH):
- Your terminal prompt will show your Mac’s name (e.g., yourname@your-mac:~$)
- This is where you edit your Hugo site, run deploy.sh, generate SSH keys, open SSH tunnels, and install local tools like Hugo and Tailscale
- Part 0 (the website) happens entirely here
- SSH tunnels (e.g., ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N) are run from here — they connect your local browser to a service on the VPS
The rule of thumb: if the command starts with ssh, you are about to connect to the VPS or create a tunnel to it. If you are already inside an SSH session and the command starts with docker, nano, sudo, or cd ~/, you are working on the VPS. If you see hugo, brew, deploy.sh, or references to your local file paths (e.g., ~/academic-site/), you are on your laptop.
When in doubt, run hostname — it prints the name of the machine you are currently on.
What You Are Replacing, and Why
Here is what many academics pay for monthly, often without thinking about it:
Cloud storage (iCloud, Google Drive, Dropbox): $5–15/month. These services sync your files across devices by uploading them to the company’s servers. The company can read your files, scan them, and change the terms of service at any time. You are paying for the convenience of not running your own sync tool.
Website hosting (Squarespace, WordPress.com, Wix): $10–20/month. These services host your academic website on their infrastructure. You cannot inspect how your site is served, what data is collected about visitors, or move your content easily if prices change.
RSS / news curation (social media, email newsletters): $0 in money, but you pay in attention and data. Algorithmic feeds decide what scholarship you see, when you see it, and in what order. You have no control over the ranking logic.
PDF tools (online converters, Adobe Acrobat): $0–15/month. Every time you upload a document to an online PDF tool, you are sending your work to a stranger’s server.
Ad-blocking (browser extensions only): $0, but incomplete. Browser-level ad blockers only work inside the browser. They do not block tracking by apps on your phone or by the operating system itself.
The infrastructure described below replaces all of these with tools you control, running on a server you rent, for roughly $9–10/month total.
Using AI to Help You Build
You do not need to be a programmer to follow this guide. If you have ever customised a LaTeX template, debugged a reference manager, or configured a course on an LMS, you have the disposition required. The specific skills can be learned as you go.
This infrastructure was built with the help of an AI coding assistant, and you may do the same. Describing what you want to an LLM (Claude, ChatGPT, or similar) and iteratively revising the code it produces is a viable method for setting up Docker containers, writing configuration files, and troubleshooting errors. The important thing is to audit what the AI produces — read the configuration, understand what each line does, and cross-reference against official documentation. The AI handles syntax; you handle intent and verification.
What This Is — and What It Is Not
This guide describes personal privacy infrastructure and personal research infrastructure. These are related but distinct things, and conflating them leads to overclaiming.
Privacy infrastructure reduces the number of third parties who can observe your digital activity. The VPN, Pi-hole, and dnscrypt-proxy encrypt and filter your traffic so that your ISP, ad networks, and casual observers see less. This is real and measurable — but it is reduction, not elimination.
Research infrastructure provides self-controlled tools for scholarly work. The RSS reader, the Gutenberg library, the search app, the PDF toolkit, the file sync — these are workflow tools that happen to run on your own hardware instead of someone else’s. Their value is independence from platform lock-in, data sovereignty over your own materials, and the capacity to inspect and modify every layer of the stack. This is the dimension that has no commercial equivalent: no platform sells you the ability to understand and meaningfully reconfigure your own infrastructure as an integrated system.
Anonymity infrastructure is something this guide does not build. Anonymity means an adversary cannot determine your identity even with access to the traffic. This requires Tor, multi-hop routing, careful operational security, and behavioral discipline that goes far beyond what is described here. Your VPS has a static IP registered to your name. Your VPS provider can associate all traffic with your billing identity. You are private but not anonymous — and the distinction matters.
Security infrastructure at the enterprise level involves network segmentation, intrusion detection systems, centralized log aggregation, key management services, multi-factor authentication on every layer, regular penetration testing, and dedicated security teams. This guide does none of that. It runs a dozen Docker containers on a single $5 VPS with fail2ban and a firewall. The attack surface is small because the infrastructure is small — one user, one server, one purpose. If this were a production system serving paying customers, the security posture described here would be inadequate. For a personal research stack accessed over a private mesh network, it is proportionate.
Threat Model
Being explicit about what this infrastructure protects against — and what it does not — prevents the guide from making promises it cannot keep.
What it protects against:
- Your ISP logging which websites you visit (VPN encrypts all traffic)
- Advertising networks tracking you across apps and devices (Pi-hole blocks at DNS level)
- DNS queries being readable by intermediaries (dnscrypt-proxy encrypts them in transit to the resolver)
- Platform lock-in and unilateral terms-of-service changes (self-hosted, portable stack)
- Data loss from provider shutdown (encrypted backups, documented configs)
- Casual surveillance of your scholarly reading habits, search patterns, and file contents
What it does not protect against:
- Your VPS provider (Hetzner knows your identity and can comply with German court orders)
- Traffic analysis (an observer can see encrypted packet timing and volume, even without reading content)
- TLS metadata leakage (Server Name Indication exposes which domains you visit unless ECH — Encrypted Client Hello — is enabled, which is not yet widely deployed)
- Compromise of the VPS itself (if the server is breached, everything on it is exposed)
- Compromise of your Tailscale account (this grants network access to all services)
- A determined state-level adversary with the resources to correlate traffic patterns across providers
What it explicitly does not attempt:
- Anonymity (your IP is static and registered to you)
- Anti-forensics (volatile logging helps, but a motivated adversary with host access can still examine running processes and memory)
- High-risk activism support (journalists, dissidents, and whistleblowers need Tor, Tails, and operational security practices that are beyond the scope of this guide)
Known Weaknesses
No infrastructure guide should pretend its subject has no flaws. These are the ones that matter:
Single point of failure. Everything runs on one VPS. If that server goes down, every service goes down simultaneously. Enterprise architecture addresses this with redundancy, failover, and multi-region deployment. For personal infrastructure, the mitigation is simpler: encrypted backups on a separate storage box, documented configurations that can be redeployed on any provider within an hour, and the acceptance that occasional downtime is tolerable for a stack that serves one person.
Docker image trust. The guide uses latest tags for Docker images, which means every docker compose pull could introduce changes you haven’t reviewed. A malicious or broken update to any upstream image could compromise the service. The enterprise practice is to pin specific image versions and update deliberately after testing. For personal use, the practical recommendation is: pin versions for critical services (VPN, Pi-hole, backup tools) and accept latest for low-risk services (Stirling PDF, Excalidraw). Always check changelogs before pulling updates.
Privileged containers. The wg-easy container runs with NET_ADMIN and SYS_MODULE capabilities because WireGuard requires kernel-level network access. A container escape from this container would grant host-level privileges. There is no mitigation short of running WireGuard outside Docker entirely (which adds different complexity). This is a known trade-off of Docker-based VPN setups.
Secrets in scripts. The backup script contains the Borg passphrase in plaintext. The .env file contains API keys. Both are protected by file permissions (chmod 600 / chmod 700), but they exist on disk as readable text. Enterprise infrastructure uses dedicated secrets managers (HashiCorp Vault, AWS Secrets Manager). For personal use, strict file permissions and awareness of the risk are the proportionate response — but never commit these files to a Git repository.
No restore testing. Backups exist and run nightly, but unless you periodically test the restore process, you cannot be certain they work. Add a calendar reminder: every three months, spin up a test environment and restore from the latest Borg archive. A backup that has never been restored is a hope, not a plan.
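A minimal version of that quarterly drill might look like the following sketch. The repository address and archive name are placeholders (the real values live in your backup script); borg check, borg list, and borg extract are standard BorgBackup commands:

```shell
# 1. Confirm the repository is intact and list available archives
borg check ssh://YOUR_STORAGE_BOX_USER@YOUR_STORAGE_BOX/./borg-repo
borg list  ssh://YOUR_STORAGE_BOX_USER@YOUR_STORAGE_BOX/./borg-repo

# 2. Restore an archive into a scratch directory
#    (borg extract writes into the current working directory)
mkdir -p /tmp/restore-test && cd /tmp/restore-test
borg extract ssh://YOUR_STORAGE_BOX_USER@YOUR_STORAGE_BOX/./borg-repo::ARCHIVE_NAME

# 3. Spot-check a few restored files, then clean up
ls -la /tmp/restore-test
```

If step 2 completes and the spot-check looks right, the backup is a plan rather than a hope.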
Docker can bypass UFW. This is a well-documented and widely misunderstood interaction: Docker manipulates iptables directly, adding its own FORWARD rules that bypass UFW’s INPUT chain. This means a port exposed in a Docker Compose file may be reachable from the internet even if UFW has no rule allowing it. The mitigation used throughout this guide is to bind every service to either 127.0.0.1 (localhost only) or YOUR_TAILSCALE_IP (mesh only), so Docker never exposes a port on all interfaces. This is more reliable than relying on UFW to block Docker-exposed ports. If you add new services, always specify the bind address explicitly in the ports directive — never use bare "8080:8080" without an IP prefix.
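The difference between a safe and an unsafe ports entry looks like this (YOUR_TAILSCALE_IP is a placeholder, as elsewhere in this guide):

```yaml
ports:
  - "8080:8080"                    # BAD: binds all interfaces; Docker opens this
                                   #      to the internet, bypassing UFW
  - "127.0.0.1:8080:8080"          # GOOD: reachable only from the VPS itself
                                   #       (e.g., via an SSH tunnel)
  - "YOUR_TAILSCALE_IP:8080:8080"  # GOOD: reachable only over the Tailscale mesh
```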
Environment variable exposure. The run.sh wrapper uses set -a to export all variables from .env into the Python process’s environment. This means API keys and tokens are visible in /proc/<pid>/environ to any process running as the same user. On a single-user VPS this is the expected threat surface, but be aware that a compromised process running as your user can read all secrets from any other process’s environment.
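You can see this exposure for yourself on any Linux machine. This is a self-contained demonstration, not part of the stack; DEMO_SECRET is a throwaway variable invented for the example:

```shell
# Exported variables are inherited by child processes, and any process
# running as the same user can read them from /proc/<pid>/environ.
export DEMO_SECRET="not-so-secret"

# $$ inside the child shell is the child's own PID; we read *its*
# environment file, exactly as another same-user process could.
# Entries are NUL-separated, so convert them to lines first.
bash -c 'tr "\0" "\n" < /proc/$$/environ | grep DEMO_SECRET'
# prints: DEMO_SECRET=not-so-secret
```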
A Note on Redaction
All IPs, domain names, API endpoints, onion addresses, and personally identifying details have been scrubbed from this guide and replaced with placeholder variables (e.g., YOUR_VPS_IP, YOUR_TAILSCALE_IP, YOUR_DOMAIN, YOUR_USER). Part 0 (the website) is intentionally less detailed than other sections because it describes the only public-facing component of the infrastructure. The guide is safe to store, share, or publish — but the actual configuration files on the server contain the real values and should be treated accordingly.
Disclaimer — Read Before Using This Guide
This document describes a personal, single-user infrastructure built for a specific and limited threat model. It is provided for educational and informational purposes only. It is not a production-ready system, not a comprehensive security framework, and not suitable for high-risk contexts (including journalism, activism, or any setting requiring anonymity or adversary-resistant operational security).
Do not copy or deploy this guide verbatim. Many components require adaptation to your environment, careful configuration, and ongoing maintenance. Misconfiguration — especially of networking, Docker port bindings, authentication, or firewall rules — can expose services to the public internet and result in data loss or system compromise.
This setup prioritizes accessibility and independence over hardening. It does not protect against a compromised VPS, account takeover (e.g., Tailscale), traffic analysis, or a determined adversary. Secrets may be stored locally (e.g., in environment files or scripts), and security practices described here are appropriate only for a single-user system under a modest threat model. You are solely responsible for any system you build using this guide. Before deployment, you should understand each component, review official documentation, pin versions for critical services, and implement additional safeguards appropriate to your use case.
Table of Contents
- Part 0: The Website — Hugo, PaperMod, Nginx, Let’s Encrypt, deploy script
- Part 1: VPN + Ad Blocking — WireGuard, Pi-hole, dnscrypt-proxy, Syncthing
- Part 2: Tailscale + Storage Box — Private mesh network, SSHFS mount, Stirling PDF
- Part 3: Offline Library — Kiwix, Project Gutenberg collection, search app
- Part 4: RSS Reader — Miniflux with 100 academic feeds
- Part 5: Telegram Automation — Daily digest and VPS health monitor
- Backups — BorgBackup to Hetzner Storage Box
- Appendix — Optional services (Uptime Kuma, Ntfy, Gitea, Excalidraw, PrivateBin, CyberChef)
Part 0: The Website
Note: This section is intentionally less specific than the rest of the guide. The website is the only public-facing component of the infrastructure, and detailed server configurations, form endpoints, and domain-specific settings are redacted to avoid creating unnecessary exposure. The workflow and architecture are described fully; the implementation details are kept private.
Everything started here. The VPS was rented to host a personal academic website — a static site built with Hugo, themed with PaperMod, edited locally in Obsidian, and deployed via rsync. Every other service in this guide grew from the fact that the server already existed.
How the Site Works
The site is a static HTML site generated by Hugo, a fast open-source static site generator. Content is written in Markdown with TOML frontmatter (+++ delimiters), organized into pages and blog posts. The PaperMod theme provides the layout, dark mode, reading time, breadcrumbs, and responsive design. Hugo compiles everything into a public/ directory of plain HTML, CSS, and assets — no database, no PHP, no server-side processing.
Nginx serves the static files on the VPS. Let’s Encrypt provides HTTPS certificates, auto-renewed by Certbot. A deploy script builds the site locally and rsyncs the output to the server.
Local Setup
Prerequisites
Install Hugo on your Mac:
brew install hugo
Project Structure
mysite/
├── hugo.toml # Site configuration
├── content/
│ ├── _index.md # Home page
│ ├── research.md # Research page
│ ├── teaching.md # Teaching page
│ ├── contact.md # Contact form
│ ├── privacy.md # Privacy policy
│ └── posts/
│ ├── _index.md # Blog index with search + subscribe
│ └── *.md # Blog posts
├── layouts/
│ ├── partials/
│ │ ├── footer.html # Footer override (custom links)
│ │ ├── extend_head.html # Empty (analytics removed)
│ │ └── extend_footer.html # Empty
│ └── shortcodes/
│ ├── rawHTML.html # Allows raw HTML in Markdown
│ ├── news-subscribe.html # Email signup form
│ └── postsearch.html # Client-side blog search
├── static/
│ ├── css/custom.css # Custom styles
│ ├── files/ # PDFs (syllabi, papers)
│ └── media/ # Images
├── themes/
│ └── PaperMod/ # Theme
├── deploy.sh # Build + rsync to VPS
└── public/ # Generated output (not committed)
Configuration
The site is configured in hugo.toml. Key settings:
baseURL = "https://YOUR_DOMAIN/"
theme = "PaperMod"
enableRobotsTXT = true
[markup.goldmark.renderer]
unsafe = true # Required for raw HTML in Markdown
[params]
defaultTheme = "dark"
disableThemeToggle = true
customCSS = ["css/custom.css"]
ShowReadingTime = true
ShowBreadCrumbs = true
The unsafe = true setting allows raw HTML inside Markdown files — needed for contact forms, collapsible sections, and inline styling.
Content Conventions
Pages use TOML frontmatter with +++ delimiters (not YAML ---):
+++
title = "Page Title"
draft = false
showDate = false
showReadingTime = false
showWordCount = false
type = "page"
layout = "page"
+++
Blog posts add date and optional tags:
+++
title = "Post Title"
date = 2025-09-03
draft = false
hiddenInHomeList = true
tags = ["tag1", "tag2"]
+++
Shortcodes
Three custom shortcodes in layouts/shortcodes/:
- rawHTML.html — wraps raw HTML so Hugo doesn’t escape it. Used for forms and custom layouts.
- news-subscribe.html — email subscription form powered by a third-party newsletter service. Takes optional tag and success parameters.
- postsearch.html — client-side blog search. Fetches a JSON index generated by Hugo and searches titles, tags, and summaries with debounced input. Press / to focus.
Theme Overrides
Three files in layouts/partials/ override PaperMod defaults:
- footer.html — copied from the theme and modified to add custom links (Tor mirror, privacy page) to the site footer.
- extend_head.html — empty. Previously contained analytics; removed for GDPR compliance. The file must exist as an empty override — deleting it would cause Hugo to fall back to the theme’s default, which may not be empty. Future <head> additions go here.
- extend_footer.html — empty. Same logic: exists as an intentional override to prevent the theme from injecting unwanted content. Available for future footer additions.
Third-Party Services
The site uses two external services for form handling and email subscriptions. Both are US-based and identified in the site’s privacy policy with links to their respective privacy policies. No analytics, no cookies, no tracking scripts.
Server Setup
Nginx serves static files from the webroot with HTTPS via Let’s Encrypt. The configuration includes an Onion-Location header so Tor Browser users are prompted to switch to the .onion mirror. Certbot handles certificate issuance and auto-renewal.
# Install
sudo apt install nginx certbot python3-certbot-nginx
# Get certificates
sudo certbot --nginx -d YOUR_DOMAIN -d www.YOUR_DOMAIN
# Create webroot
sudo mkdir -p /var/www/html
sudo chown -R $USER:$USER /var/www/html
Certbot modifies the Nginx config automatically and sets up auto-renewal via a systemd timer.
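To confirm that renewal is actually scheduled and working, two standard checks can be run on the VPS (both are stock certbot/systemd commands; the timer name may vary by install method):

```shell
# Check that a renewal timer is registered and active
systemctl list-timers | grep certbot

# Simulate a renewal without touching the real certificates
sudo certbot renew --dry-run
```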
Deploying
A deploy script builds the site locally and rsyncs the output to the VPS:
#!/usr/bin/env bash
set -euo pipefail
REMOTE="YOUR_USER@YOUR_VPS_IP"
WEBROOT="/var/www/html"
hugo --minify --environment production
rsync -azv --delete --progress \
-e "ssh" \
public/ "${REMOTE}:${WEBROOT}/"
The --delete flag removes files on the server that no longer exist locally — important when deleting a page or removing a script. Without it, stale files persist in the webroot.
IMPORTANT: Clear public/ Before Rebuilding After Deletions
Hugo doesn’t always clean up deleted files from public/. If you remove a tracking script or delete a page, the old output may persist:
rm -rf public/
hugo --minify --environment production
./deploy.sh
Editing Workflow
- Edit content files in Obsidian (or any text editor) — they’re plain Markdown
- Preview locally: hugo server -D (the -D flag includes drafts)
- Open http://localhost:1313 to see the live preview
- When satisfied, run ./deploy.sh
- Site is live within seconds
All editing happens on the local machine. The VPS is never edited directly — it only receives the built output via rsync.
GDPR Compliance
The site was cleaned up for GDPR compliance:
- Analytics removed — the extend_head.html partial was emptied. The public/ directory was cleared and rebuilt to ensure no stale tracking scripts remained in the deployed output.
- Privacy page created — content/privacy.md identifies both third-party form/newsletter services as US-based data processors with links to their privacy policies. States that no analytics or cookies are used and that server logs are volatile.
- Privacy link in footer — layouts/partials/footer.html overrides the theme footer to add a “Privacy” link.
A second site managed on the same VPS required no GDPR action — it has no analytics, no forms, and no third-party services.
Part 1: WireGuard VPN + Pi-hole
This is the privacy layer — the foundation of the entire infrastructure. It has three components that work together. The VPN wraps all your internet traffic in an encrypted tunnel. The ad-blocker intercepts unwanted tracking requests inside that tunnel. The DNS encryption ensures that even your domain lookups are private. Think of it as three concentric walls.
Why this matters for academics: If you work on politically sensitive topics, access paywalled resources from insecure networks, or simply prefer that your ISP not have a direct log of the specific sites you visit, this layer provides meaningful protection. The Pi-hole dashboard will also show you, in real time, every domain your devices are trying to reach — which is itself an education in how pervasive commercial surveillance infrastructure actually is.
A complete guide to setting up a private VPN tunnel with DNS-level ad/tracker blocking through your Hetzner Germany VPS, using wg-easy (WireGuard with a web GUI) and Pi-hole.
What you get at the end:
- All VPN-routed traffic encrypted through Germany (subject to split tunneling configuration)
- Ads and trackers blocked across all apps (not just browsers)
- DNS queries resolved on your own server, forwarded to Quad9 (Swiss-based non-profit DNS provider) over encrypted DNS-over-HTTPS — no plaintext DNS in typical operation, barring misconfiguration or fallback conditions
- Admin panels accessible only via SSH tunnel — no publicly exposed web admin interfaces
- Runs on your existing VPS at zero additional cost
Prerequisites
- Hetzner VPS running Ubuntu 24 (or similar Debian-based distro)
- SSH access to the VPS
- Docker and Docker Compose installed (see Step 1 if not)
Step 1: Install Docker (skip if already installed)
SSH into your VPS:
ssh your-username@YOUR_VPS_IP
Install Docker:
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
Log out and back in for the group change to take effect:
exit
ssh your-username@YOUR_VPS_IP
Install Docker Compose and verify:
sudo apt install docker-compose-plugin
docker --version
docker compose version
Step 2: Open the WireGuard Port
WireGuard uses UDP port 51820. Open only this port — the admin panels stay closed and are accessed securely via SSH tunnel instead.
sudo ufw allow OpenSSH
sudo ufw allow 51820/udp
sudo ufw enable
sudo ufw status
Important: ufw enable activates the firewall. On a fresh Hetzner VPS, UFW is installed but inactive by default — meaning ufw allow rules exist on paper but are not enforced until you run ufw enable. Always allow SSH (OpenSSH) before enabling, or you will lock yourself out.
If you’re also using Hetzner’s cloud firewall (check Hetzner Cloud Console → your server → Firewalls), add one inbound rule:
- Protocol: UDP, Port: 51820, Source: Any
Do not open ports 51821 (wg-easy GUI) or 80 (Pi-hole GUI) in either firewall.
Step 3: Generate a Password Hash
wg-easy requires a bcrypt hash rather than a plaintext password. Generate one:
docker run --rm -it ghcr.io/wg-easy/wg-easy wgpw 'YOUR_PASSWORD_HERE'
The single quotes keep the shell from interpreting special characters in your password; --rm removes the throwaway container once it exits.
This outputs a hash like:
$2a$12$cIBKkxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Copy this hash — you’ll need it in Step 4. Remember the plaintext password you used; that’s what you’ll type to log in.
Step 4: Create the Docker Compose File
Create a project directory:
mkdir ~/vpn && cd ~/vpn
nano docker-compose.yml
Paste this configuration:
services:
  wg-easy:
    image: ghcr.io/wg-easy/wg-easy
    container_name: wg-easy
    environment:
      - WG_HOST=YOUR_VPS_IP
      - PASSWORD_HASH=YOUR_HASH_HERE
      - WG_DEFAULT_DNS=10.8.1.3
      - WG_ALLOWED_IPS=0.0.0.0/0
    volumes:
      - ~/.wg-easy:/etc/wireguard
    ports:
      - "51820:51820/udp"
      - "127.0.0.1:51821:51821/tcp"
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
      - net.ipv4.ip_forward=1
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.2
  pihole:
    image: pihole/pihole:latest
    container_name: pihole
    environment:
      - WEBPASSWORD=CHOOSE_A_PIHOLE_PASSWORD
      - DNSMASQ_LISTENING=all
      - PIHOLE_DNS_=10.8.1.4#5053
    volumes:
      - ./pihole/etc-pihole:/etc/pihole
      - ./pihole/etc-dnsmasq.d:/etc/dnsmasq.d
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.3
  dnscrypt:
    image: klutchell/dnscrypt-proxy:latest
    container_name: dnscrypt
    volumes:
      - ./dnscrypt/dnscrypt-proxy.toml:/config/dnscrypt-proxy.toml
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.4
networks:
  vpn_net:
    ipam:
      config:
        - subnet: 10.8.1.0/24
Replace three things before saving:
| Placeholder | Replace with |
|---|---|
| YOUR_VPS_IP | Your Hetzner VPS public IPv4 address |
| YOUR_HASH_HERE | Your bcrypt hash from Step 3 |
| CHOOSE_A_PIHOLE_PASSWORD | A password for Pi-hole’s admin dashboard |
CRITICAL: Escape the $ signs in your hash. Docker Compose interprets $ as variable references. Double every $ in the hash. For example:
# Original hash:
$2a$12$cIBKkxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# In docker-compose.yml (every $ becomes $$):
$$2a$$12$$cIBKkxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
If you skip this, the hash gets corrupted and you’ll get “Unauthorized” when trying to log in.
Save and exit: Ctrl+O, Enter, Ctrl+X.
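The doubling can be done mechanically rather than by hand. A minimal sketch (the hash below is a dummy value, not a real bcrypt hash):

```shell
# Double every $ in the bcrypt hash so Docker Compose treats it literally
# instead of expanding it as a variable reference.
HASH='$2a$12$cIBKkexampleexampleexample'   # dummy value for illustration
ESCAPED=$(printf '%s' "$HASH" | sed 's/\$/$$/g')
echo "$ESCAPED"   # -> $$2a$$12$$cIBKkexampleexampleexample
```

Paste the printed value into the PASSWORD_HASH line of docker-compose.yml.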
How the DNS chain works:
WG_DEFAULT_DNS=10.8.1.3 points WireGuard clients to Pi-hole’s internal IP. Pi-hole resolves queries locally (blocking ads/trackers) and forwards the rest to dnscrypt-proxy (10.8.1.4), which encrypts them using DNS-over-HTTPS before sending to Quad9. The entire DNS chain is encrypted — no plaintext DNS in typical operation, barring misconfiguration or fallback conditions.
Your device → WireGuard tunnel (encrypted) → Pi-hole (blocks ads) → dnscrypt-proxy (encrypts DNS via DoH) → Quad9 (resolves)
What each setting does:
| Setting | Purpose |
|---|---|
| WG_HOST | Your VPS public IP — clients connect to this |
| PASSWORD_HASH | Bcrypt hash protecting the web admin panel |
| WG_DEFAULT_DNS=10.8.1.3 | Points client DNS to Pi-hole |
| WG_ALLOWED_IPS=0.0.0.0/0 | Route ALL client traffic through VPN |
| PIHOLE_DNS_=10.8.1.4#5053 | Pi-hole forwards to dnscrypt-proxy (DNS-over-HTTPS proxy) |
| DNSMASQ_LISTENING=all | Pi-hole accepts DNS queries from the Docker network |
| dnscrypt-proxy upstream | Encrypts DNS queries to Quad9 (dns.quad9.net) via HTTPS |
| Port 51820/udp | WireGuard tunnel |
| Port 51821/tcp | wg-easy admin panel (only via SSH tunnel) |
| 10.8.1.0/24 network | Internal Docker network connecting the three containers |
Step 5: Create the dnscrypt-proxy Config
dnscrypt-proxy needs a config file to know which upstream DNS server to use:
mkdir -p ~/vpn/dnscrypt
nano ~/vpn/dnscrypt/dnscrypt-proxy.toml
Paste:
listen_addresses = ['0.0.0.0:5053']
server_names = ['quad9-doh-ip4-port443-nofilter-ecs-pri']
[sources]
[sources.'public-resolvers']
urls = ['https://raw.githubusercontent.com/DNSCrypt/dnscrypt-resolvers/master/v3/public-resolvers.md', 'https://download.dnscrypt.info/resolvers-list/v3/public-resolvers.md']
cache_file = '/config/public-resolvers.md'
minisign_key = 'RWQf6LRCGA9i53mlYecO4IzT51TGPpvWucNSCh1CBM0QTaLn73Y7GFO3'
Save and exit: Ctrl+O, Enter, Ctrl+X.
This tells dnscrypt-proxy to listen on port 5053 and forward all DNS queries to Quad9 over encrypted DNS-over-HTTPS.
Step 6: Launch Everything
cd ~/vpn
docker compose up -d
Verify all three containers are running:
docker ps
You should see wg-easy, pihole, and dnscrypt all with status Up.
Secure the WireGuard client keys and the Compose file (which contains your Pi-hole password):
chmod 700 ~/.wg-easy
chmod 600 ~/vpn/docker-compose.yml
If any container is in a Restarting state, check its logs:
docker logs wg-easy
docker logs pihole
docker logs dnscrypt
Step 7: Create Client Configs via SSH Tunnel
Since port 51821 is not exposed to the internet, you access the web GUI through an encrypted SSH tunnel.
Open a second terminal window on your laptop (keep your VPS session in the first) and run:
ssh -L 51821:localhost:51821 your-username@YOUR_VPS_IP
This forwards your laptop’s port 51821 through SSH to the VPS. Keep this terminal open.
Now open your browser and go to:
http://localhost:51821
- Enter the plaintext password you used in Step 3 (not the hash)
- Click "+ New"
- Name your first client (e.g., laptop, phone, tablet)
- A config file and QR code are generated automatically
Repeat for each device you want to connect.
When you’re done, you can close the SSH tunnel (Ctrl+C). The VPN keeps running — you only need the tunnel when managing clients.
Step 8: Connect Your Devices
Laptop (macOS / Windows / Linux)
- Download the WireGuard app:
- macOS: App Store → search “WireGuard”
- Windows: https://www.wireguard.com/install/
- Linux:
sudo apt install wireguard
- In the wg-easy web GUI, click the download icon next to your laptop client
- This downloads a .conf file
- Open WireGuard app → “Import Tunnel from File” → select the .conf file
- Click Activate
Phone (Android / iPhone)
- Install the WireGuard app from Play Store or App Store
- In the wg-easy web GUI, click the QR code icon next to your phone client
- Open WireGuard app on phone → tap + → Scan from QR code
- Point camera at the QR code on your screen
- Toggle the tunnel on
Phone tips:
- Android: Add a Quick Settings tile (swipe down → edit → drag WireGuard tile) for one-tap toggling. You can also exclude apps from the VPN: tunnel settings → Excluded Applications → select banking/UPI apps.
- iPhone: No per-app exclusion (iOS limitation). Use On-Demand rules instead: tunnel settings → On Demand → auto-activate on untrusted wifi, deactivate on home wifi. Toggle VPN off manually when using banking apps.
Step 9: Verify Everything Works
Check your IP:
Visit https://whatismyipaddress.com — it should show a German IP address (Hetzner’s range), not your home ISP.
Or from terminal:
curl ifconfig.me
Check for DNS leaks:
Visit https://dnsleaktest.com — click Extended Test. The results should show a single server in Germany. It may display as Cloudflare Frankfurt rather than Quad9 — this can be normal due to routing through shared infrastructure, but it can also indicate a misconfiguration. If you see unexpected results, verify with dnscrypt-proxy logs: docker logs dnscrypt 2>&1 | tail -20. The important thing is: one server, in Germany, not your home ISP’s DNS servers.
Check Pi-hole is blocking ads:
Visit https://ads-blocker.com/testing/ — most test ads should be blocked.
Or from terminal:
nslookup ads.google.com
If Pi-hole is working, this returns 0.0.0.0 or NXDOMAIN (blocked). A real IP address means Pi-hole isn’t intercepting DNS.
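To spot-check several domains at once, the blocked/not-blocked decision can be factored into a small helper. A sketch (is_blocked is a hypothetical name; the domains in the usage comment are examples):

```shell
# is_blocked: classify a resolved address the way Pi-hole blocking appears --
# a blocked domain resolves to 0.0.0.0, ::, or nothing at all.
is_blocked() {
  case "$1" in
    0.0.0.0|::|'') echo blocked ;;
    *)             echo "not blocked ($1)" ;;
  esac
}

# Usage from a VPN-connected device (dig must be installed):
#   for d in ads.google.com doubleclick.net example.com; do
#     printf '%s: %s\n' "$d" "$(is_blocked "$(dig +short "$d" | head -1)")"
#   done
```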
Check for WebRTC leaks:
Visit https://browserleaks.com/webrtc — WebRTC can in some configurations bypass VPNs and expose your real IP through your browser. Modern browsers mitigate this with mDNS, but check anyway. If your real IP appears here, disable WebRTC in browser settings (Arkenfox does this automatically).
All four should confirm:
- IP → German (Hetzner)
- DNS → Quad9 (Swiss)
- Ads → Blocked (Pi-hole)
- WebRTC → No leak
Check the Pi-hole dashboard:
Open another SSH tunnel:
ssh -L 8080:10.8.1.3:80 your-username@YOUR_VPS_IP
Open http://localhost:8080/admin in your browser (use the password you set in WEBPASSWORD in the Compose file). Browse normally for a minute, then refresh — you should see queries climbing and often 20-40% being blocked (varies by device and usage).
Verify the upstream DNS is correct. Go to Settings → DNS in the Pi-hole dashboard. The only upstream server should be 10.8.1.4#5053 (your dnscrypt-proxy container). If Google or anything else is ticked, untick it. Enter 10.8.1.4#5053 in the Custom DNS field if it’s not already set, and hit Save.
Expected performance:
- Ping: 200-300ms (normal for your location → Germany round trip)
- Download: close to your raw ISP speed (minus encryption overhead)
- A VPN will not increase your speed, but may help if your ISP throttles specific services
Step 10: Add Pi-hole Blocklists (Optional)
Pi-hole comes with a default blocklist. For more comprehensive blocking, add these in the Pi-hole admin dashboard → Adlists:
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://raw.githubusercontent.com/hagezi/dns-blocklists/main/adblock/pro.txt
After adding, go to Tools → Update Gravity to activate them.
Step 11: Harden the VPS (Recommended)
Disable Pi-hole Logging
For consistency with your minimal-logging approach, disable Pi-hole’s query log (it’s separate from system logs):
- In the Pi-hole dashboard, go to Settings → Privacy
- Set the privacy level to Anonymous mode (highest level)
- Hit Save
Then disable the long-term query database:
docker exec pihole bash -c "echo 'MAXDBDAYS=0' >> /etc/pihole/pihole-FTL.conf"
docker restart pihole
Ad blocking still works — Pi-hole doesn’t need logs to block. You just lose the dashboard’s historical stats. If you need to debug a blocked site later, temporarily re-enable logging.
Note on Pi-hole password: Since the dashboard is only accessible via SSH tunnel (already authenticated), a Pi-hole password is optional. To remove it: docker exec pihole pihole setpassword and press Enter twice when prompted.
Reduce System Logging
Prevent persistent logs of network activity on the VPS:
sudo nano /etc/systemd/journald.conf
Add under [Journal]:
Storage=volatile
MaxRetentionSec=1day
Save (Ctrl+O, Enter, Ctrl+X), then restart:
sudo systemctl restart systemd-journald
Storage=volatile keeps logs in RAM only — nothing on disk (though still accessible to a privileged process while the system is running), nothing survives a reboot. MaxRetentionSec=1day discards in-memory logs after 24 hours. This trades forensic visibility for reduced data retention — if an intermittent issue or intrusion occurs, you may have no logs to investigate. This is a deliberate choice, not an oversight.
Automatic Security Updates
Install unattended-upgrades to auto-install security patches daily:
sudo apt install unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades
Select Yes when the dialog appears. Verify it’s running:
sudo systemctl status unattended-upgrades
Reboot gap: unattended-upgrades installs patches but does not reboot the server. Many security fixes — especially kernel updates — only take effect after a reboot. A server can report “up to date” while still running a vulnerable kernel from months ago. Either reboot manually after kernel updates (check with needrestart or cat /var/run/reboot-required), or install needrestart to be alerted when a reboot is needed:
sudo apt install needrestart
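The flag-file check can be wrapped for use in scripts or cron. A minimal sketch (reboot_pending is a hypothetical helper; the flag path is the one Ubuntu's update-notifier uses):

```shell
# reboot_pending: report whether Ubuntu has flagged a pending reboot.
# The flag file is created when an installed package requests a reboot.
reboot_pending() {
  flag="${1:-/var/run/reboot-required}"
  if [ -f "$flag" ]; then
    echo "reboot required"
  else
    echo "no reboot needed"
  fi
}

reboot_pending   # check the real flag on the VPS
```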
Change SSH Port (Optional)
Moving SSH off the default port 22 to a random high port reduces commodity scanning noise (but does not prevent targeted scanning):
sudo nano /etc/ssh/sshd_config
Find the line #Port 22 (or Port 22) and change it to:
Port 48922
Save, then update the firewall before restarting SSH:
sudo ufw allow 48922/tcp
sudo ufw status # Verify new port is listed
sudo systemctl restart sshd
Note: since Ubuntu 22.10, SSH is socket-activated by default, and the listening port can come from the systemd socket unit rather than sshd_config alone. If the new port doesn’t take effect, run sudo systemctl daemon-reload && sudo systemctl restart ssh.socket, or disable socket activation entirely with sudo systemctl disable --now ssh.socket before restarting SSH.
Test the new port in a separate terminal before closing your current session:
ssh -p 48922 your-username@YOUR_VPS_IP
If that works, remove the old port:
sudo ufw delete allow 22/tcp
After this change, all SSH commands need -p 48922:
# Regular SSH:
ssh -p 48922 your-username@YOUR_VPS_IP
# SSH tunnels for admin panels:
ssh -p 48922 -L 51821:localhost:51821 your-username@YOUR_VPS_IP
ssh -p 48922 -L 8080:10.8.1.3:80 your-username@YOUR_VPS_IP
If you’re using Hetzner’s cloud firewall, add an inbound rule for TCP port 48922.
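To avoid typing -p 48922 on every command, an entry in ~/.ssh/config on your laptop sets it once. An optional sketch (vps is an arbitrary alias; substitute your real IP and username):

```
Host vps
    HostName YOUR_VPS_IP
    User your-username
    Port 48922
```

After this, ssh vps and ssh -L 51821:localhost:51821 vps work without the port flag.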
Install Fail2ban
Protects SSH from brute-force attacks by banning IPs after repeated failed login attempts:
sudo apt install fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban
Check how many IPs it’s currently blocking:
sudo fail2ban-client status sshd
Switch to SSH Key-Only Authentication (Optional)
If you’re still using password login for SSH, key-based auth is more secure:
sudo nano /etc/ssh/sshd_config
Set:
PasswordAuthentication no
PermitRootLogin no
AllowUsers YOUR_USER
PermitRootLogin no prevents direct root login even if the root account has a password. AllowUsers restricts SSH to only your username — any other system account is locked out entirely. Replace YOUR_USER with your actual username.
Save, then restart:
sudo systemctl restart sshd
Only do this after confirming your SSH key works, or you’ll lock yourself out. Test by opening a second terminal and SSH-ing in before closing your current session.
Split Tunneling (Optional)
By default, ALL traffic routes through Germany. This can cause issues with banking sites and payment apps that flag foreign IPs.
On Android
In the WireGuard app → tap your tunnel → Edit → Excluded Applications → select your banking apps, payment apps (digital wallets, UPI, etc.), and regional streaming apps.
On iPhone
No per-app exclusion available. Toggle VPN off manually for banking, or set up On-Demand rules to auto-disable on your home wifi.
On Laptop
WireGuard’s AllowedIPs is an allow-list, not a deny-list — there is no simple way to exclude specific IP ranges. The common trick of using 0.0.0.0/1, 128.0.0.0/1 still covers the entire IPv4 space (it overrides the default route via more-specific routes, but excludes nothing). For laptops, the practical approach is to toggle the VPN off when you need banking or government portals, then toggle it back on. This is less elegant than Android’s per-app exclusion, but it is the honest answer for WireGuard on desktop.
Syncthing — File Sync Across Devices (Optional)
Syncthing syncs files between your devices peer-to-peer with the VPS acting as an always-on peer (and relay fallback if needed). Useful for keeping your Obsidian vault, research papers, teaching materials, or any folder in sync across your Mac and phone.
Add Syncthing to Docker Compose
In ~/vpn/docker-compose.yml, add this service (same indentation level as the other services):
  syncthing:
    image: syncthing/syncthing:latest
    container_name: syncthing
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - ~/syncthing/config:/var/syncthing/config
      - ~/syncthing/data:/var/syncthing/data
    ports:
      - "22000:22000/tcp"
      - "22000:22000/udp"
      - "21027:21027/udp"
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.5
Note on port exposure: Unlike other services in this guide, Syncthing’s sync ports (22000, 21027) are bound to all interfaces, not just localhost or the Tailscale IP (introduced in Part 2). This is intentional — Syncthing needs to accept direct connections from your other devices to enable peer-to-peer sync. The web GUI (port 8384) is not exposed and is accessible only via SSH tunnel. The sync protocol itself is encrypted and authenticated; open sync ports do not expose your files.
Create directories and set permissions
mkdir -p ~/syncthing/config ~/syncthing/data
sudo chown -R 1000:1000 ~/syncthing
Open firewall ports and launch
sudo ufw allow 22000/tcp
sudo ufw allow 22000/udp
sudo ufw allow 21027/udp
cd ~/vpn
docker compose up -d
Verify:
docker ps | grep syncthing
Should show status Up, not Restarting.
Access the Syncthing Dashboard
Important: All SSH tunnel commands must be run from your local machine’s terminal (Mac/laptop), NOT from inside an existing SSH session to the VPS.
Install Syncthing on Your Mac
brew install syncthing
brew services start syncthing
Or download from https://syncthing.net/downloads/.
Syncthing’s GUI may not always run on port 8384. Find the actual port:
lsof -i -P | grep syncthing
Look for a line with TCP localhost:XXXXX (LISTEN) — that’s the port. Open http://localhost:XXXXX in your browser to access your Mac’s Syncthing dashboard.
Access the VPS Syncthing Dashboard
Since the Mac’s Syncthing may already be using port 8384, use a different local port for the VPS tunnel. Run this from a terminal on your Mac (not inside an SSH session):
ssh -L 8385:10.8.1.5:8384 your-username@YOUR_VPS_IP
Note: This tunnel targets 10.8.1.5 (Syncthing’s IP on the Docker bridge network), not 127.0.0.1 like other SSH tunnels in this guide. That’s because Syncthing’s web GUI (port 8384) is not exposed in the Docker Compose file — it’s only accessible inside the Docker network. SSH on the VPS host can route into Docker bridge networks, so this works, but the access pattern is different from the other admin panels.
Open your browser and go to:
http://localhost:8385
You now have two dashboards:
- http://localhost:XXXXX — your Mac’s Syncthing (the port you found above)
- http://localhost:8385 — your VPS’s Syncthing (via SSH tunnel)
On first launch, the VPS dashboard will prompt you to set a GUI password. Set one — even though it’s behind an SSH tunnel, it’s good practice.
Connect the Devices
- Get the VPS Device ID: In the VPS Syncthing dashboard (localhost:8385), go to Actions > Show ID – copy it
- Add VPS to Mac: In your Mac’s Syncthing dashboard (localhost:XXXXX), click Add Remote Device > paste the VPS Device ID > Save
- Accept on VPS: The VPS dashboard will show a notification to accept the new device – click Add Device > Save
- Wait for connection: Both dashboards should show the other device as “Connected” (green)
Share Your First Folder
- On the VPS dashboard (localhost:8385), click Add Folder
- Folder Label: Obsidian
- Folder Path: /var/syncthing/data/obsidian
- Click the Sharing tab > tick your Mac
- Click the File Versioning tab > select Staggered File Versioning (keeps deleted/changed files for 30 days on the VPS – see below)
- Click Save
- On the Mac dashboard (localhost:XXXXX), a notification will appear – click Add > set the Folder Path to your existing vault location (e.g., ~/Documents/Obsidian) > Save
The initial sync will copy everything from your Mac to the VPS. Don’t open Obsidian until the sync finishes – you can watch progress on either dashboard.
Enable Staggered File Versioning
Do this on the VPS side for every shared folder before the first sync. Staggered File Versioning keeps old versions of deleted or changed files in a .stversions folder on the VPS with decreasing frequency:
- Every version for the first 24 hours
- One version per day for the first 30 days
- One version per week for the first 6 months
- One version per year after that
This means if you accidentally delete a file on your Mac, the deletion syncs to the VPS, but the old version is preserved in .stversions and can be recovered. Set this on the VPS rather than the Mac so backup copies live on the server, not on your laptop.
To enable: click the folder on the VPS dashboard > Edit > File Versioning tab > select Staggered File Versioning > Save.
Syncthing syncs deletions. If you delete a file on your Mac, it’s deleted on the VPS too. Staggered versioning is your safety net – without it, deletions are permanent and immediate.
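Recovering a preserved copy means finding the newest file in .stversions. A sketch helper, assuming Syncthing's default naming of name~YYYYMMDD-HHMMSS.ext (newest_version and the paths shown are illustrative):

```shell
# newest_version: print the most recently modified preserved copy matching a
# pattern inside a .stversions directory.
newest_version() {
  dir="$1"; pattern="$2"
  # pattern is deliberately unquoted so the shell globs it within $dir
  ls -t "$dir"/$pattern 2>/dev/null | head -1
}

# Usage on the VPS, e.g. to find the latest preserved copy of notes.md:
#   newest_version /var/syncthing/data/obsidian/.stversions 'notes~*.md'
```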
Prevent Sync Conflicts
In your synced folder, create a file called .stignore to exclude files that change per-device and cause conflicts:
.obsidian/workspace.json
.obsidian/workspace-mobile.json
.trash
Adding More Folders
Each folder you want to sync is added as a separate shared folder in Syncthing. Repeat the same process for each:
- VPS dashboard > Add Folder > set path (e.g., /var/syncthing/data/papers) > give it a label > Sharing tab > tick Mac > File Versioning tab > Staggered File Versioning > Save
- Mac dashboard > accept the notification > point to your local folder (e.g., ~/Documents/Papers) > Save
Takes about 30 seconds per folder. Keeping folders separate (rather than syncing one parent folder) lets you control which devices get which folders – e.g., Obsidian on your phone, but not teaching materials.
iPhone
There’s no official Syncthing app for iOS. Use Möbius Sync from the App Store (~$5 one-time) – it’s a third-party Syncthing client that works with the same protocol.
Part 2: Tailscale + Storage Box
The VPN handles encrypted browsing. Tailscale handles private access to services. The distinction matters: WireGuard routes your internet traffic through Germany; Tailscale creates a mesh network that lets your devices reach services on the VPS without opening any ports to the public internet. They complement each other.
Tailscale Mesh Network
Install on the VPS
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
Follow the authentication URL. Note your Tailscale IP:
tailscale ip -4
Returns something like 100.x.x.x. Services bound to this IP are accessible only from devices on your Tailscale network.
Install on Other Devices
- macOS: brew install tailscale, or download from https://tailscale.com/download
- iPhone/Android: Install from App Store / Play Store
- Linux: Same curl command as above
Sign in with the same account everywhere.
Verify
From your laptop (with Tailscale running):
ping YOUR_TAILSCALE_IP
If it responds, your mesh is working. Any service bound to this IP on the VPS is now accessible from your devices only.
A Note on Access Paths
Services bound to the Tailscale IP are reachable two ways:
- Tailscale on — your device connects directly to the VPS via the mesh network. Works regardless of whether WireGuard is active.
- WireGuard on — your device routes all traffic through the VPS. Since the Tailscale IP is a local interface on the VPS, requests to it resolve locally on the server. This works even without Tailscale running on your device.
The practical difference: WireGuard routes everything through Germany (browsing, streaming, all traffic). Tailscale connects only to your services. If you’ve toggled WireGuard off — for banking, for government portals, for streaming — Tailscale still gives you access to Kiwix, the search app, Stirling PDF, and everything else on the VPS without rerouting your entire internet connection.
A Note on Tailscale’s Trust Model
Tailscale is not self-hosted. It uses a coordination server operated by Tailscale Inc. (a US company) to manage device identity and key exchange. The actual traffic between your devices is peer-to-peer and encrypted — Tailscale’s servers cannot read the encrypted contents, though relay servers can observe connection metadata — but the coordination server knows which devices are on your network and when they’re online. If your Tailscale account is compromised (e.g., via a compromised Google or Microsoft login), an attacker gains network-level access to every service on your mesh. Mitigation: enable multi-factor authentication on your Tailscale account, and prefer a login provider that supports hardware security keys. For those who want to eliminate this dependency entirely, Headscale is an open-source, self-hosted alternative to Tailscale’s coordination server — but it adds significant operational complexity for marginal benefit at the personal infrastructure scale.
Mount Hetzner Storage Box via SSHFS
The Storage Box provides 1 TB of remote storage. Mounting it via SSHFS makes it appear as a local directory, usable by Docker containers and scripts. This is where the Gutenberg library and other large files live — the VPS’s 40 GB local disk is too small for a 200 GB book collection.
Install and Mount
sudo apt install sshfs
sudo mkdir -p /mnt/storagebox
If you already set up an SSH key for BorgBackup, reuse it. Otherwise:
ssh-keygen -t ed25519 -f ~/.ssh/storagebox -N ""
echo "put ~/.ssh/storagebox.pub .ssh/authorized_keys" | sftp -P 23 uXXXXXX@uXXXXXX.your-storagebox.de
Mount:
sudo sshfs -o allow_other,_netdev,IdentityFile=~/.ssh/storagebox,Port=23 \
uXXXXXX@uXXXXXX.your-storagebox.de:/ /mnt/storagebox
Auto-Mount on Boot
Add to /etc/fstab:
uXXXXXX@uXXXXXX.your-storagebox.de:/ /mnt/storagebox fuse.sshfs _netdev,allow_other,IdentityFile=/home/YOUR_USER/.ssh/storagebox,Port=23,x-systemd.automount,reconnect 0 0
Test: sudo mount -a
Performance Note
SSHFS adds a network round trip to every file read. For sequential access — streaming a book, downloading a file — this is negligible. For random access — searching a ZIM archive, querying a SQLite database — it’s painfully slow. Rule of thumb: anything that needs fast random reads (databases, search indexes) goes on local VPS disk. Everything else (books, backups, large archives) goes on the Storage Box.
Mount drop warning: If the SSH connection to the Storage Box drops (network interruption, Hetzner maintenance), any process trying to read from /mnt/storagebox will hang — potentially entering uninterruptible I/O wait (D state), which can make the entire VPS feel frozen even if the CPU is idle. The reconnect option in the fstab entry helps, but doesn’t prevent brief hangs during reconnection. If your VPS becomes unresponsive, check the mount first: df -h /mnt/storagebox. If it hangs, the mount is stale — unmount and remount: sudo umount -l /mnt/storagebox && sudo mount -a. Services that depend on the Storage Box (Kiwix) will be unavailable during the interruption; services on local disk (Gutenberg Search, Miniflux, Pi-hole) are unaffected.
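That recovery sequence can be wrapped in a guard that scripts run before touching the mount. A sketch (check_mount is a hypothetical helper; the five-second timeout is arbitrary):

```shell
# check_mount: probe the SSHFS mount with a timeout so a stale mount cannot
# hang the calling script; lazy-unmount and remount if it is unresponsive.
check_mount() {
  mp="${1:-/mnt/storagebox}"
  if timeout 5 df -P "$mp" >/dev/null 2>&1; then
    echo "ok: $mp responsive"
  else
    echo "stale: remounting $mp"
    sudo umount -l "$mp" && sudo mount -a
  fi
}

# Usage at the top of a backup or library script:
#   check_mount /mnt/storagebox || exit 1
```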
BorgBackup — Encrypted Nightly Backups to Hetzner Storage Box (Optional)
BorgBackup sends encrypted, compressed, deduplicated backups of your entire VPS configuration to a Hetzner Storage Box every night. If your VPS dies, you can rebuild everything from the backup.
Prerequisites
- A Hetzner Storage Box with SSH enabled in its settings panel
- Your Storage Box username (format: uXXXXXX) and hostname (format: uXXXXXX.your-storagebox.de)
Install BorgBackup
sudo apt install borgbackup
Set Up SSH Key Authentication
If you already generated and uploaded a Storage Box key during SSHFS setup above, skip this step — it’s the same key. If not, generate one:
ssh-keygen -t ed25519 -f ~/.ssh/storagebox -N ""
Upload the public key via SFTP (Hetzner doesn’t allow ssh-copy-id on Storage Boxes):
echo "put ~/.ssh/storagebox.pub .ssh/authorized_keys" | sftp -P 23 uXXXXXX@uXXXXXX.your-storagebox.de
Warning: This put command overwrites the authorized_keys file. If you have already uploaded a key (e.g., during SSHFS setup), running this again with a different key will revoke the previous one. If using the same key for both, skip this step entirely.
Enter your Storage Box password when prompted. If the .ssh directory doesn’t exist, connect manually first:
sftp -P 23 uXXXXXX@uXXXXXX.your-storagebox.de
mkdir .ssh
chmod 700 .ssh
exit
Then run the upload command again.
Test the connection:
ssh -i ~/.ssh/storagebox -p 23 uXXXXXX@uXXXXXX.your-storagebox.de
You’ll get a “restricted shell” message — that’s normal. As long as it doesn’t ask for a password, key auth is working.
Initialize the Borg Repository
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
borg init --encryption=repokey ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups
Choose a strong passphrase. Write it down somewhere safe — you need it to restore backups.
Export the Encryption Key
If the Storage Box dies, you lose the repo key and can’t decrypt your backups even with the passphrase. Export a backup of the key:
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
borg key export ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups ~/borg-key-backup.txt
cat ~/borg-key-backup.txt
Save this key file somewhere safe (password manager, printed on paper). You need both the passphrase and this key to restore. Lose either one and your backups are unrecoverable.
Create the Backup Script
nano ~/backup.sh
#!/bin/bash
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
export BORG_REPO="ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups"
export BORG_PASSPHRASE='YOUR_PASSPHRASE_HERE'
# Dump Miniflux database before backup
docker exec miniflux-db pg_dump -U miniflux miniflux > /home/YOUR_USER/miniflux/db-backup.sql
# Create backup
sudo --preserve-env borg create \
--compression zstd \
::vps-{now:%Y-%m-%d-%H%M} \
/home/YOUR_USER/vpn/docker-compose.yml \
/home/YOUR_USER/vpn/dnscrypt \
/home/YOUR_USER/vpn/pihole \
/home/YOUR_USER/.wg-easy \
/home/YOUR_USER/syncthing \
/home/YOUR_USER/miniflux \
/home/YOUR_USER/kiwix \
/home/YOUR_USER/gutenberg-search \
/home/YOUR_USER/stirling-pdf \
/home/YOUR_USER/.ssh \
/var/www/hugo \
/var/www/other \
/etc/nginx
# Prune old backups: keep 7 daily, 4 weekly, 6 monthly
sudo --preserve-env borg prune \
--keep-daily 7 \
--keep-weekly 4 \
--keep-monthly 6
# Free up space from pruned backups
sudo --preserve-env borg compact
# Fix cache permissions (sudo changes ownership to root)
sudo chown -R YOUR_USER:YOUR_USER /home/YOUR_USER/.cache/borg /home/YOUR_USER/.config/borg
Replace uXXXXXX with your Storage Box username and YOUR_PASSPHRASE_HERE with your passphrase. Use single quotes around the passphrase to prevent bash from interpreting special characters.
Make it executable and restrict permissions (the file contains your passphrase):
chmod 700 ~/backup.sh
Test the Backup
~/backup.sh
First run takes a minute or two. Verify:
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
export BORG_REPO="ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups"
export BORG_PASSPHRASE='YOUR_PASSPHRASE_HERE'
borg list
Should show an archive like vps-2026-02-22-1824.
Automate with Cron
crontab -e
Add:
0 3 * * * /home/YOUR_USER/backup.sh >> /home/YOUR_USER/backup.log 2>&1
This runs the backup every night at 3 AM UTC. Check ~/backup.log if you want to verify it ran.
Additional cron jobs (added after Miniflux and Telegram setup):
# Daily reading digest at 6:00 AM UTC
0 6 * * * ~/miniflux/scripts/run.sh miniflux-telegram-digest.py >> ~/miniflux/scripts/digest.log 2>&1
# Hourly health check — alerts only on problems
0 * * * * ~/miniflux/scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
# Daily health summary at 7:00 AM UTC
0 7 * * * ~/miniflux/scripts/run.sh vps-health-monitor.py --daily >> ~/miniflux/scripts/health.log 2>&1
# Weekly log rotation — prevents logs from growing indefinitely
0 0 * * 0 tail -500 ~/miniflux/scripts/health.log > ~/miniflux/scripts/health.log.tmp && mv ~/miniflux/scripts/health.log.tmp ~/miniflux/scripts/health.log
0 0 * * 0 tail -200 ~/miniflux/scripts/digest.log > ~/miniflux/scripts/digest.log.tmp && mv ~/miniflux/scripts/digest.log.tmp ~/miniflux/scripts/digest.log
0 0 * * 0 tail -200 ~/backup.log > ~/backup.log.tmp && mv ~/backup.log.tmp ~/backup.log
See Part 5 for full setup.
What Gets Backed Up
| Path | Contents |
|---|---|
| ~/vpn/docker-compose.yml | All container configurations |
| ~/vpn/dnscrypt | dnscrypt-proxy config |
| ~/vpn/pihole | Pi-hole settings and blocklists |
| ~/.wg-easy | WireGuard client configs |
| ~/syncthing | Syncthing config and synced data |
| ~/.ssh | All SSH keys (including Storage Box key) |
| /var/www/hugo | Hugo website files |
| /var/www/other | Other website files |
| /etc/nginx | Nginx configs for both sites |
| ~/miniflux | Miniflux docker-compose, scripts, .env, database dump |
| ~/kiwix | Kiwix docker-compose (ZIM files live on Storage Box, not backed up here) |
| ~/gutenberg-search | Search app source, Dockerfile, docker-compose |
| ~/stirling-pdf | Stirling PDF docker-compose |
Restoring from Backup
To see what’s in a backup:
borg list ::vps-2026-02-22-1824
To restore everything to a temporary directory:
mkdir ~/restore
cd ~/restore
borg extract ::vps-2026-02-22-1824
To restore a specific file:
borg extract ::vps-2026-02-22-1824 home/YOUR_USER/vpn/docker-compose.yml
Backup Retention
Borg keeps:
- Last 7 daily backups
- Last 4 weekly backups
- Last 6 monthly backups
Older backups are pruned automatically. Deduplication means only changes are stored, so space usage stays small.
Test Your Restores
A backup that has never been tested is a hope, not a plan. Every three months, verify that recovery actually works:
- Spin up a temporary VPS (Hetzner bills hourly — a one-hour test costs cents)
- Install Borg:
sudo apt install borgbackup - Pull the latest archive and extract to a test directory
- Confirm configs, scripts, and database dumps are intact
- Delete the test VPS
This takes thirty minutes and confirms that your nightly backups are not silently failing, corrupting, or missing critical paths. Add a calendar reminder.
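The drill condenses to a handful of commands on the throwaway VPS (repository URL and archive name are placeholders — substitute your own):

```shell
# Quarterly restore drill — run on a temporary VPS, then delete it.
sudo apt install -y borgbackup
export BORG_REPO='ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups'  # your repo
export BORG_PASSPHRASE='your-passphrase'                                  # from your password manager

borg list                                 # newest archive is at the bottom
mkdir -p ~/restore-test && cd ~/restore-test
borg extract --dry-run ::ARCHIVE-NAME     # verify the archive reads cleanly
borg extract ::ARCHIVE-NAME               # then actually extract
ls home/*/miniflux/db-backup.sql          # spot-check critical files
```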
Backup Security
These backups contain SSH keys, the Borg passphrase, API tokens, Docker Compose files, and the complete infrastructure configuration. Anyone with access to a backup archive and the Borg passphrase has the equivalent of root access to your entire infrastructure. Treat backup archives with the same care as your SSH private keys — they are, in effect, a portable copy of your server’s identity.
Maintenance
Managing Containers
Each service runs in its own Docker Compose stack. docker compose commands only affect the stack in the current directory — running docker compose down from ~/vpn stops the VPN stack, not Miniflux or Kiwix.
VPN stack (from ~/vpn):
docker compose down # Stop VPN stack only
docker compose up -d # Start VPN stack
docker compose restart # Restart VPN stack
docker logs wg-easy # wg-easy logs
docker logs pihole # Pi-hole logs
docker logs dnscrypt # dnscrypt-proxy logs
docker logs syncthing # Syncthing logs
Other stacks — same commands, different directories:
cd ~/miniflux && docker compose down && docker compose up -d
cd ~/kiwix && docker compose down && docker compose up -d
cd ~/gutenberg-search && docker compose down && docker compose up -d
cd ~/stirling-pdf && docker compose down && docker compose up -d
docker logs and docker ps are container-global — they work from any directory.
Updating
cd ~/vpn
docker compose pull # Pull latest images
docker compose down
docker compose up -d
Repeat for each stack (~/miniflux, ~/kiwix, ~/gutenberg-search, ~/stirling-pdf).
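Repeating that by hand gets tedious; a loop covers all five stacks in one pass (the directory list assumes the layout in this guide — adjust if yours differs):

```shell
# Pull, stop, and restart every Docker Compose stack on this VPS
for d in ~/vpn ~/miniflux ~/kiwix ~/gutenberg-search ~/stirling-pdf; do
    (cd "$d" && docker compose pull && docker compose down && docker compose up -d)
done
```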
Client configs are preserved in ~/.wg-easy/. Pi-hole settings are preserved in ~/vpn/pihole/. dnscrypt-proxy config is preserved in ~/vpn/dnscrypt/. Syncthing config and data are preserved in ~/syncthing/.
Accessing Admin Panels
All panels require SSH tunnels. Run these commands from a terminal on your local machine (Mac/laptop), NOT from inside an SSH session to the VPS:
# wg-easy (manage VPN clients):
ssh -L 51821:localhost:51821 your-username@YOUR_VPS_IP
# Then open: http://localhost:51821
# Pi-hole (view blocked queries, manage blocklists):
ssh -L 8080:10.8.1.3:80 your-username@YOUR_VPS_IP
# Then open: http://localhost:8080/admin
# Syncthing (manage synced folders and devices):
ssh -L 8385:10.8.1.5:8384 your-username@YOUR_VPS_IP
# Then open: http://localhost:8385
# Miniflux (RSS reader):
ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N
# Then open: http://localhost:8090
View Connected Clients
Via the wg-easy web GUI, or from the VPS terminal:
docker exec wg-easy wg show
Troubleshooting
| Problem | Fix |
|---|---|
| “Unauthorized” on wg-easy login | The bcrypt hash was corrupted. Make sure every $ in the hash is doubled ($$) in docker-compose.yml. Recreate with docker compose down && docker compose up -d |
| Can’t reach web GUI | Make sure your SSH tunnel is running: ssh -L 51821:localhost:51821 user@VPS_IP, then open http://localhost:51821 |
| Container stuck in “Restarting” | Check logs: docker logs wg-easy, docker logs pihole, or docker logs dnscrypt |
| Client connects but no internet | Check docker ps — all three containers must be Up. Restart with docker compose restart |
| Ads still showing | Some ads (YouTube, Facebook) are served from the same domain as content and can’t be DNS-blocked. Use uBlock Origin in your browser for those |
| Slow speeds | 200-300ms ping is normal for your location → Germany. Download speeds should be close to your ISP speed. Toggle VPN off for latency-sensitive tasks |
| Banking app blocked | Exclude the app from VPN (Android) or toggle VPN off temporarily (iPhone) |
| “Handshake did not complete” | Firewall blocking UDP 51820 — check both ufw and Hetzner cloud firewall |
| Container not starting after reboot | Ensure Docker is enabled: sudo systemctl enable docker |
| Can’t SSH after port change | Use ssh -p 48922 user@VPS_IP. If locked out, use Hetzner’s web console to fix /etc/ssh/sshd_config |
| Syncthing permission denied crash loop | Run sudo chown -R 1000:1000 ~/syncthing then cd ~/vpn && docker compose restart syncthing |
| SSH tunnel “Address already in use” | An old tunnel is still running. Find it with sudo lsof -i :PORT and kill the PID. Then retry the tunnel |
| Mac Syncthing dashboard not on port 8384 | Run lsof -i -P | grep syncthing and look for TCP localhost:XXXXX (LISTEN) — open that port in your browser |
| Syncthing “no configuration file provided” | You’re not in the right directory. Run cd ~/vpn first, then docker compose restart syncthing |
| Borg “Permission denied” on cache/config | Run sudo chown -R YOUR_USER:YOUR_USER /home/YOUR_USER/.cache/borg /home/YOUR_USER/.config/borg |
| Borg “passphrase is incorrect” | Special characters in passphrase being interpreted by bash. Use single quotes around the passphrase in backup.sh |
| Borg “stale lock” messages | Normal after a failed run. Borg cleans them up automatically on the next run |
| Pi-hole dashboard shows no queries | Client configs may still use old DNS. Delete and recreate clients in wg-easy, re-scan QR codes |
| DNS leak test shows Cloudflare | May appear due to shared or proxied infrastructure — verify with docker logs dnscrypt that you see [quad9-doh-ip4-port443-nofilter-ecs-pri] OK (DoH). If Quad9 is confirmed in logs, the test result is cosmetic |
| Pi-hole upstream shows Google | The environment variable didn’t take. Go to Pi-hole dashboard → Settings → DNS → untick Google → enter 10.8.1.4#5053 as Custom DNS → Save |
Stirling PDF — Self-Hosted PDF Toolkit
A browser-based PDF toolkit running on your Tailscale mesh. Merge, split, rotate, compress, convert, OCR, watermark, sign, add page numbers, extract images — 50+ operations. Files never leave your server. Replaces every online PDF tool (ILovePDF, SmallPDF, Adobe Acrobat) and the sketchy free ones.
Install
mkdir -p ~/stirling-pdf
nano ~/stirling-pdf/docker-compose.yml
services:
stirling-pdf:
image: stirlingtools/stirling-pdf:latest
container_name: stirling-pdf
volumes:
- stirling-data:/configs
- stirling-tessdata:/usr/share/tessdata
ports:
- "127.0.0.1:8484:8080"
- "YOUR_TAILSCALE_IP:8484:8080"
environment:
- SECURITY_ENABLELOGIN=false
restart: unless-stopped
volumes:
stirling-data:
stirling-tessdata:
Replace YOUR_TAILSCALE_IP with your Tailscale IP.
cd ~/stirling-pdf
docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8484
No login required — security is handled by Tailscale (only your devices can reach it). Files are processed in memory and deleted after download.
OCR Languages
English OCR works out of the box. To add other languages (e.g., Hindi, German):
docker exec stirling-pdf bash -c "cd /usr/share/tessdata && \
wget https://github.com/tesseract-ocr/tessdata/raw/main/hin.traineddata && \
wget https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata"
API Usage
Stirling PDF exposes a REST API for every operation. Useful for batch processing from scripts:
# Compress a PDF
curl -F 'fileInput=@syllabus.pdf' \
http://YOUR_TAILSCALE_IP:8484/api/v1/general/compress-pdf \
-o syllabus-compressed.pdf
# Merge two PDFs
curl -F 'fileInput=@part1.pdf' -F 'fileInput=@part2.pdf' \
http://YOUR_TAILSCALE_IP:8484/api/v1/general/merge-pdfs \
-o combined.pdf
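The same endpoint works for batch jobs — for example, compressing every PDF in a directory (endpoint path taken from the examples above; host and port depend on your setup):

```shell
# Batch-compress every PDF in the current directory via the Stirling API
for f in *.pdf; do
    curl -sf -F "fileInput=@${f}" \
        http://YOUR_TAILSCALE_IP:8484/api/v1/general/compress-pdf \
        -o "compressed-${f}" && echo "done: ${f}"
done
```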
Architecture Summary
Your Devices (laptop, phone, tablet)
│
├── WireGuard tunnel (UDP 51820) ──→ VPS ──→ Internet
│ Encrypted browsing, ad blocking, DNS privacy
│
└── Tailscale mesh ──→ VPS services (private access only)
│
├── :8888 Kiwix (60,000+ book library)
├── :8585 Gutenberg Search (catalog search + export)
├── :8484 Stirling PDF (PDF toolkit)
├── :8090 Miniflux (RSS reader)
└── (SSH tunnel only)
├── :51821 wg-easy admin
├── :80 Pi-hole dashboard
└── :8384 Syncthing dashboard
Hetzner VPS — 2 cores, 4 GB RAM, 40 GB SSD
├── Docker containers
│ wg-easy, pihole, dnscrypt, syncthing, miniflux,
│ miniflux-db, kiwix, gutenberg-search, stirling-pdf
├── Cron jobs
│ BorgBackup (3 AM), Telegram digest (6 AM),
│ health monitor (hourly + daily summary)
├── journald (volatile, 1-day retention)
├── fail2ban, unattended-upgrades
└── Tailscale daemon
Hetzner Storage Box — 1 TB, mounted at /mnt/storagebox via SSHFS
├── /kiwix/ 30 Gutenberg ZIM files (~200 GB)
├── /backups/ BorgBackup archives (encrypted)
└── (other files) Personal documents, Zotero, photos
Observability (reduced, not eliminated):
Your ISP sees: encrypted UDP packets to a German IP (not their contents)
Hetzner sees: encrypted DNS-over-HTTPS leaving the VPS (not query contents)
Quad9 sees: DNS queries without your ISP identity, but associated with your VPS IP
Websites see: a German Hetzner IP without prior association to your personal ISP identity
No single ordinary service provider holds the complete picture — but traffic analysis,
TLS metadata, and legal orders remain possible
Self-Hosted WireGuard vs Commercial VPN
| | Your WireGuard + Pi-hole | Commercial VPN (e.g., ProtonVPN) |
|---|---|---|
| Privacy from ISP | Full — ISP sees encrypted UDP to Germany | Full — ISP sees encrypted traffic to VPN server |
| Privacy from VPN provider | No provider — you control the server | Trust provider’s no-logs policy |
| Anonymity | None — VPS provider knows your identity, static IP is only yours | Low-Medium — account tied to email/payment, but shared IPs |
| Ad/tracker blocking | Full — Pi-hole blocks across all apps, custom blocklists | Partial — some offer DNS filtering but less configurable |
| DNS privacy | Full — Pi-hole → dnscrypt-proxy → Quad9, all encrypted, self-controlled | Provider handles DNS on their servers — you trust them |
| DNS encryption | Encrypted in transit to resolver — no plaintext DNS in typical operation, barring misconfiguration or fallback conditions | Encrypted within tunnel, but provider resolves on their end |
| Legal protection | Weak — VPS provider complies with court orders, all traffic is yours | Stronger — shared IPs, no-logs policies, privacy-friendly jurisdictions |
| Torrenting safety | Risky — static IP, host country copyright enforcement applies | Strong — shared IPs, dedicated P2P servers |
| Server locations | 1 (wherever your VPS is) | 60+ countries |
| Simultaneous devices | Unlimited | Plan-dependent (typically 5-10) |
| Logging | None — you control and disable all logging | None claimed — depends on provider’s policy and audits |
| Control | Full — you manage every component | None — provider makes all infrastructure decisions |
| Reliability | Single server — if VPS goes down, VPN is gone | Redundant infrastructure across hundreds of servers |
| Kill switch | Manual config required | Built into app |
| Cost | ₹0 additional (runs on existing VPS) | ₹300-800/month depending on provider and plan |
| Setup/maintenance | You manage updates, troubleshooting, Docker containers | Zero maintenance |
| Best used for | Daily browsing, ad blocking, DNS privacy, self-hosted infrastructure | Torrenting, geo-shifting, backup when VPS is down |
Recommendation: Run both. Use your self-hosted WireGuard as the default for daily use (stronger privacy, ad blocking, DNS control, zero cost). Switch to a commercial VPN for torrenting (shared IPs, legal protection) and geo-shifting (multiple countries).
What You Now Have
- All VPN-routed traffic encrypted — ISP sees only encrypted UDP to a German IP (subject to split tunneling)
- German exit IP — websites see a Hetzner IP, not your ISP
- Ads and trackers blocked — Pi-hole blocks often 20-40% of DNS queries across every app (varies by device and usage)
- DNS fully encrypted — Pi-hole → dnscrypt-proxy → Quad9 over DNS-over-HTTPS
- No publicly exposed web admin interfaces — admin panels closed to internet, accessible only via SSH tunnel
- Private mesh network — Tailscale connects all your devices to VPS services
- 1 TB remote storage — Hetzner Storage Box mounted as local filesystem
- 60,000+ book library — complete English-language Project Gutenberg via Kiwix
- Library search engine — advanced search with reading lists and Zotero/BibTeX export
- PDF toolkit — merge, split, compress, OCR, convert, sign — 50+ operations, self-hosted
- File sync across devices — Syncthing, no cloud storage needed
- 100 academic RSS feeds — Miniflux tracks STS, digital media, AI ethics, sociology
- Daily Telegram digest — Gemini Flash summarizes new articles as thematic analysis
- VPS health monitoring — hourly checks with Telegram alerts on issues
- Encrypted nightly backups — BorgBackup to Storage Box
- Minimal logging — volatile storage, 1-day retention, nothing on disk
- Automatic security updates — unattended-upgrades patches daily
- Total cost: ~$9–10/month — ~$5 VPS + ~$4 Storage Box
Part 3: Offline Library
What started as curiosity about Calibre-Web became something more ambitious: a self-hosted, searchable archive of the entire English-language Project Gutenberg catalog, accessible from any device on the Tailscale network. The library runs on two services — Kiwix for reading, and a custom search app for finding and exporting.
Kiwix — Reading Interface
Kiwix serves ZIM files (compressed, indexed web archives) through a browser. It was built to make Wikipedia available offline — it has since been deployed in refugee camps, schools across sub-Saharan Africa, and smuggling operations into North Korea. Here it serves 60,000+ public domain books.
Docker Setup
mkdir -p ~/kiwix
nano ~/kiwix/docker-compose.yml
services:
kiwix:
image: ghcr.io/kiwix/kiwix-serve
container_name: kiwix
volumes:
- /mnt/storagebox/kiwix:/data:ro
ports:
- "127.0.0.1:8888:8080"
- "YOUR_TAILSCALE_IP:8888:8080"
restart: unless-stopped
command: /data/*.zim
The *.zim glob serves every ZIM file in the directory. Adding files requires a restart to re-expand the glob.
cd ~/kiwix && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8888
The Collection
The complete English-language Gutenberg catalog, organized by Library of Congress Classification. 30 ZIM files, ~200 GB total, stored on the Storage Box.
| LCC | Subject | Size | LCC | Subject | Size |
|---|---|---|---|---|---|
| A | General Works | 9.1G | N | Fine Arts | 21G |
| B | Philosophy | 5.9G | P–PZ | Literature (all sub-codes) | ~60G |
| C | Aux. History | 1.2G | Q | Science | 16G |
| D | World History | 37G | R | Medicine | 1.8G |
| E | Americas History | 9.4G | S | Agriculture | 4.2G |
| F | Americas (Local) | 9.1G | T | Technology | 12G |
| G | Geography/Anthro | 7.5G | U | Military Science | 1.2G |
| H | Social Sciences | 4.2G | V | Naval Science | 1.2G |
| J | Political Science | 434M | Z | Bibliography | 2.5G |
| K | Law | 233M | | | |
| L | Education | 578M | | | |
| M | Music | 3.7G | | | |
All dated December 2025. Updates are infrequent — checking every two to three years is sufficient.
Downloading ZIM Files
Files are downloaded directly to the Storage Box from download.kiwix.org. For bulk downloads, use a script with wget -c (resume-capable) and nohup to survive SSH disconnects:
echo "y" | nohup bash ~/download_gutenberg_zims.sh > ~/gutenberg_download.log 2>&1 &
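The script itself is not reproduced here; its core is just a resume-capable download loop, roughly like this (filenames are hypothetical — check https://download.kiwix.org/zim/gutenberg/ for the current names before use):

```shell
# Sketch of the bulk-download loop in download_gutenberg_zims.sh
# (placeholder filenames — verify real ones on download.kiwix.org)
cd /mnt/storagebox/kiwix
for zim in gutenberg_en_category-a_YYYY-MM.zim gutenberg_en_category-b_YYYY-MM.zim; do
    wget -c "https://download.kiwix.org/zim/gutenberg/${zim}"
done
```

wget -c resumes partial files, so re-running the script after an interruption picks up where it left off.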
After downloading, restart Kiwix:
cd ~/kiwix && docker compose restart
Verify: ls -lh /mnt/storagebox/kiwix/gutenberg_*.zim | wc -l (should show 30).
Updating
The community kiwix-zim-updater script checks for newer versions and downloads only updated files:
git clone https://github.com/jojo2357/kiwix-zim-updater.git
./kiwix-zim-updater/kiwix-zim-updater.sh -d /mnt/storagebox/kiwix/
Note: no incremental updates exist for ZIM files. Each update is a full re-download of the changed file.
Gutenberg Search — Discovery Interface
A self-hosted search app that indexes Gutenberg’s catalog in SQLite FTS5 and serves a web UI. Solves a problem Kiwix doesn’t: searching across all 30 ZIM files by author, title, subject, LCC code, and language simultaneously.
What It Does
- Full-text search with BM25 relevance ranking
- Advanced search: author, title, subject, LCC, language — any combination
- “Read in Kiwix” links open books directly in the Kiwix instance
- EPUB download links for each book
- Export as RIS (for Zotero) or BibTeX (for LaTeX/Overleaf)
- Bulk export all results from a search in one click
- Named reading lists that persist across sessions
- Health check endpoint at /api/health
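For reference, the BibTeX export produces records along these lines. This sketch shows the shape of a minimal @book entry; the field names and key format are illustrative — the real exporters.py may emit more fields:

```python
def to_bibtex(book):
    """Render a catalog row as a minimal BibTeX @book entry.
    Illustrative only -- the actual exporters.py may differ."""
    key = "gutenberg" + str(book["id"])
    fields = {
        "title": book.get("title", ""),
        "author": book.get("author", ""),
        "url": "https://www.gutenberg.org/ebooks/" + str(book["id"]),
    }
    # One "  name = {value}" line per non-empty field
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@book{{{key},\n{body}\n}}"

print(to_bibtex({"id": 1342, "title": "Pride and Prejudice", "author": "Austen, Jane"}))
```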
Install
cd ~
tar -xzf gutenberg-search.tar.gz
cd gutenberg-search
docker compose up -d --build
First startup downloads the catalog (~14 MB) and builds the SQLite index (~1 minute). Check:
curl http://127.0.0.1:8585/api/health
# {"books_indexed":76645,"ready":true,"status":"healthy"}
Access: http://YOUR_TAILSCALE_IP:8585
File Structure
~/gutenberg-search/
├── Dockerfile # Python 3.12-slim + HEALTHCHECK
├── docker-compose.yml
├── requirements.txt # flask, gunicorn, flask-cors, flask-limiter
├── app.py # Routes, search, reading lists
├── exporters.py # RIS and BibTeX formatting
└── static/
└── index.html # Frontend
/data/ (Docker volume, persistent):
├── pg_catalog.csv # Cached catalog (auto-downloaded, refreshes after 30 days)
├── catalog.sqlite # FTS5 index (auto-built)
└── reading_lists.json # Saved reading lists
Refreshing the Catalog
docker exec gutenberg-search rm /data/pg_catalog.csv /data/catalog.sqlite
cd ~/gutenberg-search && docker compose restart
Note on Book Counts
The search indexes the full multilingual Gutenberg catalog (~76,000 items). The Kiwix ZIM files contain only English-language texts (~60,000). Some search results may not have corresponding books in Kiwix.
Health Monitoring
Both services are checked by the existing vps-health-monitor.py. Three additions were made:
- Docker stacks — kiwix, gutenberg-search, and stirling-pdf containers added to DOCKER_STACKS
- HTTP health check — queries /api/health on the search app
- Checks list — ("Gutenberg", check_gutenberg_search) added to main()
The daily Telegram summary now includes:
kiwix: running
gutenberg-search: running
stirling-pdf: running
Gutenberg Search: healthy (76645 books)
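The Gutenberg Search status line could be produced by a check along these lines — a sketch only; the function name comes from the checks list above, but the actual vps-health-monitor.py may be structured differently:

```python
import json
import urllib.request

def check_gutenberg_search(url="http://127.0.0.1:8585/api/health", timeout=10):
    """Query the search app's /api/health endpoint; return a one-line status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return format_health(json.loads(resp.read().decode()))
    except Exception as exc:
        return f"Gutenberg Search: DOWN ({exc})"

def format_health(data):
    """Turn the health JSON into the summary line shown above."""
    if data.get("status") == "healthy" and data.get("ready"):
        return f"Gutenberg Search: healthy ({data.get('books_indexed', '?')} books)"
    return f"Gutenberg Search: unhealthy ({data})"
```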
Part 4: Miniflux RSS Reader
A self-hosted RSS reader running alongside your existing Docker Compose VPN stack (WireGuard, Pi-hole, dnscrypt, Syncthing). Miniflux runs as a separate Docker Compose stack in ~/miniflux.
Storage Impact
| Component | Size | Notes |
|---|---|---|
| Docker images (miniflux + postgres) | ~100 MB | One-time |
| PostgreSQL database (year 1) | 200–400 MB | With cleanup policy below |
| Realistic total | ~300–500 MB | After a full year of 100 feeds |
The database grows at roughly 1–2 MB/day with 100 feeds. Miniflux’s built-in cleanup keeps it bounded. Negligible on a CX22 with 40 GB.
Step 1: Create the Miniflux directory
mkdir -p ~/miniflux
cd ~/miniflux
Step 2: Generate a strong database password
Use hex encoding to avoid special characters that break the PostgreSQL connection URL:
openssl rand -hex 24
Copy the output. Do NOT use openssl rand -base64 — characters like / and + cause URL parsing errors in the DATABASE_URL.
Step 3: Create docker-compose.yml
nano ~/miniflux/docker-compose.yml
Paste this (replace YOUR_DB_PASSWORD with the password from step 2):
services:
miniflux:
image: miniflux/miniflux:latest
container_name: miniflux
restart: unless-stopped
depends_on:
db:
condition: service_healthy
ports:
- "127.0.0.1:8090:8080"
- "YOUR_TAILSCALE_IP:8090:8080"
environment:
- DATABASE_URL=postgres://miniflux:YOUR_DB_PASSWORD@db/miniflux?sslmode=disable
- RUN_MIGRATIONS=1
- CREATE_ADMIN=1
- ADMIN_USERNAME=YOUR_USERNAME
- ADMIN_PASSWORD=PICK_A_STRONG_PASSWORD
- CLEANUP_ARCHIVE_READ_DAYS=120
- CLEANUP_ARCHIVE_UNREAD_DAYS=140
- POLLING_FREQUENCY=60
- BATCH_SIZE=25
- POLLING_PARSING_ERROR_LIMIT=0
- METRICS_COLLECTOR=false
db:
image: postgres:16-alpine
container_name: miniflux-db
restart: unless-stopped
environment:
- POSTGRES_USER=miniflux
- POSTGRES_PASSWORD=YOUR_DB_PASSWORD
- POSTGRES_DB=miniflux
volumes:
- miniflux-db:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "miniflux"]
interval: 10s
start_period: 30s
volumes:
miniflux-db:
Before saving, replace:
- YOUR_DB_PASSWORD (appears twice — in DATABASE_URL and POSTGRES_PASSWORD) → the hex password from step 2. Both values must be identical.
- PICK_A_STRONG_PASSWORD → your Miniflux login password
What the settings do
| Setting | Value | Meaning |
|---|---|---|
| 127.0.0.1:8090:8080 + YOUR_TAILSCALE_IP:8090:8080 | Binds to localhost and Tailscale | Accessible via SSH tunnel or Tailscale mesh |
| CLEANUP_ARCHIVE_READ_DAYS=120 | 120 | Read articles deleted after 120 days |
| CLEANUP_ARCHIVE_UNREAD_DAYS=140 | 140 | Unread articles deleted after ~5 months |
| POLLING_FREQUENCY=60 | 60 min | Checks each feed every 60 minutes |
| BATCH_SIZE=25 | 25 | Checks 25 feeds per polling cycle |
| POLLING_PARSING_ERROR_LIMIT=0 | 0 | Never stops checking a feed after errors |
Step 4: Start Miniflux
cd ~/miniflux
docker compose up -d
Wait ~30 seconds for PostgreSQL to initialize, then check:
docker compose ps
Both miniflux and miniflux-db should show Up (healthy).
Secure the Compose file (it contains your database password and admin credentials):
chmod 600 ~/miniflux/docker-compose.yml
If miniflux shows Restarting, check logs:
docker logs miniflux
Common issues:
- “password authentication failed” → the password doesn’t match between the two services
- “invalid port … after host” → your password contains special characters. Regenerate with openssl rand -hex 24, then: docker compose down, docker volume rm miniflux_miniflux-db, edit the password, docker compose up -d
- “role does not exist” → the db container hasn’t finished initializing. Wait 30 seconds.
Step 5: Access Miniflux via SSH tunnel
From your Mac terminal (not the VPS SSH session):
ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N
Leave that running. Open your browser to http://localhost:8090.
Log in with the username and password you configured.
Step 6: Disable auto-admin creation
After first login, edit the compose file:
nano ~/miniflux/docker-compose.yml
Change:
- CREATE_ADMIN=1
- ADMIN_USERNAME=YOUR_USERNAME
- ADMIN_PASSWORD=PICK_A_STRONG_PASSWORD
To:
- CREATE_ADMIN=0
Then: docker compose down && docker compose up -d
Step 7: Import feeds via OPML
The feed list is provided as a separate OPML file (feeds.opml) with 97 feeds across 11 categories. With the 3 Google Scholar alerts from Step 8, the total is 100 feeds.
- Transfer the OPML to your Mac: scp YOUR_USER@YOUR_VPS_IP:~/miniflux/feeds.opml ~/Downloads/feeds.opml
- Open Miniflux at http://localhost:8090
- Go to Settings → Import → upload the OPML file
All 97 feeds import with their categories intact.
Step 8: Add Google Scholar alerts
Set up 3 alerts separately:
- Go to https://scholar.google.com/scholar_alerts
- Create alerts (your name, key research terms, co-authors)
- In each alert’s settings, choose RSS feed (not email)
- Copy the feed URL → add in Miniflux via Feeds → Add Subscription
Step 9: Generate a Miniflux API key
Needed by the Telegram automation scripts.
- In Miniflux: Settings → API Keys
- Click Create a new API key, name it “scripts”
- Copy the key (shown only once)
Step 10: Add to BorgBackup
nano ~/backup.sh
Add a PostgreSQL dump before the borg create command:
docker exec miniflux-db pg_dump -U miniflux miniflux > ~/miniflux/db-backup.sql
Add ~/miniflux to the list of backed-up paths.
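Put together, the relevant fragment of backup.sh looks roughly like this (a sketch — merge into your existing borg create invocation; BORG_REPO and BORG_PASSPHRASE are assumed to be exported earlier in the script):

```shell
# Dump the Miniflux database to a plain SQL file inside ~/miniflux,
# so the nightly archive captures a restorable copy.
docker exec miniflux-db pg_dump -U miniflux miniflux > ~/miniflux/db-backup.sql

# Archive name matches the vps-YYYY-MM-DD-HHMM pattern used elsewhere in this guide
borg create --stats \
    ::vps-{now:%Y-%m-%d-%H%M} \
    ~/miniflux \
    ~/vpn   # ...plus the other paths already in your backup list
```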
Daily Workflow
Accessing Miniflux
ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N
Then open http://localhost:8090.
Keyboard shortcuts
| Key | Action |
|---|---|
| g u | Go to unread |
| g b | Go to bookmarks |
| j / k | Next / previous |
| v | Open original in new tab |
| d | Mark as read/unread |
| s | Star / bookmark |
| Shift+A | Mark all as read |
| f | Toggle full content fetch |
Morning routine
- Check your Telegram digest (arrives at 6:00 AM UTC — adjust to your timezone)
- Open Miniflux to read anything that caught your interest
- g u → scan unread, s to star items
- Shift+A → mark all read
- g b → read starred items
Automation
Two Telegram bot scripts run from ~/miniflux/scripts/:
- Daily digest — Gemini Flash summarizes new articles as thematic analysis, sent to Telegram every morning
- Health monitor — checks containers, disk, memory, load, backup, SSH failures; alerts on issues hourly, sends daily summary
See the Telegram Automation Setup Guide for configuration.
Maintenance
Updating Miniflux
cd ~/miniflux
docker compose pull
docker compose down
docker compose up -d
Checking disk usage
docker system df
docker exec miniflux-db psql -U miniflux -c "SELECT pg_size_pretty(pg_database_size('miniflux'));"
Adjusting cleanup
Edit docker-compose.yml to tighten retention, then restart:
- CLEANUP_ARCHIVE_READ_DAYS=60 # was 120
- CLEANUP_ARCHIVE_UNREAD_DAYS=90 # was 140
If a feed breaks
Check Feeds view for error counts. Common fixes:
- 403: set custom user agent in feed settings → Mozilla/5.0 (compatible; Miniflux)
- 404: URL changed, find current RSS link on journal’s site
- Parse errors: try atom variant instead of rss2 or vice versa
Full VPS Service Map
| Service | Stack | Container | Port | Access |
|---|---|---|---|---|
| WireGuard VPN | ~/vpn | wg-easy | 51820/UDP (public) | WireGuard client |
| Pi-hole | ~/vpn | pihole | localhost:8080 | SSH tunnel |
| DNS encryption | ~/vpn | dnscrypt | internal | Via Pi-hole |
| Syncthing | ~/vpn | syncthing | 10.8.1.5:8384 | SSH tunnel to Docker bridge IP |
| Kiwix | ~/kiwix | kiwix | YOUR_TAILSCALE_IP:8888 | Tailscale |
| Gutenberg Search | ~/gutenberg-search | gutenberg-search | YOUR_TAILSCALE_IP:8585 | Tailscale |
| Stirling PDF | ~/stirling-pdf | stirling-pdf | YOUR_TAILSCALE_IP:8484 | Tailscale |
| Miniflux | ~/miniflux | miniflux | YOUR_TAILSCALE_IP:8090 | Tailscale / SSH tunnel |
| PostgreSQL | ~/miniflux | miniflux-db | internal | Via Miniflux |
Cron jobs
| Job | Schedule | Description |
|---|---|---|
| BorgBackup | 3:00 AM UTC | Nightly backup to Hetzner Storage Box |
| RSS digest | 6:00 AM UTC | Telegram thematic digest |
| Health check | Every hour | Alerts only on problems |
| Health summary | 7:00 AM UTC | Daily all-clear report |
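As crontab entries, the schedule above looks roughly like this. The digest line is taken from the script's own docstring; the health-monitor invocations and the --summary flag are assumptions — check the Telegram Automation Setup Guide for the exact commands:

```shell
# 3 AM UTC: nightly BorgBackup
0 3 * * * ~/backup.sh >> ~/backup.log 2>&1
# 6 AM UTC: Telegram thematic digest (from the script docstring)
0 6 * * * cd ~/miniflux && ./scripts/run.sh miniflux-telegram-digest.py >> ~/miniflux/scripts/digest.log 2>&1
# Hourly: health check, alerts only on problems (invocation assumed)
0 * * * * cd ~/miniflux && ./scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
# 7 AM UTC: daily all-clear summary (--summary flag is a guess)
0 7 * * * cd ~/miniflux && ./scripts/run.sh vps-health-monitor.py --summary >> ~/miniflux/scripts/health.log 2>&1
```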
Part 5: Telegram Automation
Two scripts that use your Miniflux RSS reader and a Telegram bot to keep you informed:
- Daily Digest (miniflux-telegram-digest.py) — thematic analysis of new articles via Gemini Flash
- VPS Health Monitor (vps-health-monitor.py) — alerts on infrastructure issues
Both use only outbound connections. No ports opened, no domains needed, no new Docker containers.
Architecture
┌───────────┐
outbound │ Gemini │
┌────────────────►│ Flash │
┌─────────────┐ localhost │ │ (free) │
│ Miniflux │◄─────────┐ │ └───────────┘
│ (Docker) │ │ │
│ 100 feeds │ ┌────┴──┴───────┐ outbound ┌───────────┐
└─────────────┘ │ Cron scripts │────────────►│ Telegram │
│ │ │ Bot API │
┌─────────────┐ │ - digest.py │ └─────┬─────┘
│ System │◄────│ - health.py │ │
│ (disk/mem/ │ └───────────────┘ ┌─────▼─────┐
│ docker) │ │ Your │
└─────────────┘ │ iPhone │
└───────────┘
Prerequisites
- Miniflux running with feeds imported (see Miniflux Setup Guide)
- Miniflux API key (Settings → API Keys)
- Telegram account
Step 1: Create the Telegram Bot
- Open Telegram → search for @BotFather → start a chat
- Send /newbot
- Choose a display name (e.g., “VPS Bot”) and username (must end in bot)
- BotFather replies with your bot token — copy it
Step 2: Get Your Chat ID
- Open a chat with your new bot and send it any message (e.g., “hi”) — this is required before the bot can message you
- Either:
- Search for @userinfobot in Telegram and message it — it replies with your ID
  - Or open https://api.telegram.org/bot<YOUR_TOKEN>/getUpdates in a browser and find "chat":{"id":123456789}
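With token and chat ID in hand, verify the pair works before writing any scripts — a standard Bot API sendMessage call:

```shell
# Send yourself a test message — replace <YOUR_TOKEN> and <YOUR_CHAT_ID>
curl -s "https://api.telegram.org/bot<YOUR_TOKEN>/sendMessage" \
    -d "chat_id=<YOUR_CHAT_ID>" \
    -d "text=hello from the VPS"
```

If the message arrives on your phone, both scripts will be able to reach you.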
Step 3: Get Your Gemini API Key
- Go to https://aistudio.google.com/apikey
- Click Create API Key
- Copy it
Free tier limits change periodically — check current quotas at ai.google.dev. You’ll use 1 request per day, well within any reasonable free tier.
Step 4: Create the Scripts
On your VPS, create the scripts directory and both scripts:
mkdir -p ~/miniflux/scripts
Daily Digest Script
nano ~/miniflux/scripts/miniflux-telegram-digest.py
Paste the full script:
#!/usr/bin/env python3
"""
miniflux-telegram-digest.py
Daily digest: pulls new Miniflux entries, summarizes via LLM, sends via Telegram bot.
Usage: python3 miniflux-telegram-digest.py
Cron: 0 6 * * * cd ~/miniflux && ./scripts/run.sh miniflux-telegram-digest.py
Environment variables (set in ~/miniflux/scripts/.env):
MINIFLUX_URL - default http://127.0.0.1:8090
MINIFLUX_API_KEY - required
LLM_PROVIDER - "claude" | "gemini" | "openai" (default: gemini)
GEMINI_API_KEY - if using gemini
ANTHROPIC_API_KEY - if using claude
OPENAI_API_KEY - if using openai
DIGEST_DAYS_BACK - how far back to look (default: 1)
DIGEST_MAX_ENTRIES - max entries to summarize (default: 80)
TELEGRAM_BOT_TOKEN - from @BotFather
TELEGRAM_CHAT_ID - your personal chat ID
"""
import json
import os
import sys
import re
import urllib.request
import urllib.error
from datetime import datetime, timezone, timedelta
from collections import defaultdict
# ── Configuration ────────────────────────────────────────────────────────────
MINIFLUX_URL = os.environ.get("MINIFLUX_URL", "http://127.0.0.1:8090")
MINIFLUX_API_KEY = os.environ.get("MINIFLUX_API_KEY", "")
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "gemini")
DAYS_BACK = int(os.environ.get("DIGEST_DAYS_BACK", "1"))
MAX_ENTRIES = int(os.environ.get("DIGEST_MAX_ENTRIES", "80"))
TELEGRAM_BOT_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN", "")
TELEGRAM_CHAT_ID = os.environ.get("TELEGRAM_CHAT_ID", "")
STATE_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), ".digest-state.json")
TG_MAX_LEN = 4000
TARGET_LEN = 10000
# ─────────────────────────────────────────────────────────────────────────────
# ── Miniflux API ─────────────────────────────────────────────────────────────
def miniflux_api(endpoint):
url = f"{MINIFLUX_URL}/v1{endpoint}"
req = urllib.request.Request(url)
req.add_header("X-Auth-Token", MINIFLUX_API_KEY)
with urllib.request.urlopen(req, timeout=30) as resp:
return json.loads(resp.read().decode())
def get_new_entries():
cutoff = datetime.now(timezone.utc) - timedelta(days=DAYS_BACK)
cutoff_unix = int(cutoff.timestamp())
entries = []
offset = 0
limit = 100
while len(entries) < MAX_ENTRIES:
data = miniflux_api(
f"/entries?order=published_at&direction=desc"
f"&limit={limit}&offset={offset}"
f"&after={cutoff_unix}"
)
batch = data.get("entries", [])
if not batch:
break
entries.extend(batch)
offset += limit
if len(batch) < limit:
break
return entries[:MAX_ENTRIES]
def get_categories():
return {c["id"]: c["title"] for c in miniflux_api("/categories")}
def get_feeds():
return {f["id"]: {"title": f["title"], "category_id": f["category"]["id"]}
for f in miniflux_api("/feeds")}
# ── State Management ─────────────────────────────────────────────────────────
def load_state():
if os.path.exists(STATE_FILE):
with open(STATE_FILE) as f:
return json.load(f)
return {"last_entry_ids": []}
def save_state(entry_ids):
with open(STATE_FILE, "w") as f:
json.dump({"last_entry_ids": entry_ids}, f)
def filter_new(entries, state):
seen = set(state.get("last_entry_ids", []))
return [e for e in entries if e["id"] not in seen]
# ── LLM Providers ───────────────────────────────────────────────────────────
def strip_html(html):
text = re.sub(r'<[^>]+>', ' ', html or '')
text = re.sub(r'\s+', ' ', text).strip()
return text[:1500]
def build_prompt(entries_by_category, total_count):
today = datetime.now(timezone.utc).strftime("%A, %B %d, %Y")
lines = [
f"You are a research assistant for a scholar specializing in Science and Technology Studies (STS), "
f"digital infrastructure, algorithms, and emerging technologies. "
f"Today is {today}. Below are {total_count} new articles from academic RSS feeds, grouped by category.",
"",
"Write a THEMATIC ANALYSIS of what's happening in these feeds — not a list of articles. "
"Structure your analysis as follows:",
"",
"OPENING (2-3 sentences): What are the dominant themes or threads across today's articles? "
"What would an STS scholar find most interesting?",
"",
"THEMATIC SECTIONS (3-5 sections): Identify the key themes or conversations emerging "
"across the articles. Each section should:",
" - Have a descriptive thematic header (e.g., 'Algorithmic governance under scrutiny' "
" not 'STS Journals')",
" - Synthesize what 2-5 articles collectively tell us about that theme",
" - Name specific articles and their sources in parentheses when referencing them",
" - IMPORTANT: Immediately after mentioning each article, paste its full URL on the next line. "
"The URL is provided in the data below for each article. This is critical — the reader needs clickable links.",
" - Explain why this matters or what's at stake — connect to broader STS debates",
" - Be 3-5 sentences long",
"",
"QUICK MENTIONS (end): Briefly note any remaining articles that don't fit the themes "
"above — just title and source, 1 line each.",
"",
"STYLE RULES:",
"- Tone: an informed colleague who reads widely and thinks critically — not a news ticker",
"- Favor analysis over description: 'These three papers converge on...' not 'This paper is about...'",
"- Make connections between articles in different categories when relevant",
"- Plain text only — no markdown, no HTML, no asterisks for bold/italic",
"- Use line breaks and blank lines between sections for readability",
f"- Target length: {TARGET_LEN} characters (roughly 1500-2000 words). Use the space.",
"- Include URLs for articles you discuss in the thematic sections, but NOT for quick mentions",
"- Format each referenced article as: title (source) followed by its URL on the next line",
"",
"---",
"",
]
for cat_name, entries in entries_by_category.items():
if not entries:
continue
lines.append(f"### Category: {cat_name}")
for e in entries:
title = e.get("title", "Untitled")
url = e.get("url", "")
feed = e.get("_feed_title", "")
content = strip_html(e.get("content", ""))
lines.append(f"\nTitle: {title}")
lines.append(f"Source: {feed}")
lines.append(f"URL: {url}")
if content:
lines.append(f"Content excerpt: {content[:800]}")
lines.append("")
return "\n".join(lines)
def llm_claude(prompt):
api_key = os.environ.get("ANTHROPIC_API_KEY", "")
if not api_key:
raise RuntimeError("ANTHROPIC_API_KEY not set")
body = json.dumps({
"model": "claude-haiku-4-5-20251001",
"max_tokens": 4096,
"messages": [{"role": "user", "content": prompt}]
}).encode()
req = urllib.request.Request(
"https://api.anthropic.com/v1/messages",
data=body,
headers={
"Content-Type": "application/json",
"x-api-key": api_key,
"anthropic-version": "2023-06-01",
},
method="POST"
)
with urllib.request.urlopen(req, timeout=120) as resp:
data = json.loads(resp.read().decode())
return data["content"][0]["text"]
def llm_gemini(prompt):
api_key = os.environ.get("GEMINI_API_KEY", "")
if not api_key:
raise RuntimeError("GEMINI_API_KEY not set")
body = json.dumps({
"contents": [{"parts": [{"text": prompt}]}],
"generationConfig": {"maxOutputTokens": 4096}
}).encode()
url = (
f"https://generativelanguage.googleapis.com/v1beta/models/"
f"gemini-2.5-flash:generateContent?key={api_key}"
)
req = urllib.request.Request(
url, data=body,
headers={"Content-Type": "application/json"},
method="POST"
)
with urllib.request.urlopen(req, timeout=120) as resp:
data = json.loads(resp.read().decode())
return data["candidates"][0]["content"]["parts"][0]["text"]
def llm_openai(prompt):
api_key = os.environ.get("OPENAI_API_KEY", "")
if not api_key:
raise RuntimeError("OPENAI_API_KEY not set")
body = json.dumps({
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": prompt}],
"max_tokens": 4096,
}).encode()
req = urllib.request.Request(
"https://api.openai.com/v1/chat/completions",
data=body,
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}",
},
method="POST"
)
with urllib.request.urlopen(req, timeout=120) as resp:
data = json.loads(resp.read().decode())
return data["choices"][0]["message"]["content"]
LLM_DISPATCH = {
"claude": llm_claude,
"gemini": llm_gemini,
"openai": llm_openai,
}
# ── Telegram ─────────────────────────────────────────────────────────────────
def send_telegram(text):
"""Send message via Telegram Bot API. Splits if over 4096 chars."""
chunks = []
while len(text) > TG_MAX_LEN:
split_at = text.rfind("\n", 0, TG_MAX_LEN)
if split_at == -1:
split_at = TG_MAX_LEN
chunks.append(text[:split_at])
text = text[split_at:].lstrip("\n")
chunks.append(text)
for i, chunk in enumerate(chunks):
body = json.dumps({
"chat_id": TELEGRAM_CHAT_ID,
"text": chunk,
"disable_web_page_preview": True,
}).encode()
url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
req = urllib.request.Request(
url, data=body,
headers={"Content-Type": "application/json"},
method="POST"
)
with urllib.request.urlopen(req, timeout=30) as resp:
result = json.loads(resp.read().decode())
if not result.get("ok"):
raise RuntimeError(f"Telegram API error: {result}")
print(f"Sent message {i+1}/{len(chunks)} ({len(chunk)} chars)")
def send_error_notification(error_msg):
"""Send failure alert via Telegram."""
try:
text = (
f"⚠️ Digest failed\n\n"
f"Error: {error_msg}\n\n"
f"Check: tail -50 ~/miniflux/scripts/digest.log"
)
send_telegram(text)
except Exception:
pass
# ── Main ─────────────────────────────────────────────────────────────────────
def main():
if not MINIFLUX_API_KEY:
print("Error: Set MINIFLUX_API_KEY", file=sys.stderr)
sys.exit(1)
if not TELEGRAM_BOT_TOKEN or not TELEGRAM_CHAT_ID:
print("Error: Set TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID", file=sys.stderr)
sys.exit(1)
if LLM_PROVIDER not in LLM_DISPATCH:
print(f"Error: LLM_PROVIDER must be one of: {', '.join(LLM_DISPATCH.keys())}",
file=sys.stderr)
sys.exit(1)
try:
state = load_state()
print(f"Fetching entries from last {DAYS_BACK} day(s)...")
categories = get_categories()
feeds = get_feeds()
entries = get_new_entries()
print(f"Fetched {len(entries)} entries total")
new_entries = filter_new(entries, state)
if not new_entries:
print("No new entries since last run. Skipping.")
sys.exit(0)
print(f"{len(new_entries)} new entries to process")
# Group by category
entries_by_category = defaultdict(list)
for entry in new_entries:
feed_info = feeds.get(entry.get("feed_id"), {})
cat_id = feed_info.get("category_id", 0)
cat_name = categories.get(cat_id, "Uncategorized")
entry["_feed_title"] = feed_info.get("title", "Unknown")
entries_by_category[cat_name].append(entry)
# Build prompt and call LLM
prompt = build_prompt(entries_by_category, len(new_entries))
print(f"Prompt: {len(prompt)} chars, calling {LLM_PROVIDER}...")
llm_fn = LLM_DISPATCH[LLM_PROVIDER]
summary = llm_fn(prompt)
print(f"Got {len(summary)} char summary")
# Add footer
footer = (
f"\n\n—\n"
f"{len(new_entries)} articles · {len(entries_by_category)} categories · "
f"{LLM_PROVIDER} summary"
)
full_message = summary + footer
# Send via Telegram
send_telegram(full_message)
# Save state
all_ids = [e["id"] for e in entries]
save_state(all_ids)
print("State saved. Done.")
except Exception as e:
print(f"FATAL: {e}", file=sys.stderr)
import traceback
traceback.print_exc(file=sys.stderr)
send_error_notification(str(e))
sys.exit(1)
if __name__ == "__main__":
main()
Health Monitor Script
nano ~/miniflux/scripts/vps-health-monitor.py
Paste the full script:
#!/usr/bin/env python3
"""
vps-health-monitor.py
Checks VPS health and sends Telegram alerts on issues.
Runs via cron every hour. Only messages you when something is wrong,
plus an optional daily summary.
Usage: python3 vps-health-monitor.py # alert-only mode
python3 vps-health-monitor.py --daily # daily summary
Environment variables (from ~/miniflux/scripts/.env):
TELEGRAM_BOT_TOKEN - required
TELEGRAM_CHAT_ID - required
Cron:
0 * * * * ~/miniflux/scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
0 7 * * * ~/miniflux/scripts/run.sh vps-health-monitor.py --daily >> ~/miniflux/scripts/health.log 2>&1
"""
import json
import os
import sys
import subprocess
import urllib.request
from datetime import datetime, timezone, timedelta
from pathlib import Path
TELEGRAM_BOT_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN", "")
TELEGRAM_CHAT_ID = os.environ.get("TELEGRAM_CHAT_ID", "")
# ── Thresholds ───────────────────────────────────────────────────────────────
DISK_WARN_PERCENT = 80
DISK_CRIT_PERCENT = 90
MEMORY_WARN_PERCENT = 85
LOAD_WARN_MULTIPLIER = 2.0
BACKUP_MAX_AGE_HOURS = 36
# ─────────────────────────────────────────────────────────────────────────────
# Docker Compose stacks to check: (name, path, expected containers)
DOCKER_STACKS = [
("VPN stack", "~/vpn", ["wg-easy", "pihole", "dnscrypt", "syncthing"]),
("Miniflux stack", "~/miniflux", ["miniflux", "miniflux-db"]),
("Kiwix", "~/kiwix", ["kiwix"]),
("Gutenberg Search", "~/gutenberg-search", ["gutenberg-search"]),
("Stirling PDF", "~/stirling-pdf", ["stirling-pdf"]),
]
def run(cmd, timeout=10):
"""Run shell command, return stdout or None on failure."""
try:
result = subprocess.run(
cmd, shell=True, capture_output=True, text=True, timeout=timeout
)
return result.stdout.strip()
except Exception:
return None
def check_docker_containers():
"""Check that expected Docker containers are running."""
issues = []
info = []
running = run("docker ps --format '{{.Names}}'")
if running is None:
return ["Could not query Docker — is the daemon running?"], []
running_set = set(running.split("\n")) if running else set()
for stack_name, stack_path, expected in DOCKER_STACKS:
for container in expected:
if container in running_set:
info.append(f"{container}: running")
else:
issues.append(f"{container} ({stack_name}): NOT RUNNING")
return issues, info
def check_disk():
"""Check disk usage."""
issues = []
info = []
output = run("df -h / --output=pcent,size,used,avail | tail -1")
if not output:
return ["Could not check disk usage"], []
parts = output.split()
percent = int(parts[0].replace("%", ""))
size_output = run("df -h / --output=size,used,avail | tail -1")
size_parts = size_output.split() if size_output else ["?", "?", "?"]
info.append(f"Disk: {percent}% used ({size_parts[1]}B / {size_parts[0]}B, {size_parts[2]}B free)")
if percent >= DISK_CRIT_PERCENT:
issues.append(f"CRITICAL: Disk at {percent}% — only {size_parts[2]}B free")
elif percent >= DISK_WARN_PERCENT:
issues.append(f"WARNING: Disk at {percent}% — {size_parts[2]}B free")
return issues, info
def check_memory():
"""Check RAM usage."""
issues = []
info = []
output = run("free -m | grep Mem")
if not output:
return ["Could not check memory"], []
parts = output.split()
total = int(parts[1])
used = int(parts[2])
available = int(parts[6])
percent = round((used / total) * 100)
info.append(f"Memory: {percent}% used ({used}MB / {total}MB, {available}MB available)")
if percent >= MEMORY_WARN_PERCENT:
issues.append(f"WARNING: Memory at {percent}% — {available}MB available")
return issues, info
def check_load():
"""Check system load average."""
issues = []
info = []
load_str = run("cat /proc/loadavg")
cpu_str = run("nproc")
if not load_str or not cpu_str:
return ["Could not check load"], []
load_1, load_5, load_15 = [float(x) for x in load_str.split()[:3]]
cpus = int(cpu_str)
info.append(f"Load: {load_1:.1f} / {load_5:.1f} / {load_15:.1f} (1/5/15 min, {cpus} cores)")
if load_5 > cpus * LOAD_WARN_MULTIPLIER:
issues.append(f"WARNING: Load average {load_5:.1f} exceeds {cpus * LOAD_WARN_MULTIPLIER:.0f} (5 min)")
return issues, info
def check_backup():
"""Check BorgBackup recency via backup log modification time."""
issues = []
info = []
log_path = os.path.expanduser("~/backup.log")
if os.path.exists(log_path):
stat = os.stat(log_path)
mtime = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
age = datetime.now(timezone.utc) - mtime
hours_ago = age.total_seconds() / 3600
info.append(f"Backup log last modified: {hours_ago:.0f}h ago ({mtime.strftime('%b %d %H:%M UTC')})")
if hours_ago > BACKUP_MAX_AGE_HOURS:
issues.append(f"WARNING: Last backup log update was {hours_ago:.0f}h ago (threshold: {BACKUP_MAX_AGE_HOURS}h)")
else:
info.append("Backup: no backup.log found — has the backup script ever run?")
return issues, info
def check_ssh_failures():
"""Check for recent SSH brute force attempts."""
issues = []
info = []
count_str = run("journalctl -u ssh --since '24 hours ago' 2>/dev/null | grep -c 'Failed password' || echo 0")
if count_str and count_str.isdigit():
count = int(count_str)
info.append(f"Failed SSH logins (24h): {count}")
if count > 100:
issues.append(f"WARNING: {count} failed SSH attempts in 24h — check fail2ban")
else:
count_str = run("grep -c 'Failed password' /var/log/auth.log 2>/dev/null || echo 0")
if count_str and count_str.isdigit():
info.append(f"Failed SSH logins (auth.log): {count_str}")
return issues, info
def check_uptime():
"""Get system uptime."""
output = run("uptime -p")
return [], [f"Uptime: {output}"] if output else []
def send_telegram(text):
"""Send message via Telegram Bot API."""
TG_MAX = 4000
chunks = []
while len(text) > TG_MAX:
split_at = text.rfind("\n", 0, TG_MAX)
if split_at == -1:
split_at = TG_MAX
chunks.append(text[:split_at])
text = text[split_at:].lstrip("\n")
chunks.append(text)
for chunk in chunks:
body = json.dumps({
"chat_id": TELEGRAM_CHAT_ID,
"text": chunk,
"disable_web_page_preview": True,
}).encode()
url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
req = urllib.request.Request(
url, data=body,
headers={"Content-Type": "application/json"},
method="POST"
)
with urllib.request.urlopen(req, timeout=30) as resp:
result = json.loads(resp.read().decode())
if not result.get("ok"):
raise RuntimeError(f"Telegram error: {result}")
def main():
if not TELEGRAM_BOT_TOKEN or not TELEGRAM_CHAT_ID:
print("Error: Set TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID", file=sys.stderr)
sys.exit(1)
daily_mode = "--daily" in sys.argv
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
all_issues = []
all_info = []
checks = [
("Docker", check_docker_containers),
("Disk", check_disk),
("Memory", check_memory),
("Load", check_load),
("Backup", check_backup),
("SSH", check_ssh_failures),
("Uptime", check_uptime),
]
for name, check_fn in checks:
try:
issues, info = check_fn()
all_issues.extend(issues)
all_info.extend(info)
except Exception as e:
all_issues.append(f"{name} check failed: {e}")
has_issues = len(all_issues) > 0
if has_issues:
lines = [f"🚨 VPS Health Alert — {now}", ""]
for issue in all_issues:
lines.append(f" {issue}")
lines.append("")
lines.append("Full status:")
for info_line in all_info:
lines.append(f" {info_line}")
send_telegram("\n".join(lines))
print(f"[{now}] ALERT sent: {len(all_issues)} issue(s)")
elif daily_mode:
lines = [f"✅ VPS Health — {now}", ""]
for info_line in all_info:
lines.append(f" {info_line}")
send_telegram("\n".join(lines))
print(f"[{now}] Daily summary sent: all clear")
else:
print(f"[{now}] OK — no issues")
if __name__ == "__main__":
main()
Make both scripts executable:
chmod +x ~/miniflux/scripts/miniflux-telegram-digest.py
chmod +x ~/miniflux/scripts/vps-health-monitor.py
Step 5: Create the Environment File
On your VPS:
mkdir -p ~/miniflux/scripts
nano ~/miniflux/scripts/.env
# Miniflux
MINIFLUX_URL=http://127.0.0.1:8090
MINIFLUX_API_KEY=your-miniflux-api-key
# LLM
LLM_PROVIDER=gemini
GEMINI_API_KEY=your-gemini-api-key
# Telegram
TELEGRAM_BOT_TOKEN=your-bot-token
TELEGRAM_CHAT_ID=your-chat-id
# Digest settings
DIGEST_DAYS_BACK=1
DIGEST_MAX_ENTRIES=80
Lock it down:
chmod 600 ~/miniflux/scripts/.env
Step 6: Create the Wrapper Script
cat > ~/miniflux/scripts/run.sh << 'EOF'
#!/bin/bash
set -a
source "$(dirname "$0")/.env"
set +a
script="$1"
shift
python3 "$(dirname "$0")/$script" "$@"
EOF
chmod +x ~/miniflux/scripts/run.sh
Step 7: Test Both Scripts
Test the digest
~/miniflux/scripts/run.sh miniflux-telegram-digest.py
Expected output:
Fetching entries from last 1 day(s)...
Fetched 80 entries total
80 new entries to process
Prompt: 37132 chars, calling gemini...
Got 8234 char summary
Sent message 1/3 (3842 chars)
Sent message 2/3 (3911 chars)
Sent message 3/3 (1204 chars)
State saved. Done.
You should receive 2-3 Telegram messages with a thematic analysis.
Test the health monitor
~/miniflux/scripts/run.sh vps-health-monitor.py --daily
You should receive a ✅ status summary showing all containers, disk, memory, load, and backup status.
Step 8: Set Up Cron
crontab -e
Add these lines:
# Daily reading digest at 6:00 AM UTC
0 6 * * * ~/miniflux/scripts/run.sh miniflux-telegram-digest.py >> ~/miniflux/scripts/digest.log 2>&1
# Hourly health check — alerts only on problems
0 * * * * ~/miniflux/scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
# Daily health summary at 7:00 AM UTC
0 7 * * * ~/miniflux/scripts/run.sh vps-health-monitor.py --daily >> ~/miniflux/scripts/health.log 2>&1
# Weekly log rotation — keep last 500 lines of health, 200 of digest and backup
0 0 * * 0 tail -500 ~/miniflux/scripts/health.log > ~/miniflux/scripts/health.log.tmp && mv ~/miniflux/scripts/health.log.tmp ~/miniflux/scripts/health.log
0 0 * * 0 tail -200 ~/miniflux/scripts/digest.log > ~/miniflux/scripts/digest.log.tmp && mv ~/miniflux/scripts/digest.log.tmp ~/miniflux/scripts/digest.log
0 0 * * 0 tail -200 ~/backup.log > ~/backup.log.tmp && mv ~/backup.log.tmp ~/backup.log
What You’ll Get
Daily Digest
A 2-3 message thematic analysis, not an article list. Example (hypothetical — the papers, titles, and DOIs below are fabricated to illustrate the format):
Across today's 47 new articles, three threads stand out: a growing
conversation about algorithmic accountability in public institutions,
renewed attention to infrastructure breakdowns in the Global South,
and a methodological debate about ethnographic access in corporate
AI labs.
ALGORITHMIC GOVERNANCE UNDER PRESSURE
Two papers converge on the gap between accountability frameworks
and actual practice. "Auditing Automated Decisions in Welfare"
(Big Data & Society) traces how Dutch municipalities adopted
algorithmic risk scoring while systematically avoiding the oversight
mechanisms meant to accompany it.
https://journals.sagepub.com/doi/full/10.1177/...
This resonates with "The Transparency Trap" (Science, Technology,
& Human Values), which argues that mandated explainability
requirements often produce legibility for regulators rather than
meaningful accountability for affected populations.
https://journals.sagepub.com/doi/full/10.1177/...
...
QUICK MENTIONS
"Viral Misinformation in Marathi-language WhatsApp Groups" (EPW)
"Optimizing Transformer Architectures for Low-Resource NLP" (cs.CL)
"Urban Drone Logistics in Southeast Asia" (Frontiers in Sustainable Cities)
—
47 articles · 8 categories · gemini summary
Health Monitor
Hourly (silent unless problems): No message if everything is fine.
Alert (when something breaks):
🚨 VPS Health Alert — 2026-02-23 14:00 UTC
miniflux (Miniflux stack): NOT RUNNING
WARNING: Disk at 82% — 7.2GB free
Full status:
wg-easy: running
pihole: running
dnscrypt: running
syncthing: running
miniflux-db: running
Disk: 82% used (32.8GB / 40GB, 7.2GB free)
Memory: 61% used (2441MB / 4000MB, 1559MB available)
Load: 0.3 / 0.2 / 0.1 (1/5/15 min, 2 cores)
Failed SSH logins (24h): 14
Uptime: up 42 days, 3 hours, 12 minutes
Daily summary:
✅ VPS Health — 2026-02-23 07:00 UTC
wg-easy: running
pihole: running
dnscrypt: running
syncthing: running
miniflux: running
miniflux-db: running
Disk: 34% used (13.6GB / 40GB, 26.4GB free)
Memory: 58% used (2320MB / 4000MB, 1680MB available)
Load: 0.1 / 0.2 / 0.1 (1/5/15 min, 2 cores)
Failed SSH logins (24h): 7
Uptime: up 43 days, 3 hours, 12 minutes
Maintenance
Check logs
tail -30 ~/miniflux/scripts/digest.log
tail -30 ~/miniflux/scripts/health.log
Re-run today’s digest
rm ~/miniflux/scripts/.digest-state.json
~/miniflux/scripts/run.sh miniflux-telegram-digest.py
Change LLM provider
Edit ~/miniflux/scripts/.env:
LLM_PROVIDER=claude
ANTHROPIC_API_KEY=sk-ant-...
No code changes needed. Options: gemini (free), claude (~$5/year at Haiku pricing), openai (~$5/year at GPT-4o-mini pricing). Cost estimates assume 1 request/day with a lightweight model — using larger models (Sonnet, GPT-4o) would cost more.
Adjust digest timing
Edit crontab. 0 6 = 6:00 AM UTC — adjust to your timezone.
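If you want the digest at a fixed local hour, a short snippet can compute the matching UTC hour for the crontab's hour field (a sketch; the timezone name is an example — substitute your own):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

def cron_hour_utc(local_hour, tz_name):
    """Return the UTC hour matching local_hour in tz_name, using today's offset.

    Note: cron keeps a fixed clock, so a DST transition will shift the local
    delivery time by an hour until the crontab is updated."""
    local = datetime.now(ZoneInfo(tz_name)).replace(
        hour=local_hour, minute=0, second=0, microsecond=0
    )
    return local.astimezone(ZoneInfo("UTC")).hour

print(cron_hour_utc(6, "Europe/Berlin"))  # use this as the hour field in the crontab
```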
Test the bot manually
source ~/miniflux/scripts/.env
curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-H "Content-Type: application/json" \
-d "{\"chat_id\": \"${TELEGRAM_CHAT_ID}\", \"text\": \"Test from VPS\"}"
Troubleshooting
Telegram returns 400 “chat not found” — You haven’t messaged the bot yet. Open the bot in Telegram, send any message, then retry.
Telegram returns 401 Unauthorized — Bot token is wrong. Check with @BotFather.
“No new entries since last run” — Normal if feeds haven’t published. To force: rm ~/miniflux/scripts/.digest-state.json
Digest has no links — The LLM sometimes ignores URL instructions. Re-run; it’s usually intermittent.
Health monitor says container not running — Check the container name matches exactly. Run docker ps --format '{{.Names}}' and compare with the DOCKER_STACKS list in vps-health-monitor.py.
Gemini returns 429 — Rate limited (unlikely at 1 req/day). Wait or switch to claude in .env.
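The "digest has no links" failure above can also be caught automatically: count the URLs in the summary before sending and retry if there are none. A minimal sketch, not part of the original script — `llm_fn` and `prompt` stand in for the objects the digest script already has:

```python
import re

def count_links(summary):
    """Count http(s) URLs in the LLM's summary text."""
    return len(re.findall(r'https?://\S+', summary))

def summarize_with_links(llm_fn, prompt, min_links=1, retries=1):
    """Call the LLM; retry up to `retries` times if the reply has too few URLs."""
    for _ in range(retries + 1):
        summary = llm_fn(prompt)
        if count_links(summary) >= min_links:
            return summary
    return summary  # give up and send the last attempt anyway
```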
Security Notes
- No ports opened — all connections are outbound from your VPS
- No domain or TLS needed — uses Telegram’s infrastructure for delivery
- Credentials on disk — .env is chmod 600, readable only by your user
- Telegram bot token — if leaked, someone can send messages as your bot but cannot read your messages or access your VPS. Revoke via @BotFather → /revoke
- Miniflux API key — full read/write access to your Miniflux instance, but only used over localhost. If leaked, someone would need VPS access to exploit it
- Data passes through third parties — article titles, excerpts, and summaries go to Google (Gemini API) and Telegram. For public academic feeds this is low sensitivity. Avoid adding private or sensitive feeds without considering this
- Your VPS IP is visible to Telegram and Google via the outbound API calls
- Gemini API key in URL — Google’s API passes the key as a query parameter (?key=...). The connection is HTTPS (encrypted in transit), but the key appears in Google’s server logs associated with your VPS IP. This is Google’s documented API design, not a misconfiguration — but be aware that any HTTP-level debugging or logging you add to the VPS could also capture the key
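If the key-in-URL design bothers you, the Generative Language API also accepts the key as an x-goog-api-key request header, which keeps it out of URL-based logs. A hedged sketch of how llm_gemini could build its request that way (verify against Google's current API documentation before relying on it):

```python
import json
import urllib.request

GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.5-flash:generateContent"
)

def gemini_request(prompt, api_key):
    """Build the Gemini request with the key in a header instead of ?key=..."""
    body = json.dumps({
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"maxOutputTokens": 4096},
    }).encode()
    return urllib.request.Request(
        GEMINI_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )
```

llm_gemini would then urlopen this request exactly as before; the response parsing is unchanged.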
Files Reference
~/miniflux/scripts/
├── .env # API keys and config (chmod 600)
├── run.sh # Wrapper that loads .env
├── miniflux-telegram-digest.py # Daily digest script
├── vps-health-monitor.py # Health monitor script
├── .digest-state.json # Tracks processed entries (auto-generated)
├── digest.log # Digest cron output
└── health.log # Health monitor cron output
Browser Hardening: Your Device, Not Just Your Server
Everything above protects your traffic at the network and server level. But your browser itself leaks data — through cookies, fingerprinting, telemetry, and default search engines. This section addresses the device. No server configuration required; these are changes you make on your laptop and phone.
Switch to Firefox. Chrome is built by Google and integrated into Google’s data infrastructure. Firefox is open-source, maintained by a nonprofit (Mozilla), and designed to be configurable. This switch costs nothing and takes five minutes.
Install uBlock Origin. A browser extension that blocks ads and trackers at the page level — catching what Pi-hole cannot (notably YouTube ads and Facebook sponsored posts). It is free, open-source, and the single most effective privacy tool available in a browser.
Apply Arkenfox settings. Arkenfox is a community-maintained configuration file for Firefox that disables telemetry, hardens privacy defaults, and closes data leaks that Firefox leaves open out of the box. You download one file and place it in your Firefox profile directory. It is not an extension; it is a set of preferences. See: github.com/arkenfox/user.js
Change your default search engine to DuckDuckGo. DuckDuckGo does not track your searches or build a profile of your interests. For specialised academic searching, you will still use Google Scholar or field-specific databases — but your routine searches no longer feed a profile.
Test your setup. The Electronic Frontier Foundation’s “Cover Your Tracks” tool (coveryourtracks.eff.org) analyses your browser’s fingerprint and tracking exposure. Run it before and after these changes to see the difference.
What This Costs
| Component | Provider | Monthly cost |
|---|---|---|
| VPS (2 cores, 4 GB RAM) | Hetzner Cloud (CX22) | ~€4.50 / ~$5 |
| Backup storage (1 TB) | Hetzner Storage Box (BX11) | ~€3.80 / ~$4 |
| Domain name (optional) | Any registrar | ~$1/month (billed annually) |
| Total | | ~$9–10/month |
Everything else — Docker, WireGuard, Pi-hole, dnscrypt-proxy, Syncthing, Miniflux, Kiwix, Stirling-PDF, Firefox, uBlock Origin, Arkenfox, Tailscale — is free and open-source software.
For comparison: iCloud (200 GB) is $2.99/month, Squarespace is $16/month, Dropbox Plus is $11.99/month, Adobe Acrobat is $12.99/month. The platform equivalent of this stack runs $40–50/month, with none of the privacy or control benefits.
What This Will Take
Time to build: If you are starting from zero, expect the foundation (Part 1) and Tailscale/storage (Part 2) to take a weekend of focused work. The library (Part 3) takes an afternoon. The RSS reader (Part 4) takes a few hours. Telegram automation (Part 5) takes an evening. Browser hardening takes an hour.
Ongoing maintenance: A few hours per month. Containers occasionally need updating. RSS feeds break when journals change their URLs. Backup logs should be checked periodically. The AI digest bot sometimes needs its prompts adjusted. None of this is urgent or difficult, but it is real. You are committing to an ongoing maintenance relationship with your infrastructure — and that relationship is the point, not a side effect.
Technical skill required: You do not need to be a programmer. You need to be comfortable with a terminal (text-based command line), willing to read documentation, and patient with error messages.
What This Will Not Do
This will not make you anonymous. Your VPS provider knows your identity and billing information. If served with a legal order, they can associate your server with your identity. This is private infrastructure, not clandestine infrastructure.
This will not protect you from a determined state-level adversary. It protects against commercial surveillance, ISP logging, and the ambient data extraction of platform capitalism. If your threat model involves government surveillance, you need additional tools (Tor, Tails) and additional expertise beyond the scope of this guide.
This will not replace collaborative platforms. You still need email, video conferencing, learning management systems, and institutional tools. What changes is the proportion of your digital life that passes through platforms you do not control. The goal is not total exit. It is a reduction in the surface area of platform dependency, and an increase in your understanding of the dependencies that remain.
This requires maintenance. It is not a product you purchase and forget. It is a practice you maintain. If that sounds like a cost, consider: the alternative is paying someone else to maintain it for you, on terms you cannot inspect, with your data as part of the payment.
A Suggested Order of Operations
If you want to start small and build gradually:
Week 1: Rent a VPS. Install Docker. Set up Tailscale. Get comfortable with SSH. (Part 1, Steps 1–2; Part 2, Tailscale section.) Note: the guide builds the VPN stack before Tailscale, but installing Tailscale early gives you a private mesh from the start — useful for accessing services you’ll add later without relying on SSH tunnels.
Week 2: Deploy Pi-hole. This is visible and immediately satisfying — you will see tracking requests being blocked in real time. (Part 1, Steps 3–6.)
Week 3: Add WireGuard. Route your devices through the VPN. Configure split-tunneling exceptions for banking and government portals. (Part 1, Steps 7–9, Split Tunneling.)
Week 4: Deploy Miniflux. Subscribe to 20–30 RSS feeds from journals and blogs in your field. (Part 4.)
Week 5: Add Syncthing. Move your most-used files off iCloud or Google Drive. Set up Stirling-PDF. (Part 1, Syncthing section; Part 2, Stirling PDF section.)
After that: Add dnscrypt-proxy for DNS encryption. Set up automated backups (Part 2, BorgBackup). Configure volatile logging (Part 1, Step 11). Harden your browser. Build the library (Part 3) if you want it. Set up the Telegram digest (Part 5). Each addition takes hours, not days, because the foundation is already in place.
Appendix: Optional Services
Additional self-hosted tools that complement the core infrastructure. Each is a single Docker container on the existing VPS, accessible via Tailscale. Install whichever ones are useful — none depend on each other.
For each service below: add it to DOCKER_STACKS in ~/miniflux/scripts/vps-health-monitor.py, add its directory to the borg create paths in ~/backup.sh, and test with ~/miniflux/scripts/run.sh vps-health-monitor.py --daily.
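The backup addition follows the same pattern each time. As an illustration only — the actual borg create invocation in your ~/backup.sh will differ — appending a new service directory looks like:

```shell
# In ~/backup.sh — add the new directory to the existing path list
borg create ::'{hostname}-{now}' \
    ~/vpn ~/miniflux ~/kiwix ~/gutenberg-search ~/stirling-pdf \
    ~/uptime-kuma
```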
Uptime Kuma — Status Page & Uptime Monitor
A prettier, more capable alternative to the custom health monitor script. Checks HTTP endpoints, TCP ports, DNS, Docker containers, and sends alerts to Telegram, email, Slack, or Ntfy. Includes a public or private status page.
mkdir -p ~/uptime-kuma
nano ~/uptime-kuma/docker-compose.yml
services:
uptime-kuma:
image: louislam/uptime-kuma:latest
container_name: uptime-kuma
volumes:
- uptime-kuma-data:/app/data
ports:
- "127.0.0.1:3001:3001"
- "YOUR_TAILSCALE_IP:3001:3001"
restart: unless-stopped
volumes:
uptime-kuma-data:
cd ~/uptime-kuma && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:3001
First visit: create an admin account. Then add monitors for each service:
| Monitor Type | Target | Interval |
|---|---|---|
| HTTP | http://127.0.0.1:8585/api/health | 60s |
| HTTP | http://127.0.0.1:8888 | 60s |
| HTTP | http://127.0.0.1:8484 | 60s |
| HTTP | http://127.0.0.1:8090 | 60s |
| TCP | 127.0.0.1:51820 | 60s |
(Caveat: WireGuard itself listens on UDP, which Uptime Kuma’s TCP check cannot probe; if this monitor reports down while the VPN works fine, drop it or point it at wg-easy’s web UI port instead.)
Configure Telegram notifications: Settings → Notifications → Add → Telegram → enter your bot token and chat ID (same ones from ~/miniflux/scripts/.env).
Health monitor addition:
("Uptime Kuma", "~/uptime-kuma", ["uptime-kuma"]),
~30 MB RAM. Can coexist with your Python health monitor or eventually replace it.
Docker socket warning: Some Uptime Kuma tutorials recommend mounting /var/run/docker.sock into the container for direct Docker monitoring. Do not do this. Access to the Docker socket is equivalent to root access on the host — a compromised container with socket access can control every other container and the host OS. The HTTP health checks listed above achieve the same monitoring without this risk.
Ntfy — Self-Hosted Push Notifications
Push notifications directly to your phone via your own server. Replaces Telegram as the notification channel if you want to remove that dependency. Works on Android (native app) and iOS (via web push).
mkdir -p ~/ntfy
nano ~/ntfy/docker-compose.yml
services:
ntfy:
image: binwiederhier/ntfy:latest
container_name: ntfy
command: serve
volumes:
- ntfy-cache:/var/cache/ntfy
- ntfy-data:/etc/ntfy
ports:
- "127.0.0.1:2586:80"
- "YOUR_TAILSCALE_IP:2586:80"
environment:
- NTFY_BASE_URL=http://YOUR_TAILSCALE_IP:2586
restart: unless-stopped
volumes:
ntfy-cache:
ntfy-data:
cd ~/ntfy && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:2586
Send a Test Notification
curl -d "Test from VPS" http://YOUR_TAILSCALE_IP:2586/vps-alerts
Subscribe on Your Phone
Install the Ntfy app (Android: Play Store, iOS: App Store). Add a subscription to http://YOUR_TAILSCALE_IP:2586/vps-alerts.
Use in Scripts
Replace Telegram API calls with:
import urllib.request
req = urllib.request.Request(
"http://YOUR_TAILSCALE_IP:2586/vps-alerts",
data=b"Backup completed successfully",
)
urllib.request.urlopen(req, timeout=30)
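A fuller drop-in replacement for send_telegram could look like this (a sketch; the Title and Priority headers are optional ntfy features — check the ntfy documentation for the full set, and note that ntfy caps message size, so very long digests may still need splitting):

```python
import urllib.request

NTFY_URL = "http://YOUR_TAILSCALE_IP:2586/vps-alerts"  # your Tailscale IP and topic

def ntfy_request(text, title=None, priority=None):
    """Build an ntfy publish request; the message is simply the request body."""
    headers = {}
    if title:
        headers["Title"] = title
    if priority:
        headers["Priority"] = str(priority)
    return urllib.request.Request(NTFY_URL, data=text.encode(), headers=headers)

def send_ntfy(text, **kwargs):
    """POST the message to the ntfy topic."""
    with urllib.request.urlopen(ntfy_request(text, **kwargs), timeout=30) as resp:
        resp.read()
```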
Health monitor addition:
("Ntfy", "~/ntfy", ["ntfy"]),
~10 MB RAM.
Gitea — Self-Hosted Git
Version control for your Hugo sites, scripts, configs, and the Gutenberg search app. A private GitHub without the platform dependency. Lightweight — uses SQLite by default, no separate database needed.
mkdir -p ~/gitea
nano ~/gitea/docker-compose.yml
services:
  gitea:
    image: gitea/gitea:latest
    container_name: gitea
    volumes:
      - gitea-data:/data
    ports:
      - "127.0.0.1:3300:3000"
      - "YOUR_TAILSCALE_IP:3300:3000"
    environment:
      - GITEA__database__DB_TYPE=sqlite3
      - GITEA__server__ROOT_URL=http://YOUR_TAILSCALE_IP:3300/
      - GITEA__server__DOMAIN=YOUR_TAILSCALE_IP
      - GITEA__service__DISABLE_REGISTRATION=true
    restart: unless-stopped

volumes:
  gitea-data:
cd ~/gitea && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:3300
First visit: complete the setup wizard (accept SQLite defaults). Create your admin account. Registration is disabled — you create accounts manually.
Add Your First Repository
First, create an empty repository named gutenberg-search in the Gitea web UI (New Repository, leave initialisation options unchecked). Then, on the VPS:
cd ~/gutenberg-search
git init
git add -A
git commit -m "Initial commit"
git branch -M main
git remote add origin http://YOUR_TAILSCALE_IP:3300/YOUR_USERNAME/gutenberg-search.git
git push -u origin main
Repeat for ~/kiwix, ~/stirling-pdf, your Hugo source, etc. Now every config change is tracked with history.
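One caution before pushing: some of these directories contain secrets (the Telegram token in ~/miniflux/scripts/.env, for instance). A .gitignore along these lines, adjusted to what each directory actually holds, keeps tokens and runtime data out of version control:

```
# Secrets and runtime state: keep these out of the repository
.env
*.env
*.log
__pycache__/
data/
```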
Clone on Your Mac
git clone http://YOUR_TAILSCALE_IP:3300/YOUR_USERNAME/gutenberg-search.git
Works from any device on your Tailscale mesh.
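Pushing over HTTP will prompt for your Gitea username and password on every push. Git's built-in credential helper can remember them; the "store" helper saves them in plain text in your home directory, a reasonable trade-off here since the server is only reachable over your Tailscale mesh:

```shell
# Cache Gitea credentials so pushes don't prompt every time.
# Note: "store" writes them unencrypted to ~/.git-credentials.
git config --global credential.helper store
```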
Health monitor addition:
("Gitea", "~/gitea", ["gitea"]),
~100 MB RAM with SQLite.
Excalidraw — Self-Hosted Whiteboard
A collaborative whiteboard and diagramming tool. Useful for teaching prep, conference presentation diagrams, research sketches. Saves drawings as JSON files.
mkdir -p ~/excalidraw
nano ~/excalidraw/docker-compose.yml
services:
  excalidraw:
    image: excalidraw/excalidraw:latest
    container_name: excalidraw
    ports:
      - "127.0.0.1:5000:80"
      - "YOUR_TAILSCALE_IP:5000:80"
    restart: unless-stopped
cd ~/excalidraw && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:5000
No account needed — it’s a static app that runs in your browser. Drawings are saved locally in the browser or exported as PNG/SVG. For persistent storage, export drawings and keep them in your Syncthing-synced folder.
Health monitor addition:
("Excalidraw", "~/excalidraw", ["excalidraw"]),
~20 MB RAM.
PrivateBin — Encrypted Pastebin
Share text snippets, code, interview excerpts, or draft paragraphs with collaborators. Everything is encrypted client-side — the server never sees plaintext. Links auto-expire. Replaces Google Docs for quick, disposable sharing.
mkdir -p ~/privatebin
nano ~/privatebin/docker-compose.yml
services:
  privatebin:
    image: privatebin/nginx-fpm-alpine:latest
    container_name: privatebin
    volumes:
      - privatebin-data:/srv/data
    ports:
      - "127.0.0.1:8443:8080"
      - "YOUR_TAILSCALE_IP:8443:8080"
    restart: unless-stopped

volumes:
  privatebin-data:
cd ~/privatebin && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8443
Paste text, set an expiration (5 minutes to never), optionally set a password, click Send. Share the URL with your collaborator — they need Tailscale access to reach it, which limits sharing to people on your mesh. For external sharing, you’d need to expose it through a reverse proxy with a public domain.
Health monitor addition:
("PrivateBin", "~/privatebin", ["privatebin"]),
~15 MB RAM.
CyberChef — Data Transformation Toolkit
A browser-based toolbox for encoding, decoding, hashing, parsing, formatting, compressing, and hundreds of other data operations. Occasionally indispensable for data cleaning, format conversion, or inspecting encoded text. Developed and open-sourced by GCHQ.
mkdir -p ~/cyberchef
nano ~/cyberchef/docker-compose.yml
services:
  cyberchef:
    image: ghcr.io/gchq/cyberchef:latest
    container_name: cyberchef
    ports:
      - "127.0.0.1:8817:80"
      - "YOUR_TAILSCALE_IP:8817:80"
    restart: unless-stopped
cd ~/cyberchef && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8817
No account, no state — it’s a static web app. Drag operations into the recipe pane, paste input, get output. Everything runs in your browser; the server just hosts the static files.
Health monitor addition:
("CyberChef", "~/cyberchef", ["cyberchef"]),
~10 MB RAM.
Full Port Summary (All Services)
| Port | Service | Status |
|---|---|---|
| 51820/udp | WireGuard | Core |
| 51821/tcp | wg-easy admin | Core (SSH tunnel) |
| 80/tcp | Pi-hole dashboard | Core (SSH tunnel) |
| 8090/tcp | Miniflux | Core (Tailscale) |
| 8888/tcp | Kiwix | Core (Tailscale) |
| 8585/tcp | Gutenberg Search | Core (Tailscale) |
| 8484/tcp | Stirling PDF | Core (Tailscale) |
| 3001/tcp | Uptime Kuma | Optional (Tailscale) |
| 2586/tcp | Ntfy | Optional (Tailscale) |
| 3300/tcp | Gitea | Optional (Tailscale) |
| 5000/tcp | Excalidraw | Optional (Tailscale) |
| 8443/tcp | PrivateBin | Optional (Tailscale) |
| 8817/tcp | CyberChef | Optional (Tailscale) |
All optional services combined add roughly 185 MB of RAM. On a 4 GB VPS this fits comfortably alongside the core stack, but check actual usage (e.g. with docker stats --no-stream) if you install several of them.