Self-Hosting for Academics: A Complete Guide to Building Your Own Digital Infrastructure v1.0
N.B.: If you come across any errors, or have any suggestions on how to improve this, please don’t hesitate to contact me.
What This Is
This guide walks through the construction of a personal digital infrastructure — step by step, concept by concept — for an academic reader who has no background in systems administration. It assumes you know how to use a computer, install software, and navigate a web browser, but nothing beyond that.
What started as a website on a rented server grew into something larger: a private knowledge infrastructure that handles encrypted browsing, ad blocking, DNS privacy, file synchronisation, scholarly RSS reading, automated research digests, an offline library of 60,000+ public domain books, a search engine for that library, a PDF toolkit, encrypted backups, and health monitoring — all for roughly nine to ten dollars a month.
Each section is modular — you can stop at any layer and still have something useful.
Who This Is For
This guide is written for one person — the person who built it — as a reference for maintaining and extending the infrastructure. It is detailed enough to be reproducible from scratch on a fresh Ubuntu VPS, but it assumes basic comfort with the command line, SSH, and Docker. It does not assume systems administration expertise; the entire thing was built by a qualitative sociologist with the assistance of AI coding tools, not by a professional sysadmin.
If you are an academic, researcher, journalist, or anyone else who wants to reduce platform dependency while maintaining a functional digital workflow, the architecture described here may be useful as a model. The specific tools can be swapped — Miniflux for FreshRSS, Syncthing for Nextcloud, Hugo for Jekyll — but the underlying pattern is consistent: rent a cheap Linux server, run open-source services in Docker containers, connect your devices via a private mesh network, and document everything so you can rebuild it if the server disappears.
How to Use This Guide
The guide is organized chronologically — each part builds on the one before it, roughly in the order things were actually set up. Part 0 covers the website that motivated renting the server in the first place. Parts 1 through 5 add layers of infrastructure on top of it. The Backups section applies to everything. The Appendix describes optional services that can be added independently.
You don’t need to build all of this. Each part is self-contained enough to be useful on its own. A VPS with just a website and a VPN (Parts 0–1) is already a significant improvement over the default platform arrangement. Add services as the need arises, not all at once.
Before You Start: Vocabulary
If you have never administered a server, the terminology can be a barrier. Here is what the key terms mean, in plain language.
Server: A computer that is always on, always connected to the internet, and runs software that responds to requests from other devices. In this guide, “server” means a computer you rent from a company in a data centre — not a machine in your office.
VPS (Virtual Private Server): A slice of a physical server in a data centre that behaves, for your purposes, as if it were your own dedicated computer. You connect to it remotely, install software on it, and it runs 24/7. Think of it as renting a small apartment inside a large building: you have your own keys and walls, but you share the building’s plumbing and electricity.
SSH (Secure Shell): The way you connect to your server. You type commands into a text-based interface (a “terminal”) on your laptop, and those commands execute on the remote server. It looks like a black screen with white text. It is less intimidating than it appears.
Docker: A tool that lets you run software in isolated packages called “containers.” Each container holds one application and everything it needs to function. This means you can run a dozen different services on the same server without them interfering with each other, and if one crashes, the others keep working. Docker is the single most important tool in this guide — it turns complex software installations into one-line commands.
Docker Compose: A file (written in a simple format called YAML) that describes which containers to run and how they should talk to each other. Instead of installing and configuring each service manually, you write one file and Docker sets everything up. The Compose file is, in effect, a blueprint for your entire infrastructure.
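As a purely illustrative sketch (this hypothetical file is not one used later in the guide), a Compose file for a single service looks like this:

```yaml
# docker-compose.yml — a minimal, hypothetical example
services:
  web:
    image: nginx:latest        # which container image to run
    ports:
      - "127.0.0.1:8080:80"    # expose container port 80 on localhost:8080 only
    restart: unless-stopped    # restart automatically if it crashes
```

Running docker compose up -d in the directory containing this file would start the container; docker compose down would stop it.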
DNS (Domain Name System): The system that translates human-readable web addresses (like google.com) into numerical addresses that computers use to find each other. Every time you visit a website, your device first asks a DNS server “what is the numerical address of this domain?” This happens invisibly, dozens or hundreds of times per hour. Whoever handles your DNS queries can see every website you visit.
VPN (Virtual Private Network): A tool that creates an encrypted tunnel between your device and a server somewhere else. All your internet traffic travels through this tunnel, which means anyone watching your local connection (your internet service provider, your university network, a café’s Wi-Fi) sees only encrypted data going to one destination. They cannot directly see which websites you visit or what data you send.
Cron job: A scheduled task that runs automatically at a set time — like an alarm clock for your server. For example, “run the backup script at 3 AM every night.” You set it once and it runs without intervention.
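A crontab entry is one line: five schedule fields, then the command. The script path below is a placeholder for illustration:

```
# Edit your schedule with: crontab -e
# Fields: minute  hour  day-of-month  month  day-of-week  command
0 3 * * * /home/YOUR_USER/backup.sh    # run backup.sh at 3:00 AM every day
```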
Port: A numbered channel on a server through which a specific service communicates. Think of it like apartment numbers in a building: the server is the building, and each service answers at a different port number.
A Note on Operating Systems
This guide assumes you are working from a Mac. All local commands use macOS tools (brew, Terminal.app, macOS file paths). If you use Linux, the commands are largely identical. If you use Windows, you will need WSL (Windows Subsystem for Linux) or PuTTY for SSH, and some local commands will differ — the guide does not cover these differences.
The VPS itself runs Ubuntu Linux. All commands that begin with ssh are executed on the remote server, regardless of your local operating system.
Terminal Commands You Will Use Repeatedly
If you have never used a terminal, here is what the recurring commands do:
cd ~/directory — change into a directory. ~ means your home folder. cd ~/vpn means “go to the vpn folder in my home directory.”
nano filename — open a file for editing in a simple text editor. Save with Ctrl+O, exit with Ctrl+X.
cat filename — print the contents of a file to the screen. Useful for checking what’s inside a config file before changing it.
ls — list the files in the current directory. ls -la shows hidden files and permissions.
mkdir -p ~/directory — create a directory (and any parent directories that don’t exist yet).
cp source destination — copy a file. mv source destination — move or rename a file.
sudo command — run a command as the system administrator. Required for installing software, editing system files, and managing services. You will be prompted for your password.
docker compose up -d — start all containers defined in the current directory’s docker-compose.yml. The -d flag runs them in the background.
docker compose down — stop all containers in the current stack.
docker ps — list all running containers. docker logs containername — show a container’s recent output.
ssh user@ip — connect to a remote server. This is how you access your VPS from your laptop.
scp file user@ip:path — copy a file from your laptop to the server (or vice versa).
chmod 600 file — restrict a file’s permissions so only you can read it. Used for secrets and keys.
These commands account for roughly 90% of what this guide asks you to do. Everything else is explained in context.
Where Am I Running This?
This guide constantly switches between two machines: your laptop (the local machine) and the VPS (the remote server). If you lose track of which one you’re on, things will break or fail silently. Here is how to tell.
On your VPS (after running ssh YOUR_USER@YOUR_VPS_IP):
- Your terminal prompt will show the VPS hostname (e.g., YOUR_USER@vps:~$)
- This is where you create Docker Compose files, launch containers, edit configs, run health checks, and manage backups
- Almost everything in Parts 1–5 happens here
- Type exit to disconnect and return to your laptop
On your laptop (your local terminal, no SSH):
- Your terminal prompt will show your Mac’s name (e.g., yourname@your-mac:~$)
- This is where you edit your Hugo site, run deploy.sh, generate SSH keys, open SSH tunnels, and install local tools like Hugo and Tailscale
- Part 0 (the website) happens entirely here
- SSH tunnels (e.g., ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N) are run from here — they connect your local browser to a service on the VPS
The rule of thumb: if the command starts with ssh, you are about to connect to the VPS or create a tunnel to it. If you are already inside an SSH session and the command starts with docker, nano, sudo, or cd ~/, you are working on the VPS. If you see hugo, brew, deploy.sh, or references to your local file paths (e.g., ~/academic-site/), you are on your laptop.
When in doubt, run hostname — it prints the name of the machine you are currently on.
What You Are Replacing, and Why
Here is what many academics pay for monthly, often without thinking about it:
Cloud storage (iCloud, Google Drive, Dropbox): $5–15/month. These services sync your files across devices by uploading them to the company’s servers. The company can read your files, scan them, and change the terms of service at any time. You are paying for the convenience of not running your own sync tool.
Website hosting (Squarespace, WordPress.com, Wix): $10–20/month. These services host your academic website on their infrastructure. You cannot inspect how your site is served, what data is collected about visitors, or move your content easily if prices change.
RSS / news curation (social media, email newsletters): $0 in money, but you pay in attention and data. Algorithmic feeds decide what scholarship you see, when you see it, and in what order. You have no control over the ranking logic.
PDF tools (online converters, Adobe Acrobat): $0–15/month. Every time you upload a document to an online PDF tool, you are sending your work to a stranger’s server.
Ad-blocking (browser extensions only): $0, but incomplete. Browser-level ad blockers only work inside the browser. They do not block tracking by apps on your phone or by the operating system itself.
The infrastructure described below replaces all of these with tools you control, running on a server you rent, for roughly $9–10/month total.
Using AI to Help You Build
You do not need to be a programmer to follow this guide. If you have ever customised a LaTeX template, debugged a reference manager, or configured a course on an LMS, you have the disposition required. The specific skills can be learned as you go.
This infrastructure was built with the help of an AI coding assistant, and you may do the same. Describing what you want to an LLM (Claude, ChatGPT, or similar) and iteratively revising the code it produces is a viable method for setting up Docker containers, writing configuration files, and troubleshooting errors. The important thing is to audit what the AI produces — read the configuration, understand what each line does, and cross-reference against official documentation. The AI handles syntax; you handle intent and verification.
What This Is — and What It Is Not
This guide describes personal privacy infrastructure and personal research infrastructure. These are related but distinct things, and conflating them leads to overclaiming.
Privacy infrastructure reduces the number of third parties who can observe your digital activity. The VPN, Pi-hole, and dnscrypt-proxy encrypt and filter your traffic so that your ISP, ad networks, and casual observers see less. This is real and measurable — but it is reduction, not elimination.
Research infrastructure provides self-controlled tools for scholarly work. The RSS reader, the Gutenberg library, the search app, the PDF toolkit, the file sync — these are workflow tools that happen to run on your own hardware instead of someone else’s. Their value is independence from platform lock-in, data sovereignty over your own materials, and the capacity to inspect and modify every layer of the stack. This is the dimension that has no commercial equivalent: no platform sells you the ability to understand and meaningfully reconfigure your own infrastructure as an integrated system.
Anonymity infrastructure is something this guide does not build. Anonymity means an adversary cannot determine your identity even with access to the traffic. This requires Tor, multi-hop routing, careful operational security, and behavioral discipline that goes far beyond what is described here. Your VPS has a static IP registered to your name. Your VPS provider can associate all traffic with your billing identity. You are private but not anonymous — and the distinction matters.
Security infrastructure at the enterprise level involves network segmentation, intrusion detection systems, centralized log aggregation, key management services, multi-factor authentication on every layer, regular penetration testing, and dedicated security teams. This guide does none of that. It runs a dozen Docker containers on a single $5 VPS with fail2ban and a firewall. The attack surface is small because the infrastructure is small — one user, one server, one purpose. If this were a production system serving paying customers, the security posture described here would be inadequate. For a personal research stack accessed over a private mesh network, it is proportionate.
Threat Model
Being explicit about what this infrastructure protects against — and what it does not — prevents the guide from making promises it cannot keep.
What it protects against:
- Your ISP logging which websites you visit (VPN encrypts all traffic)
- Advertising networks tracking you across apps and devices (Pi-hole blocks at DNS level)
- DNS queries being readable by intermediaries (dnscrypt-proxy encrypts them in transit to the resolver)
- Platform lock-in and unilateral terms-of-service changes (self-hosted, portable stack)
- Data loss from provider shutdown (encrypted backups, documented configs)
- Casual surveillance of your scholarly reading habits, search patterns, and file contents
What it does not protect against:
- Your VPS provider (Hetzner knows your identity and can comply with German court orders)
- Traffic analysis (an observer can see encrypted packet timing and volume, even without reading content)
- TLS metadata leakage (Server Name Indication exposes which domains you visit unless ECH — Encrypted Client Hello — is enabled, which is not yet widely deployed)
- Compromise of the VPS itself (if the server is breached, everything on it is exposed)
- Compromise of your Tailscale account (this grants network access to all services)
- A determined state-level adversary with the resources to correlate traffic patterns across providers
What it explicitly does not attempt:
- Anonymity (your IP is static and registered to you)
- Anti-forensics (volatile logging helps, but a motivated adversary with host access can still examine running processes and memory)
- High-risk activism support (journalists, dissidents, and whistleblowers need Tor, Tails, and operational security practices that are beyond the scope of this guide)
Known Weaknesses
No infrastructure guide should pretend its subject has no flaws. These are the ones that matter:
Single point of failure. Everything runs on one VPS. If that server goes down, every service goes down simultaneously. Enterprise architecture addresses this with redundancy, failover, and multi-region deployment. For personal infrastructure, the mitigation is simpler: encrypted backups on a separate storage box, documented configurations that can be redeployed on any provider within an hour, and the acceptance that occasional downtime is tolerable for a stack that serves one person.
Docker image trust. The guide uses latest tags for Docker images, which means every docker compose pull could introduce changes you haven’t reviewed. A malicious or broken update to any upstream image could compromise the service. The enterprise practice is to pin specific image versions and update deliberately after testing. For personal use, the practical recommendation is: pin versions for critical services (VPN, Pi-hole, backup tools) and accept latest for low-risk services (Stirling PDF, Excalidraw). Always check changelogs before pulling updates.
Privileged containers. The wg-easy container runs with NET_ADMIN and SYS_MODULE capabilities because WireGuard requires kernel-level network access. A container escape from this container would grant host-level privileges. There is no mitigation short of running WireGuard outside Docker entirely (which adds different complexity). This is a known trade-off of Docker-based VPN setups.
Secrets in scripts. The backup script contains the Borg passphrase in plaintext. The .env file contains API keys. Both are protected by file permissions (chmod 600 / chmod 700), but they exist on disk as readable text. Enterprise infrastructure uses dedicated secrets managers (HashiCorp Vault, AWS Secrets Manager). For personal use, strict file permissions and awareness of the risk are the proportionate response — but never commit these files to a Git repository.
No restore testing. Backups exist and run nightly, but unless you periodically test the restore process, you cannot be certain they work. Add a calendar reminder: every three months, spin up a test environment and restore from the latest Borg archive. A backup that has never been restored is a hope, not a plan.
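A minimal version of that quarterly drill might look like the following sketch. The repository address and archive name are placeholders (the real values live in your backup script); borg check, borg list, and borg extract are standard BorgBackup commands:

```shell
# 1. Confirm the repository is intact and list available archives
borg check ssh://YOUR_STORAGE_BOX_USER@YOUR_STORAGE_BOX/./borg-repo
borg list  ssh://YOUR_STORAGE_BOX_USER@YOUR_STORAGE_BOX/./borg-repo

# 2. Restore an archive into a scratch directory
#    (borg extract writes into the current working directory)
mkdir -p /tmp/restore-test && cd /tmp/restore-test
borg extract ssh://YOUR_STORAGE_BOX_USER@YOUR_STORAGE_BOX/./borg-repo::ARCHIVE_NAME

# 3. Spot-check a few restored files, then clean up
ls -la /tmp/restore-test
```

If step 2 completes and the spot-check looks right, the backup is a plan rather than a hope.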
Docker can bypass UFW. This is a well-documented and widely misunderstood interaction: Docker manipulates iptables directly, adding its own FORWARD rules that bypass UFW’s INPUT chain. This means a port exposed in a Docker Compose file may be reachable from the internet even if UFW has no rule allowing it. The mitigation used throughout this guide is to bind every service to either 127.0.0.1 (localhost only) or YOUR_TAILSCALE_IP (mesh only), so Docker never exposes a port on all interfaces. This is more reliable than relying on UFW to block Docker-exposed ports. If you add new services, always specify the bind address explicitly in the ports directive — never use bare "8080:8080" without an IP prefix.
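The difference between a safe and an unsafe ports entry looks like this (YOUR_TAILSCALE_IP is a placeholder, as elsewhere in this guide):

```yaml
ports:
  - "8080:8080"                    # BAD: binds all interfaces; Docker opens this
                                   #      to the internet, bypassing UFW
  - "127.0.0.1:8080:8080"          # GOOD: reachable only from the VPS itself
                                   #       (e.g., via an SSH tunnel)
  - "YOUR_TAILSCALE_IP:8080:8080"  # GOOD: reachable only over the Tailscale mesh
```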
Environment variable exposure. The run.sh wrapper uses set -a to export all variables from .env into the Python process’s environment. This means API keys and tokens are visible in /proc/<pid>/environ to any process running as the same user. On a single-user VPS this is the expected threat surface, but be aware that a compromised process running as your user can read all secrets from any other process’s environment.
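You can see this exposure for yourself on any Linux machine. This is a self-contained demonstration, not part of the stack; DEMO_SECRET is a throwaway variable invented for the example:

```shell
# Exported variables are inherited by child processes, and any process
# running as the same user can read them from /proc/<pid>/environ.
export DEMO_SECRET="not-so-secret"

# $$ inside the child shell is the child's own PID; we read *its*
# environment file, exactly as another same-user process could.
# Entries are NUL-separated, so convert them to lines first.
bash -c 'tr "\0" "\n" < /proc/$$/environ | grep DEMO_SECRET'
# prints: DEMO_SECRET=not-so-secret
```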
A Note on Redaction
All IPs, domain names, API endpoints, onion addresses, and personally identifying details have been scrubbed from this guide and replaced with placeholder variables (e.g., YOUR_VPS_IP, YOUR_TAILSCALE_IP, YOUR_DOMAIN, YOUR_USER). Part 0 (the website) is intentionally less detailed than other sections because it describes the only public-facing component of the infrastructure. The guide is safe to store, share, or publish — but the actual configuration files on the server contain the real values and should be treated accordingly.
Disclaimer — Read Before Using This Guide
This document describes a personal, single-user infrastructure built for a specific and limited threat model. It is provided for educational and informational purposes only. It is not a production-ready system, not a comprehensive security framework, and not suitable for high-risk contexts (including journalism, activism, or any setting requiring anonymity or adversary-resistant operational security).
Do not copy or deploy this guide verbatim. Many components require adaptation to your environment, careful configuration, and ongoing maintenance. Misconfiguration — especially of networking, Docker port bindings, authentication, or firewall rules — can expose services to the public internet and result in data loss or system compromise.
This setup prioritizes accessibility and independence over hardening. It does not protect against a compromised VPS, account takeover (e.g., Tailscale), traffic analysis, or a determined adversary. Secrets may be stored locally (e.g., in environment files or scripts), and security practices described here are appropriate only for a single-user system under a modest threat model. You are solely responsible for any system you build using this guide. Before deployment, you should understand each component, review official documentation, pin versions for critical services, and implement additional safeguards appropriate to your use case.
Table of Contents
- Part 0: The Website — Hugo, PaperMod, Nginx, Let’s Encrypt, deploy script
- Part 1: VPN + Ad Blocking — WireGuard, Pi-hole, dnscrypt-proxy, Syncthing
- Part 2: Tailscale + Storage Box — Private mesh network, SSHFS mount, Stirling PDF
- Part 3: Offline Library — Kiwix, Project Gutenberg collection, search app
- Part 4: RSS Reader — Miniflux with 100 academic feeds
- Part 5: Telegram Automation — Daily digest and VPS health monitor
- Backups — BorgBackup to Hetzner Storage Box
- Appendix — Optional services (Uptime Kuma, Ntfy, Gitea, Excalidraw, PrivateBin, CyberChef)
Part 0: The Website
Note: This section is intentionally less specific than the rest of the guide. The website is the only public-facing component of the infrastructure, and detailed server configurations, form endpoints, and domain-specific settings are redacted to avoid creating unnecessary exposure. The workflow and architecture are described fully; the implementation details are kept private.
Everything started here. The VPS was rented to host a personal academic website — a static site built with Hugo, themed with PaperMod, edited locally in Obsidian, and deployed via rsync. Every other service in this guide grew from the fact that the server already existed.
How the Site Works
The site is a static HTML site generated by Hugo, a fast open-source static site generator. Content is written in Markdown with TOML frontmatter (+++ delimiters), organized into pages and blog posts. The PaperMod theme provides the layout, dark mode, reading time, breadcrumbs, and responsive design. Hugo compiles everything into a public/ directory of plain HTML, CSS, and assets — no database, no PHP, no server-side processing.
Nginx serves the static files on the VPS. Let’s Encrypt provides HTTPS certificates, auto-renewed by Certbot. A deploy script builds the site locally and rsyncs the output to the server.
Local Setup
Prerequisites
Install Hugo on your Mac:
brew install hugo
Project Structure
mysite/
├── hugo.toml # Site configuration
├── content/
│ ├── _index.md # Home page
│ ├── research.md # Research page
│ ├── teaching.md # Teaching page
│ ├── contact.md # Contact form
│ ├── privacy.md # Privacy policy
│ └── posts/
│ ├── _index.md # Blog index with search + subscribe
│ └── *.md # Blog posts
├── layouts/
│ ├── partials/
│ │ ├── footer.html # Footer override (custom links)
│ │ ├── extend_head.html # Empty (analytics removed)
│ │ └── extend_footer.html # Empty
│ └── shortcodes/
│ ├── rawHTML.html # Allows raw HTML in Markdown
│ ├── news-subscribe.html # Email signup form
│ └── postsearch.html # Client-side blog search
├── static/
│ ├── css/custom.css # Custom styles
│ ├── files/ # PDFs (syllabi, papers)
│ └── media/ # Images
├── themes/
│ └── PaperMod/ # Theme
├── deploy.sh # Build + rsync to VPS
└── public/ # Generated output (not committed)
Configuration
The site is configured in hugo.toml. Key settings:
baseURL = "https://YOUR_DOMAIN/"
theme = "PaperMod"
enableRobotsTXT = true
[markup.goldmark.renderer]
unsafe = true # Required for raw HTML in Markdown
[params]
defaultTheme = "dark"
disableThemeToggle = true
customCSS = ["css/custom.css"]
ShowReadingTime = true
ShowBreadCrumbs = true
The unsafe = true setting allows raw HTML inside Markdown files — needed for contact forms, collapsible sections, and inline styling.
Content Conventions
Pages use TOML frontmatter with +++ delimiters (not YAML ---):
+++
title = "Page Title"
draft = false
showDate = false
showReadingTime = false
showWordCount = false
type = "page"
layout = "page"
+++
Blog posts add date and optional tags:
+++
title = "Post Title"
date = 2025-09-03
draft = false
hiddenInHomeList = true
tags = ["tag1", "tag2"]
+++
Shortcodes
Three custom shortcodes in layouts/shortcodes/:
- rawHTML.html — wraps raw HTML so Hugo doesn’t escape it. Used for forms and custom layouts.
- news-subscribe.html — email subscription form powered by a third-party newsletter service. Takes optional tag and success parameters.
- postsearch.html — client-side blog search. Fetches a JSON index generated by Hugo and searches titles, tags, and summaries with debounced input. Press / to focus.
Theme Overrides
Three files in layouts/partials/ override PaperMod defaults:
- footer.html — copied from the theme and modified to add custom links (Tor mirror, privacy page) to the site footer.
- extend_head.html — empty. Previously contained analytics; removed for GDPR compliance. The file must exist as an empty override — deleting it would cause Hugo to fall back to the theme’s default, which may not be empty. Future <head> additions go here.
- extend_footer.html — empty. Same logic: exists as an intentional override to prevent the theme from injecting unwanted content. Available for future footer additions.
Third-Party Services
The site uses two external services for form handling and email subscriptions. Both are US-based and identified in the site’s privacy policy with links to their respective privacy policies. No analytics, no cookies, no tracking scripts.
Server Setup
Nginx serves static files from the webroot with HTTPS via Let’s Encrypt. The configuration includes an Onion-Location header so Tor Browser users are prompted to switch to the .onion mirror. Certbot handles certificate issuance and auto-renewal.
# Install
sudo apt install nginx certbot python3-certbot-nginx
# Get certificates
sudo certbot --nginx -d YOUR_DOMAIN -d www.YOUR_DOMAIN
# Create webroot
sudo mkdir -p /var/www/html
sudo chown -R $USER:$USER /var/www/html
Certbot modifies the Nginx config automatically and sets up auto-renewal via a systemd timer.
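To confirm that renewal is actually scheduled and working, two standard checks can be run on the VPS (both are stock certbot/systemd commands; the timer name may vary by install method):

```shell
# Check that a renewal timer is registered and active
systemctl list-timers | grep certbot

# Simulate a renewal without touching the real certificates
sudo certbot renew --dry-run
```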
Deploying
A deploy script builds the site locally and rsyncs the output to the VPS:
#!/usr/bin/env bash
set -euo pipefail
REMOTE="YOUR_USER@YOUR_VPS_IP"
WEBROOT="/var/www/html"
hugo --minify --environment production
rsync -azv --delete --progress \
-e "ssh" \
public/ "${REMOTE}:${WEBROOT}/"
The --delete flag removes files on the server that no longer exist locally — important when deleting a page or removing a script. Without it, stale files persist in the webroot.
IMPORTANT: Clear public/ Before Rebuilding After Deletions
Hugo doesn’t always clean up deleted files from public/. If you remove a tracking script or delete a page, the old output may persist:
rm -rf public/
hugo --minify --environment production
./deploy.sh
Editing Workflow
- Edit content files in Obsidian (or any text editor) — they’re plain Markdown
- Preview locally: hugo server -D (the -D flag includes drafts)
- Open http://localhost:1313 to see the live preview
- When satisfied, run ./deploy.sh
- Site is live within seconds
All editing happens on the local machine. The VPS is never edited directly — it only receives the built output via rsync.
GDPR Compliance
The site was cleaned up for GDPR compliance:
- Analytics removed — the extend_head.html partial was emptied. The public/ directory was cleared and rebuilt to ensure no stale tracking scripts remained in the deployed output.
- Privacy page created — content/privacy.md identifies both third-party form/newsletter services as US-based data processors with links to their privacy policies. States that no analytics or cookies are used and that server logs are volatile.
- Privacy link in footer — layouts/partials/footer.html overrides the theme footer to add a “Privacy” link.
A second site managed on the same VPS required no GDPR action — it has no analytics, no forms, and no third-party services.
Part 1: WireGuard VPN + Pi-hole
This is the privacy layer — the foundation of the entire infrastructure. It has three components that work together. The VPN wraps all your internet traffic in an encrypted tunnel. The ad-blocker intercepts unwanted tracking requests inside that tunnel. The DNS encryption ensures that even your domain lookups are private. Think of it as three concentric walls.
Why this matters for academics: If you work on politically sensitive topics, access paywalled resources from insecure networks, or simply prefer that your ISP not have a direct log of the specific sites you visit, this layer provides meaningful protection. The Pi-hole dashboard will also show you, in real time, every domain your devices are trying to reach — which is itself an education in how pervasive commercial surveillance infrastructure actually is.
A complete guide to setting up a private VPN tunnel with DNS-level ad/tracker blocking through your Hetzner Germany VPS, using wg-easy (WireGuard with a web GUI) and Pi-hole.
What you get at the end:
- All VPN-routed traffic encrypted through Germany (subject to split tunneling configuration)
- Ads and trackers blocked across all apps (not just browsers)
- DNS queries resolved on your own server, forwarded to Quad9 (Swiss-based non-profit DNS provider) over encrypted DNS-over-HTTPS — no plaintext DNS in typical operation, barring misconfiguration or fallback conditions
- Admin panels accessible only via SSH tunnel — no publicly exposed web admin interfaces
- Runs on your existing VPS at zero additional cost
Prerequisites
- Hetzner VPS running Ubuntu 24 (or similar Debian-based distro)
- SSH access to the VPS
- Docker and Docker Compose installed (see Step 1 if not)
Step 1: Install Docker (skip if already installed)
SSH into your VPS:
ssh your-username@YOUR_VPS_IP
Install Docker:
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
Log out and back in for the group change to take effect:
exit
ssh your-username@YOUR_VPS_IP
Install Docker Compose and verify:
sudo apt install docker-compose-plugin
docker --version
docker compose version
Step 2: Open the WireGuard Port
WireGuard uses UDP port 51820. Open only this port — the admin panels stay closed and are accessed securely via SSH tunnel instead.
sudo ufw allow OpenSSH
sudo ufw allow 51820/udp
sudo ufw enable
sudo ufw status
Important: ufw enable activates the firewall. On a fresh Hetzner VPS, UFW is installed but inactive by default — meaning ufw allow rules exist on paper but are not enforced until you run ufw enable. Always allow SSH (OpenSSH) before enabling, or you will lock yourself out.
If you’re also using Hetzner’s cloud firewall (check Hetzner Cloud Console → your server → Firewalls), add one inbound rule:
- Protocol: UDP, Port: 51820, Source: Any
Do not open ports 51821 (wg-easy GUI) or 80 (Pi-hole GUI) in either firewall.
Step 3: Generate a Password Hash
wg-easy requires a bcrypt hash rather than a plaintext password. Generate one:
docker run --rm -it ghcr.io/wg-easy/wg-easy wgpw 'YOUR_PASSWORD_HERE'
The single quotes keep the shell from interpreting special characters in your password; --rm removes the throwaway container once it exits.
This outputs a hash like:
$2a$12$cIBKkxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Copy this hash — you’ll need it in Step 4. Remember the plaintext password you used; that’s what you’ll type to log in.
Step 4: Create the Docker Compose File
Create a project directory:
mkdir ~/vpn && cd ~/vpn
nano docker-compose.yml
Paste this configuration:
services:
  wg-easy:
    image: ghcr.io/wg-easy/wg-easy
    container_name: wg-easy
    environment:
      - WG_HOST=YOUR_VPS_IP
      - PASSWORD_HASH=YOUR_HASH_HERE
      - WG_DEFAULT_DNS=10.8.1.3
      - WG_ALLOWED_IPS=0.0.0.0/0
    volumes:
      - ~/.wg-easy:/etc/wireguard
    ports:
      - "51820:51820/udp"
      - "127.0.0.1:51821:51821/tcp"
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
      - net.ipv4.ip_forward=1
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.2
  pihole:
    image: pihole/pihole:latest
    container_name: pihole
    environment:
      - WEBPASSWORD=CHOOSE_A_PIHOLE_PASSWORD
      - DNSMASQ_LISTENING=all
      - PIHOLE_DNS_=10.8.1.4#5053
    volumes:
      - ./pihole/etc-pihole:/etc/pihole
      - ./pihole/etc-dnsmasq.d:/etc/dnsmasq.d
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.3
  dnscrypt:
    image: klutchell/dnscrypt-proxy:latest
    container_name: dnscrypt
    volumes:
      - ./dnscrypt/dnscrypt-proxy.toml:/config/dnscrypt-proxy.toml
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.4
networks:
  vpn_net:
    ipam:
      config:
        - subnet: 10.8.1.0/24
Replace three things before saving:
| Placeholder | Replace with |
|---|---|
| YOUR_VPS_IP | Your Hetzner VPS public IPv4 address |
| YOUR_HASH_HERE | Your bcrypt hash from Step 3 |
| CHOOSE_A_PIHOLE_PASSWORD | A password for Pi-hole’s admin dashboard |
CRITICAL: Escape the $ signs in your hash. Docker Compose interprets $ as variable references. Double every $ in the hash. For example:
# Original hash:
$2a$12$cIBKkxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# In docker-compose.yml (every $ becomes $$):
$$2a$$12$$cIBKkxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
If you skip this, the hash gets corrupted and you’ll get “Unauthorized” when trying to log in.
Save and exit: Ctrl+O, Enter, Ctrl+X.
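The doubling can be done mechanically rather than by hand. A minimal sketch (the hash below is a dummy value, not a real bcrypt hash):

```shell
# Double every $ in the bcrypt hash so Docker Compose treats it literally
# instead of expanding it as a variable reference.
HASH='$2a$12$cIBKkexampleexampleexample'   # dummy value for illustration
ESCAPED=$(printf '%s' "$HASH" | sed 's/\$/$$/g')
echo "$ESCAPED"   # -> $$2a$$12$$cIBKkexampleexampleexample
```

Paste the printed value into the PASSWORD_HASH line of docker-compose.yml.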
How the DNS chain works:
WG_DEFAULT_DNS=10.8.1.3 points WireGuard clients to Pi-hole’s internal IP. Pi-hole resolves queries locally (blocking ads/trackers) and forwards the rest to dnscrypt-proxy (10.8.1.4), which encrypts them using DNS-over-HTTPS before sending to Quad9. The entire DNS chain is encrypted — no plaintext DNS in typical operation, barring misconfiguration or fallback conditions.
Your device → WireGuard tunnel (encrypted) → Pi-hole (blocks ads) → dnscrypt-proxy (encrypts DNS via DoH) → Quad9 (resolves)
What each setting does:
| Setting | Purpose |
|---|---|
| WG_HOST | Your VPS public IP — clients connect to this |
| PASSWORD_HASH | Bcrypt hash protecting the web admin panel |
| WG_DEFAULT_DNS=10.8.1.3 | Points client DNS to Pi-hole |
| WG_ALLOWED_IPS=0.0.0.0/0 | Route ALL client traffic through VPN |
| PIHOLE_DNS_=10.8.1.4#5053 | Pi-hole forwards to dnscrypt-proxy (DNS-over-HTTPS proxy) |
| DNSMASQ_LISTENING=all | Pi-hole accepts DNS queries from the Docker network |
| dnscrypt-proxy upstream | Encrypts DNS queries to Quad9 (dns.quad9.net) via HTTPS |
| Port 51820/udp | WireGuard tunnel |
| Port 51821/tcp | wg-easy admin panel (only via SSH tunnel) |
| 10.8.1.0/24 network | Internal Docker network connecting the three containers |
Step 5: Create the dnscrypt-proxy Config
dnscrypt-proxy needs a config file to know which upstream DNS server to use:
mkdir -p ~/vpn/dnscrypt
nano ~/vpn/dnscrypt/dnscrypt-proxy.toml
Paste:
listen_addresses = ['0.0.0.0:5053']
server_names = ['quad9-doh-ip4-port443-nofilter-ecs-pri']
[sources]
[sources.'public-resolvers']
urls = ['https://raw.githubusercontent.com/DNSCrypt/dnscrypt-resolvers/master/v3/public-resolvers.md', 'https://download.dnscrypt.info/resolvers-list/v3/public-resolvers.md']
cache_file = '/config/public-resolvers.md'
minisign_key = 'RWQf6LRCGA9i53mlYecO4IzT51TGPpvWucNSCh1CBM0QTaLn73Y7GFO3'
Save and exit: Ctrl+O, Enter, Ctrl+X.
This tells dnscrypt-proxy to listen on port 5053 and forward all DNS queries to Quad9 over encrypted DNS-over-HTTPS.
Step 6: Launch Everything
cd ~/vpn
docker compose up -d
Verify all three containers are running:
docker ps
You should see wg-easy, pihole, and dnscrypt all with status Up.
Secure the WireGuard client keys and the Compose file (which contains your Pi-hole password):
chmod 700 ~/.wg-easy
chmod 600 ~/vpn/docker-compose.yml
If any container is in a Restarting state, check its logs:
docker logs wg-easy
docker logs pihole
docker logs dnscrypt
Step 7: Create Client Configs via SSH Tunnel
Since port 51821 is not exposed to the internet, you access the web GUI through an encrypted SSH tunnel.
Open a second terminal window on your laptop (keep your VPS session in the first) and run:
ssh -L 51821:localhost:51821 your-username@YOUR_VPS_IP
This forwards your laptop’s port 51821 through SSH to the VPS. Keep this terminal open.
Now open your browser and go to:
http://localhost:51821
- Enter the plaintext password you used in Step 3 (not the hash)
- Click "+ New"
- Name your first client (e.g., laptop, phone, tablet)
- A config file and QR code are generated automatically
Repeat for each device you want to connect.
When you’re done, you can close the SSH tunnel (Ctrl+C). The VPN keeps running — you only need the tunnel when managing clients.
Step 8: Connect Your Devices
Laptop (macOS / Windows / Linux)
- Download the WireGuard app:
- macOS: App Store → search “WireGuard”
- Windows: https://www.wireguard.com/install/
- Linux:
sudo apt install wireguard
- In the wg-easy web GUI, click the download icon next to your laptop client
- This downloads a .conf file
- Open WireGuard app → “Import Tunnel from File” → select the .conf file
- Click Activate
Phone (Android / iPhone)
- Install the WireGuard app from Play Store or App Store
- In the wg-easy web GUI, click the QR code icon next to your phone client
- Open WireGuard app on phone → tap + → Scan from QR code
- Point camera at the QR code on your screen
- Toggle the tunnel on
Phone tips:
- Android: Add a Quick Settings tile (swipe down → edit → drag WireGuard tile) for one-tap toggling. You can also exclude apps from the VPN: tunnel settings → Excluded Applications → select banking/UPI apps.
- iPhone: No per-app exclusion (iOS limitation). Use On-Demand rules instead: tunnel settings → On Demand → auto-activate on untrusted wifi, deactivate on home wifi. Toggle VPN off manually when using banking apps.
Step 9: Verify Everything Works
Check your IP:
Visit https://whatismyipaddress.com — it should show a German IP address (Hetzner’s range), not your home ISP.
Or from terminal:
curl ifconfig.me
Check for DNS leaks:
Visit https://dnsleaktest.com — click Extended Test. The results should show a single server in Germany. It may display as Cloudflare Frankfurt rather than Quad9 — this can be normal due to routing through shared infrastructure, but it can also indicate a misconfiguration. If you see unexpected results, verify with dnscrypt-proxy logs: docker logs dnscrypt 2>&1 | tail -20. The important thing is: one server, in Germany, not your home ISP’s DNS servers.
Check Pi-hole is blocking ads:
Visit https://ads-blocker.com/testing/ — most test ads should be blocked.
Or from terminal:
nslookup ads.google.com
If Pi-hole is working, this returns 0.0.0.0 or NXDOMAIN (blocked). A real IP address means Pi-hole isn’t intercepting DNS.
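To spot-check several domains at once, the blocked/not-blocked decision can be factored into a small helper. A sketch (is_blocked is a hypothetical name; the domains in the usage comment are examples):

```shell
# is_blocked: classify a resolved address the way Pi-hole blocking appears --
# a blocked domain resolves to 0.0.0.0, ::, or nothing at all.
is_blocked() {
  case "$1" in
    0.0.0.0|::|'') echo blocked ;;
    *)             echo "not blocked ($1)" ;;
  esac
}

# Usage from a VPN-connected device (dig must be installed):
#   for d in ads.google.com doubleclick.net example.com; do
#     printf '%s: %s\n' "$d" "$(is_blocked "$(dig +short "$d" | head -1)")"
#   done
```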
Check for WebRTC leaks:
Visit https://browserleaks.com/webrtc — WebRTC can in some configurations bypass VPNs and expose your real IP through your browser. Modern browsers mitigate this with mDNS, but check anyway. If your real IP appears here, disable WebRTC in browser settings (Arkenfox does this automatically).
All four should confirm:
- IP → German (Hetzner)
- DNS → Quad9 (Swiss)
- Ads → Blocked (Pi-hole)
- WebRTC → No leak
Check the Pi-hole dashboard:
Open another SSH tunnel:
ssh -L 8080:10.8.1.3:80 your-username@YOUR_VPS_IP
Open http://localhost:8080/admin in your browser (use the password you set in WEBPASSWORD in the Compose file). Browse normally for a minute, then refresh — you should see queries climbing and often 20-40% being blocked (varies by device and usage).
Verify the upstream DNS is correct. Go to Settings → DNS in the Pi-hole dashboard. The only upstream server should be 10.8.1.4#5053 (your dnscrypt-proxy container). If Google or anything else is ticked, untick it. Enter 10.8.1.4#5053 in the Custom DNS field if it’s not already set, and hit Save.
Expected performance:
- Ping: 200-300ms (normal for your location → Germany round trip)
- Download: close to your raw ISP speed (minus encryption overhead)
- A VPN will not increase your speed, but may help if your ISP throttles specific services
Step 10: Add Pi-hole Blocklists (Optional)
Pi-hole comes with a default blocklist. For more comprehensive blocking, add these in the Pi-hole admin dashboard → Adlists:
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://raw.githubusercontent.com/hagezi/dns-blocklists/main/adblock/pro.txt
After adding, go to Tools → Update Gravity to activate them.
Step 11: Harden the VPS (Recommended)
Disable Pi-hole Logging
For consistency with your minimal-logging approach, disable Pi-hole’s query log (it’s separate from system logs):
- In the Pi-hole dashboard, go to Settings → Privacy
- Set the privacy level to Anonymous mode (highest level)
- Hit Save
Then disable the long-term query database:
docker exec pihole bash -c "echo 'MAXDBDAYS=0' >> /etc/pihole/pihole-FTL.conf"
docker restart pihole
Ad blocking still works — Pi-hole doesn’t need logs to block. You just lose the dashboard’s historical stats. If you need to debug a blocked site later, temporarily re-enable logging.
Note on Pi-hole password: Since the dashboard is only accessible via SSH tunnel (already authenticated), a Pi-hole password is optional. To remove it: docker exec pihole pihole setpassword and press Enter twice when prompted.
Reduce System Logging
Prevent persistent logs of network activity on the VPS:
sudo nano /etc/systemd/journald.conf
Add under [Journal]:
Storage=volatile
MaxRetentionSec=1day
Save (Ctrl+O, Enter, Ctrl+X), then restart:
sudo systemctl restart systemd-journald
Storage=volatile keeps logs in RAM only — nothing on disk (though still accessible to a privileged process while the system is running), nothing survives a reboot. MaxRetentionSec=1day discards in-memory logs after 24 hours. This trades forensic visibility for reduced data retention — if an intermittent issue or intrusion occurs, you may have no logs to investigate. This is a deliberate choice, not an oversight.
Automatic Security Updates
Install unattended-upgrades to auto-install security patches daily:
sudo apt install unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades
Select Yes when the dialog appears. Verify it’s running:
sudo systemctl status unattended-upgrades
Reboot gap: unattended-upgrades installs patches but does not reboot the server. Many security fixes — especially kernel updates — only take effect after a reboot. A server can report “up to date” while still running a vulnerable kernel from months ago. Either reboot manually after kernel updates (check with needrestart or cat /var/run/reboot-required), or install needrestart to be alerted when a reboot is needed:
sudo apt install needrestart
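The flag-file check can be wrapped for use in scripts or cron. A minimal sketch (reboot_pending is a hypothetical helper; the flag path is the one Ubuntu's update-notifier uses):

```shell
# reboot_pending: report whether Ubuntu has flagged a pending reboot.
# The flag file is created when an installed package requests a reboot.
reboot_pending() {
  flag="${1:-/var/run/reboot-required}"
  if [ -f "$flag" ]; then
    echo "reboot required"
  else
    echo "no reboot needed"
  fi
}

reboot_pending   # check the real flag on the VPS
```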
Change SSH Port (Optional)
Moving SSH off the default port 22 to a random high port reduces commodity scanning noise (but does not prevent targeted scanning):
sudo nano /etc/ssh/sshd_config
Find the line #Port 22 (or Port 22) and change it to:
Port 48922
Save, then update the firewall before restarting SSH:
sudo ufw allow 48922/tcp
sudo ufw status # Verify new port is listed
sudo systemctl restart sshd
Note: since Ubuntu 22.10, SSH is socket-activated by default, and the listening port can come from the systemd socket unit rather than sshd_config alone. If the new port doesn’t take effect, run sudo systemctl daemon-reload && sudo systemctl restart ssh.socket, or disable socket activation entirely with sudo systemctl disable --now ssh.socket before restarting SSH.
Test the new port in a separate terminal before closing your current session:
ssh -p 48922 your-username@YOUR_VPS_IP
If that works, remove the old port:
sudo ufw delete allow 22/tcp
After this change, all SSH commands need -p 48922:
# Regular SSH:
ssh -p 48922 your-username@YOUR_VPS_IP
# SSH tunnels for admin panels:
ssh -p 48922 -L 51821:localhost:51821 your-username@YOUR_VPS_IP
ssh -p 48922 -L 8080:10.8.1.3:80 your-username@YOUR_VPS_IP
If you’re using Hetzner’s cloud firewall, add an inbound rule for TCP port 48922.
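To avoid typing -p 48922 on every command, an entry in ~/.ssh/config on your laptop sets it once. An optional sketch (vps is an arbitrary alias; substitute your real IP and username):

```
Host vps
    HostName YOUR_VPS_IP
    User your-username
    Port 48922
```

After this, ssh vps and ssh -L 51821:localhost:51821 vps work without the port flag.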
Install Fail2ban
Protects SSH from brute-force attacks by banning IPs after repeated failed login attempts:
sudo apt install fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban
Check how many IPs it’s currently blocking:
sudo fail2ban-client status sshd
Switch to SSH Key-Only Authentication (Optional)
If you’re still using password login for SSH, key-based auth is more secure:
sudo nano /etc/ssh/sshd_config
Set:
PasswordAuthentication no
PermitRootLogin no
AllowUsers YOUR_USER
PermitRootLogin no prevents direct root login even if the root account has a password. AllowUsers restricts SSH to only your username — any other system account is locked out entirely. Replace YOUR_USER with your actual username.
Save, then restart:
sudo systemctl restart sshd
Only do this after confirming your SSH key works, or you’ll lock yourself out. Test by opening a second terminal and SSH-ing in before closing your current session.
Split Tunneling (Optional)
By default, ALL traffic routes through Germany. This can cause issues with banking sites and payment apps that flag foreign IPs.
On Android
In the WireGuard app → tap your tunnel → Edit → Excluded Applications → select your banking apps, payment apps (digital wallets, UPI, etc.), and regional streaming apps.
On iPhone
No per-app exclusion available. Toggle VPN off manually for banking, or set up On-Demand rules to auto-disable on your home wifi.
On Laptop
WireGuard’s AllowedIPs is an allow-list, not a deny-list — there is no simple way to exclude specific IP ranges. The common trick of using 0.0.0.0/1, 128.0.0.0/1 still covers the entire IPv4 space (it overrides the default route via more-specific routes, but excludes nothing). For laptops, the practical approach is to toggle the VPN off when you need banking or government portals, then toggle it back on. This is less elegant than Android’s per-app exclusion, but it is the honest answer for WireGuard on desktop.
Syncthing — File Sync Across Devices (Optional)
Syncthing syncs files between your devices peer-to-peer with the VPS acting as an always-on peer (and relay fallback if needed). Useful for keeping your Obsidian vault, research papers, teaching materials, or any folder in sync across your Mac and phone.
Add Syncthing to Docker Compose
In ~/vpn/docker-compose.yml, add this service (same indentation level as the other services):
  syncthing:
    image: syncthing/syncthing:latest
    container_name: syncthing
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - ~/syncthing/config:/var/syncthing/config
      - ~/syncthing/data:/var/syncthing/data
    ports:
      - "22000:22000/tcp"
      - "22000:22000/udp"
      - "21027:21027/udp"
    restart: unless-stopped
    networks:
      vpn_net:
        ipv4_address: 10.8.1.5
Note on port exposure: Unlike other services in this guide, Syncthing’s sync ports (22000, 21027) are bound to all interfaces, not just localhost or the Tailscale IP (introduced in Part 2). This is intentional — Syncthing needs to accept direct connections from your other devices to enable peer-to-peer sync. The web GUI (port 8384) is not exposed and is accessible only via SSH tunnel. The sync protocol itself is encrypted and authenticated; open sync ports do not expose your files.
Create directories and set permissions
mkdir -p ~/syncthing/config ~/syncthing/data
sudo chown -R 1000:1000 ~/syncthing
Open firewall ports and launch
sudo ufw allow 22000/tcp
sudo ufw allow 22000/udp
sudo ufw allow 21027/udp
cd ~/vpn
docker compose up -d
Verify:
docker ps | grep syncthing
Should show status Up, not Restarting.
Access the Syncthing Dashboard
Important: All SSH tunnel commands must be run from your local machine’s terminal (Mac/laptop), NOT from inside an existing SSH session to the VPS.
Install Syncthing on Your Mac
brew install syncthing
brew services start syncthing
Or download from https://syncthing.net/downloads/.
Syncthing’s GUI may not always run on port 8384. Find the actual port:
lsof -i -P | grep syncthing
Look for a line with TCP localhost:XXXXX (LISTEN) — that’s the port. Open http://localhost:XXXXX in your browser to access your Mac’s Syncthing dashboard.
Access the VPS Syncthing Dashboard
Since the Mac’s Syncthing may already be using port 8384, use a different local port for the VPS tunnel. Run this from a terminal on your Mac (not inside an SSH session):
ssh -L 8385:10.8.1.5:8384 your-username@YOUR_VPS_IP
Note: This tunnel targets 10.8.1.5 (Syncthing’s IP on the Docker bridge network), not 127.0.0.1 like other SSH tunnels in this guide. That’s because Syncthing’s web GUI (port 8384) is not exposed in the Docker Compose file — it’s only accessible inside the Docker network. SSH on the VPS host can route into Docker bridge networks, so this works, but the access pattern is different from the other admin panels.
Open your browser and go to:
http://localhost:8385
You now have two dashboards:
- http://localhost:XXXXX — your Mac’s Syncthing (the port you found above)
- http://localhost:8385 — your VPS’s Syncthing (via SSH tunnel)
On first launch, the VPS dashboard will prompt you to set a GUI password. Set one — even though it’s behind an SSH tunnel, it’s good practice.
Connect the Devices
- Get the VPS Device ID: In the VPS Syncthing dashboard (localhost:8385), go to Actions > Show ID – copy it
- Add VPS to Mac: In your Mac’s Syncthing dashboard (localhost:XXXXX), click Add Remote Device > paste the VPS Device ID > Save
- Accept on VPS: The VPS dashboard will show a notification to accept the new device – click Add Device > Save
- Wait for connection: Both dashboards should show the other device as “Connected” (green)
Share Your First Folder
- On the VPS dashboard (localhost:8385), click Add Folder
- Folder Label: Obsidian
- Folder Path: /var/syncthing/data/obsidian
- Click the Sharing tab > tick your Mac
- Click the File Versioning tab > select Staggered File Versioning (keeps deleted/changed files for 30 days on the VPS – see below)
- Click Save
- On the Mac dashboard (localhost:XXXXX), a notification will appear – click Add > set the Folder Path to your existing vault location (e.g., ~/Documents/Obsidian) > Save
The initial sync will copy everything from your Mac to the VPS. Don’t open Obsidian until the sync finishes – you can watch progress on either dashboard.
Enable Staggered File Versioning
Do this on the VPS side for every shared folder before the first sync. Staggered File Versioning keeps old versions of deleted or changed files in a .stversions folder on the VPS with decreasing frequency:
- Every version for the first 24 hours
- One version per day for the first 30 days
- One version per week for the first 6 months
- One version per year after that
This means if you accidentally delete a file on your Mac, the deletion syncs to the VPS, but the old version is preserved in .stversions and can be recovered. Set this on the VPS rather than the Mac so backup copies live on the server, not on your laptop.
To enable: click the folder on the VPS dashboard > Edit > File Versioning tab > select Staggered File Versioning > Save.
Syncthing syncs deletions. If you delete a file on your Mac, it’s deleted on the VPS too. Staggered versioning is your safety net – without it, deletions are permanent and immediate.
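Recovering a preserved copy means finding the newest file in .stversions. A sketch helper, assuming Syncthing's default naming of name~YYYYMMDD-HHMMSS.ext (newest_version and the paths shown are illustrative):

```shell
# newest_version: print the most recently modified preserved copy matching a
# pattern inside a .stversions directory.
newest_version() {
  dir="$1"; pattern="$2"
  # pattern is deliberately unquoted so the shell globs it within $dir
  ls -t "$dir"/$pattern 2>/dev/null | head -1
}

# Usage on the VPS, e.g. to find the latest preserved copy of notes.md:
#   newest_version /var/syncthing/data/obsidian/.stversions 'notes~*.md'
```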
Prevent Sync Conflicts
In your synced folder, create a file called .stignore to exclude files that change per-device and cause conflicts:
.obsidian/workspace.json
.obsidian/workspace-mobile.json
.trash
Adding More Folders
Each folder you want to sync is added as a separate shared folder in Syncthing. Repeat the same process for each:
- VPS dashboard > Add Folder > set path (e.g., /var/syncthing/data/papers) > give it a label > Sharing tab > tick Mac > File Versioning tab > Staggered File Versioning > Save
- Mac dashboard > accept the notification > point to your local folder (e.g., ~/Documents/Papers) > Save
Takes about 30 seconds per folder. Keeping folders separate (rather than syncing one parent folder) lets you control which devices get which folders – e.g., Obsidian on your phone, but not teaching materials.
iPhone
There’s no official Syncthing app for iOS. Use Möbius Sync from the App Store (~$5 one-time) – it’s a third-party Syncthing client that works with the same protocol.
Part 2: Tailscale + Storage Box
The VPN handles encrypted browsing. Tailscale handles private access to services. The distinction matters: WireGuard routes your internet traffic through Germany; Tailscale creates a mesh network that lets your devices reach services on the VPS without opening any ports to the public internet. They complement each other.
Tailscale Mesh Network
Install on the VPS
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
Follow the authentication URL. Note your Tailscale IP:
tailscale ip -4
Returns something like 100.x.x.x. Services bound to this IP are accessible only from devices on your Tailscale network.
Install on Other Devices
- macOS: brew install tailscale, or download from https://tailscale.com/download
- iPhone/Android: Install from App Store / Play Store
- Linux: Same curl command as above
Sign in with the same account everywhere.
Verify
From your laptop (with Tailscale running):
ping YOUR_TAILSCALE_IP
If it responds, your mesh is working. Any service bound to this IP on the VPS is now accessible from your devices only.
A Note on Access Paths
Services bound to the Tailscale IP are reachable two ways:
- Tailscale on — your device connects directly to the VPS via the mesh network. Works regardless of whether WireGuard is active.
- WireGuard on — your device routes all traffic through the VPS. Since the Tailscale IP is a local interface on the VPS, requests to it resolve locally on the server. This works even without Tailscale running on your device.
The practical difference: WireGuard routes everything through Germany (browsing, streaming, all traffic). Tailscale connects only to your services. If you’ve toggled WireGuard off — for banking, for government portals, for streaming — Tailscale still gives you access to Kiwix, the search app, Stirling PDF, and everything else on the VPS without rerouting your entire internet connection.
A Note on Tailscale’s Trust Model
Tailscale is not self-hosted. It uses a coordination server operated by Tailscale Inc. (a US company) to manage device identity and key exchange. The actual traffic between your devices is peer-to-peer and encrypted — Tailscale’s servers cannot read the encrypted contents, though relay servers can observe connection metadata — but the coordination server knows which devices are on your network and when they’re online. If your Tailscale account is compromised (e.g., via a compromised Google or Microsoft login), an attacker gains network-level access to every service on your mesh. Mitigation: enable multi-factor authentication on your Tailscale account, and prefer a login provider that supports hardware security keys. For those who want to eliminate this dependency entirely, Headscale is an open-source, self-hosted alternative to Tailscale’s coordination server — but it adds significant operational complexity for marginal benefit at the personal infrastructure scale.
Mount Hetzner Storage Box via SSHFS
The Storage Box provides 1 TB of remote storage. Mounting it via SSHFS makes it appear as a local directory, usable by Docker containers and scripts. This is where the Gutenberg library and other large files live — the VPS’s 40 GB local disk is too small for a 200 GB book collection.
Install and Mount
sudo apt install sshfs
sudo mkdir -p /mnt/storagebox
If you already set up an SSH key for BorgBackup, reuse it. Otherwise:
ssh-keygen -t ed25519 -f ~/.ssh/storagebox -N ""
echo "put ~/.ssh/storagebox.pub .ssh/authorized_keys" | sftp -P 23 uXXXXXX@uXXXXXX.your-storagebox.de
Mount:
sudo sshfs -o allow_other,_netdev,IdentityFile=~/.ssh/storagebox,Port=23 \
uXXXXXX@uXXXXXX.your-storagebox.de:/ /mnt/storagebox
Auto-Mount on Boot
Add to /etc/fstab:
uXXXXXX@uXXXXXX.your-storagebox.de:/ /mnt/storagebox fuse.sshfs _netdev,allow_other,IdentityFile=/home/YOUR_USER/.ssh/storagebox,Port=23,x-systemd.automount,reconnect 0 0
Test: sudo mount -a
Performance Note
SSHFS adds a network round trip to every file read. For sequential access — streaming a book, downloading a file — this is negligible. For random access — searching a ZIM archive, querying a SQLite database — it’s painfully slow. Rule of thumb: anything that needs fast random reads (databases, search indexes) goes on local VPS disk. Everything else (books, backups, large archives) goes on the Storage Box.
Mount drop warning: If the SSH connection to the Storage Box drops (network interruption, Hetzner maintenance), any process trying to read from /mnt/storagebox will hang — potentially entering uninterruptible I/O wait (D state), which can make the entire VPS feel frozen even if the CPU is idle. The reconnect option in the fstab entry helps, but doesn’t prevent brief hangs during reconnection. If your VPS becomes unresponsive, check the mount first: df -h /mnt/storagebox. If it hangs, the mount is stale — unmount and remount: sudo umount -l /mnt/storagebox && sudo mount -a. Services that depend on the Storage Box (Kiwix) will be unavailable during the interruption; services on local disk (Gutenberg Search, Miniflux, Pi-hole) are unaffected.
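That recovery sequence can be wrapped in a guard that scripts run before touching the mount. A sketch (check_mount is a hypothetical helper; the five-second timeout is arbitrary):

```shell
# check_mount: probe the SSHFS mount with a timeout so a stale mount cannot
# hang the calling script; lazy-unmount and remount if it is unresponsive.
check_mount() {
  mp="${1:-/mnt/storagebox}"
  if timeout 5 df -P "$mp" >/dev/null 2>&1; then
    echo "ok: $mp responsive"
  else
    echo "stale: remounting $mp"
    sudo umount -l "$mp" && sudo mount -a
  fi
}

# Usage at the top of a backup or library script:
#   check_mount /mnt/storagebox || exit 1
```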
BorgBackup — Encrypted Nightly Backups to Hetzner Storage Box (Optional)
BorgBackup sends encrypted, compressed, deduplicated backups of your entire VPS configuration to a Hetzner Storage Box every night. If your VPS dies, you can rebuild everything from the backup.
Prerequisites
- A Hetzner Storage Box with SSH enabled in its settings panel
- Your Storage Box username (format: uXXXXXX) and hostname (format: uXXXXXX.your-storagebox.de)
Install BorgBackup
sudo apt install borgbackup
Set Up SSH Key Authentication
If you already generated and uploaded a Storage Box key during SSHFS setup above, skip this step — it’s the same key. If not, generate one:
ssh-keygen -t ed25519 -f ~/.ssh/storagebox -N ""
Upload the public key via SFTP (Hetzner doesn’t allow ssh-copy-id on Storage Boxes):
echo "put ~/.ssh/storagebox.pub .ssh/authorized_keys" | sftp -P 23 uXXXXXX@uXXXXXX.your-storagebox.de
Warning: This put command overwrites the authorized_keys file. If you have already uploaded a key (e.g., during SSHFS setup), running this again with a different key will revoke the previous one. If using the same key for both, skip this step entirely.
Enter your Storage Box password when prompted. If the .ssh directory doesn’t exist, connect manually first:
sftp -P 23 uXXXXXX@uXXXXXX.your-storagebox.de
mkdir .ssh
chmod 700 .ssh
exit
Then run the upload command again.
Test the connection:
ssh -i ~/.ssh/storagebox -p 23 uXXXXXX@uXXXXXX.your-storagebox.de
You’ll get a “restricted shell” message — that’s normal. As long as it doesn’t ask for a password, key auth is working.
Initialize the Borg Repository
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
borg init --encryption=repokey ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups
Choose a strong passphrase. Write it down somewhere safe — you need it to restore backups.
Export the Encryption Key
If the Storage Box dies, you lose the repo key and can’t decrypt your backups even with the passphrase. Export a backup of the key:
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
borg key export ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups ~/borg-key-backup.txt
cat ~/borg-key-backup.txt
Save this key file somewhere safe (password manager, printed on paper). You need both the passphrase and this key to restore. Lose either one and your backups are unrecoverable.
Create the Backup Script
nano ~/backup.sh
#!/bin/bash
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
export BORG_REPO="ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups"
export BORG_PASSPHRASE='YOUR_PASSPHRASE_HERE'
# Dump Miniflux database before backup
docker exec miniflux-db pg_dump -U miniflux miniflux > /home/YOUR_USER/miniflux/db-backup.sql
# Create backup
sudo --preserve-env borg create \
--compression zstd \
::vps-{now:%Y-%m-%d-%H%M} \
/home/YOUR_USER/vpn/docker-compose.yml \
/home/YOUR_USER/vpn/dnscrypt \
/home/YOUR_USER/vpn/pihole \
/home/YOUR_USER/.wg-easy \
/home/YOUR_USER/syncthing \
/home/YOUR_USER/miniflux \
/home/YOUR_USER/kiwix \
/home/YOUR_USER/gutenberg-search \
/home/YOUR_USER/stirling-pdf \
/home/YOUR_USER/.ssh \
/var/www/hugo \
/var/www/other \
/etc/nginx
# Prune old backups: keep 7 daily, 4 weekly, 6 monthly
sudo --preserve-env borg prune \
--keep-daily 7 \
--keep-weekly 4 \
--keep-monthly 6
# Free up space from pruned backups
sudo --preserve-env borg compact
# Fix cache permissions (sudo changes ownership to root)
sudo chown -R YOUR_USER:YOUR_USER /home/YOUR_USER/.cache/borg /home/YOUR_USER/.config/borg
Replace uXXXXXX with your Storage Box username and YOUR_PASSPHRASE_HERE with your passphrase. Use single quotes around the passphrase to prevent bash from interpreting special characters.
Make it executable and restrict permissions (the file contains your passphrase):
chmod 700 ~/backup.sh
Test the Backup
~/backup.sh
First run takes a minute or two. Verify:
export BORG_RSH="ssh -i /home/YOUR_USER/.ssh/storagebox"
export BORG_REPO="ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups"
export BORG_PASSPHRASE='YOUR_PASSPHRASE_HERE'
borg list
Should show an archive like vps-2026-02-22-1824.
Automate with Cron
crontab -e
Add:
0 3 * * * /home/YOUR_USER/backup.sh >> /home/YOUR_USER/backup.log 2>&1
This runs the backup every night at 3 AM UTC. Check ~/backup.log if you want to verify it ran.
Additional cron jobs (added after Miniflux and Telegram setup):
# Daily reading digest at 6:00 AM UTC
0 6 * * * ~/miniflux/scripts/run.sh miniflux-telegram-digest.py >> ~/miniflux/scripts/digest.log 2>&1
# Hourly health check — alerts only on problems
0 * * * * ~/miniflux/scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
# Daily health summary at 7:00 AM UTC
0 7 * * * ~/miniflux/scripts/run.sh vps-health-monitor.py --daily >> ~/miniflux/scripts/health.log 2>&1
# Weekly log rotation — prevents logs from growing indefinitely
0 0 * * 0 tail -500 ~/miniflux/scripts/health.log > ~/miniflux/scripts/health.log.tmp && mv ~/miniflux/scripts/health.log.tmp ~/miniflux/scripts/health.log
0 0 * * 0 tail -200 ~/miniflux/scripts/digest.log > ~/miniflux/scripts/digest.log.tmp && mv ~/miniflux/scripts/digest.log.tmp ~/miniflux/scripts/digest.log
0 0 * * 0 tail -200 ~/backup.log > ~/backup.log.tmp && mv ~/backup.log.tmp ~/backup.log
See Part 5 for full setup.
What Gets Backed Up
| Path | Contents |
|---|---|
| ~/vpn/docker-compose.yml | All container configurations |
| ~/vpn/dnscrypt | dnscrypt-proxy config |
| ~/vpn/pihole | Pi-hole settings and blocklists |
| ~/.wg-easy | WireGuard client configs |
| ~/syncthing | Syncthing config and synced data |
| ~/.ssh | All SSH keys (including Storage Box key) |
| /var/www/hugo | Hugo website files |
| /var/www/other | Other website files |
| /etc/nginx | Nginx configs for both sites |
| ~/miniflux | Miniflux docker-compose, scripts, .env, database dump |
| ~/kiwix | Kiwix docker-compose (ZIM files live on Storage Box, not backed up here) |
| ~/gutenberg-search | Search app source, Dockerfile, docker-compose |
| ~/stirling-pdf | Stirling PDF docker-compose |
Restoring from Backup
To see what’s in a backup:
borg list ::vps-2026-02-22-1824
To restore everything to a temporary directory:
mkdir ~/restore
cd ~/restore
borg extract ::vps-2026-02-22-1824
To restore a specific file:
borg extract ::vps-2026-02-22-1824 home/YOUR_USER/vpn/docker-compose.yml
Backup Retention
Borg keeps:
- Last 7 daily backups
- Last 4 weekly backups
- Last 6 monthly backups
Older backups are pruned automatically. Deduplication means only changes are stored, so space usage stays small.
Test Your Restores
A backup that has never been tested is a hope, not a plan. Every three months, verify that recovery actually works:
- Spin up a temporary VPS (Hetzner bills hourly — a one-hour test costs cents)
- Install Borg:
sudo apt install borgbackup - Pull the latest archive and extract to a test directory
- Confirm configs, scripts, and database dumps are intact
- Delete the test VPS
This takes thirty minutes and confirms that your nightly backups are not silently failing, corrupting, or missing critical paths. Add a calendar reminder.
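The drill condenses to a handful of commands on the throwaway VPS (repository URL and archive name are placeholders — substitute your own):

```shell
# Quarterly restore drill — run on a temporary VPS, then delete it.
sudo apt install -y borgbackup
export BORG_REPO='ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./backups'  # your repo
export BORG_PASSPHRASE='your-passphrase'                                  # from your password manager

borg list                                 # newest archive is at the bottom
mkdir -p ~/restore-test && cd ~/restore-test
borg extract --dry-run ::ARCHIVE-NAME     # verify the archive reads cleanly
borg extract ::ARCHIVE-NAME               # then actually extract
ls home/*/miniflux/db-backup.sql          # spot-check critical files
```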
Backup Security
These backups contain SSH keys, the Borg passphrase, API tokens, Docker Compose files, and the complete infrastructure configuration. Anyone with access to a backup archive and the Borg passphrase has the equivalent of root access to your entire infrastructure. Treat backup archives with the same care as your SSH private keys — they are, in effect, a portable copy of your server’s identity.
Maintenance
Managing Containers
Each service runs in its own Docker Compose stack. docker compose commands only affect the stack in the current directory — running docker compose down from ~/vpn stops the VPN stack, not Miniflux or Kiwix.
VPN stack (from ~/vpn):
docker compose down # Stop VPN stack only
docker compose up -d # Start VPN stack
docker compose restart # Restart VPN stack
docker logs wg-easy # wg-easy logs
docker logs pihole # Pi-hole logs
docker logs dnscrypt # dnscrypt-proxy logs
docker logs syncthing # Syncthing logs
Other stacks — same commands, different directories:
cd ~/miniflux && docker compose down && docker compose up -d
cd ~/kiwix && docker compose down && docker compose up -d
cd ~/gutenberg-search && docker compose down && docker compose up -d
cd ~/stirling-pdf && docker compose down && docker compose up -d
docker logs and docker ps are container-global — they work from any directory.
Updating
cd ~/vpn
docker compose pull # Pull latest images
docker compose down
docker compose up -d
Repeat for each stack (~/miniflux, ~/kiwix, ~/gutenberg-search, ~/stirling-pdf).
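Repeating that by hand gets tedious; a loop covers all five stacks in one pass (the directory list assumes the layout in this guide — adjust if yours differs):

```shell
# Pull, stop, and restart every Docker Compose stack on this VPS
for d in ~/vpn ~/miniflux ~/kiwix ~/gutenberg-search ~/stirling-pdf; do
    (cd "$d" && docker compose pull && docker compose down && docker compose up -d)
done
```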
Client configs are preserved in ~/.wg-easy/. Pi-hole settings are preserved in ~/vpn/pihole/. dnscrypt-proxy config is preserved in ~/vpn/dnscrypt/. Syncthing config and data are preserved in ~/syncthing/.
Accessing Admin Panels
All panels require SSH tunnels. Run these commands from a terminal on your local machine (Mac/laptop), NOT from inside an SSH session to the VPS:
# wg-easy (manage VPN clients):
ssh -L 51821:localhost:51821 your-username@YOUR_VPS_IP
# Then open: http://localhost:51821
# Pi-hole (view blocked queries, manage blocklists):
ssh -L 8080:10.8.1.3:80 your-username@YOUR_VPS_IP
# Then open: http://localhost:8080/admin
# Syncthing (manage synced folders and devices):
ssh -L 8385:10.8.1.5:8384 your-username@YOUR_VPS_IP
# Then open: http://localhost:8385
# Miniflux (RSS reader):
ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N
# Then open: http://localhost:8090
View Connected Clients
Via the wg-easy web GUI, or from the VPS terminal:
docker exec wg-easy wg show
Troubleshooting
| Problem | Fix |
|---|---|
| “Unauthorized” on wg-easy login | The bcrypt hash was corrupted. Make sure every $ in the hash is doubled ($$) in docker-compose.yml. Recreate with docker compose down && docker compose up -d |
| Can’t reach web GUI | Make sure your SSH tunnel is running: ssh -L 51821:localhost:51821 user@VPS_IP, then open http://localhost:51821 |
| Container stuck in “Restarting” | Check logs: docker logs wg-easy, docker logs pihole, or docker logs dnscrypt |
| Client connects but no internet | Check docker ps — all three containers must be Up. Restart with docker compose restart |
| Ads still showing | Some ads (YouTube, Facebook) are served from the same domain as content and can’t be DNS-blocked. Use uBlock Origin in your browser for those |
| Slow speeds | 200-300ms ping is normal for your location → Germany. Download speeds should be close to your ISP speed. Toggle VPN off for latency-sensitive tasks |
| Banking app blocked | Exclude the app from VPN (Android) or toggle VPN off temporarily (iPhone) |
| “Handshake did not complete” | Firewall blocking UDP 51820 — check both ufw and Hetzner cloud firewall |
| Container not starting after reboot | Ensure Docker is enabled: sudo systemctl enable docker |
| Can’t SSH after port change | Use ssh -p 48922 user@VPS_IP. If locked out, use Hetzner’s web console to fix /etc/ssh/sshd_config |
| Syncthing permission denied crash loop | Run sudo chown -R 1000:1000 ~/syncthing then cd ~/vpn && docker compose restart syncthing |
| SSH tunnel “Address already in use” | An old tunnel is still running. Find it with sudo lsof -i :PORT and kill the PID. Then retry the tunnel |
| Mac Syncthing dashboard not on port 8384 | Run lsof -i -P | grep syncthing and look for TCP localhost:XXXXX (LISTEN) — open that port in your browser |
| Syncthing “no configuration file provided” | You’re not in the right directory. Run cd ~/vpn first, then docker compose restart syncthing |
| Borg “Permission denied” on cache/config | Run sudo chown -R YOUR_USER:YOUR_USER /home/YOUR_USER/.cache/borg /home/YOUR_USER/.config/borg |
| Borg “passphrase is incorrect” | Special characters in passphrase being interpreted by bash. Use single quotes around the passphrase in backup.sh |
| Borg “stale lock” messages | Normal after a failed run. Borg cleans them up automatically on the next run |
| Pi-hole dashboard shows no queries | Client configs may still use old DNS. Delete and recreate clients in wg-easy, re-scan QR codes |
| DNS leak test shows Cloudflare | May appear due to shared or proxied infrastructure — verify with docker logs dnscrypt that you see [quad9-doh-ip4-port443-nofilter-ecs-pri] OK (DoH). If Quad9 is confirmed in logs, the test result is cosmetic |
| Pi-hole upstream shows Google | The environment variable didn’t take. Go to Pi-hole dashboard → Settings → DNS → untick Google → enter 10.8.1.4#5053 as Custom DNS → Save |
Stirling PDF — Self-Hosted PDF Toolkit
A browser-based PDF toolkit running on your Tailscale mesh. Merge, split, rotate, compress, convert, OCR, watermark, sign, add page numbers, extract images — 50+ operations. Files never leave your server. Replaces every online PDF tool (ILovePDF, SmallPDF, Adobe Acrobat) and the sketchy free ones.
Install
mkdir -p ~/stirling-pdf
nano ~/stirling-pdf/docker-compose.yml
services:
stirling-pdf:
image: stirlingtools/stirling-pdf:latest
container_name: stirling-pdf
volumes:
- stirling-data:/configs
- stirling-tessdata:/usr/share/tessdata
ports:
- "127.0.0.1:8484:8080"
- "YOUR_TAILSCALE_IP:8484:8080"
environment:
- SECURITY_ENABLELOGIN=false
restart: unless-stopped
volumes:
stirling-data:
stirling-tessdata:
Replace YOUR_TAILSCALE_IP with your Tailscale IP.
cd ~/stirling-pdf
docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8484
No login required — security is handled by Tailscale (only your devices can reach it). Files are processed in memory and deleted after download.
OCR Languages
English OCR works out of the box. To add other languages (e.g., Hindi, German):
docker exec stirling-pdf bash -c "cd /usr/share/tessdata && \
wget https://github.com/tesseract-ocr/tessdata/raw/main/hin.traineddata && \
wget https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata"
API Usage
Stirling PDF exposes a REST API for every operation. Useful for batch processing from scripts:
# Compress a PDF
curl -F 'fileInput=@syllabus.pdf' \
http://YOUR_TAILSCALE_IP:8484/api/v1/general/compress-pdf \
-o syllabus-compressed.pdf
# Merge two PDFs
curl -F 'fileInput=@part1.pdf' -F 'fileInput=@part2.pdf' \
http://YOUR_TAILSCALE_IP:8484/api/v1/general/merge-pdfs \
-o combined.pdf
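The same endpoint works for batch jobs — for example, compressing every PDF in a directory (endpoint path taken from the examples above; host and port depend on your setup):

```shell
# Batch-compress every PDF in the current directory via the Stirling API
for f in *.pdf; do
    curl -sf -F "fileInput=@${f}" \
        http://YOUR_TAILSCALE_IP:8484/api/v1/general/compress-pdf \
        -o "compressed-${f}" && echo "done: ${f}"
done
```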
Architecture Summary
Your Devices (laptop, phone, tablet)
│
├── WireGuard tunnel (UDP 51820) ──→ VPS ──→ Internet
│ Encrypted browsing, ad blocking, DNS privacy
│
└── Tailscale mesh ──→ VPS services (private access only)
│
├── :8888 Kiwix (60,000+ book library)
├── :8585 Gutenberg Search (catalog search + export)
├── :8484 Stirling PDF (PDF toolkit)
├── :8090 Miniflux (RSS reader)
└── (SSH tunnel only)
├── :51821 wg-easy admin
├── :80 Pi-hole dashboard
└── :8384 Syncthing dashboard
Hetzner VPS — 2 cores, 4 GB RAM, 40 GB SSD
├── Docker containers
│ wg-easy, pihole, dnscrypt, syncthing, miniflux,
│ miniflux-db, kiwix, gutenberg-search, stirling-pdf
├── Cron jobs
│ BorgBackup (3 AM), Telegram digest (6 AM),
│ health monitor (hourly + daily summary)
├── journald (volatile, 1-day retention)
├── fail2ban, unattended-upgrades
└── Tailscale daemon
Hetzner Storage Box — 1 TB, mounted at /mnt/storagebox via SSHFS
├── /kiwix/ 30 Gutenberg ZIM files (~200 GB)
├── /backups/ BorgBackup archives (encrypted)
└── (other files) Personal documents, Zotero, photos
Observability (reduced, not eliminated):
Your ISP sees: encrypted UDP packets to a German IP (not their contents)
Hetzner sees: encrypted DNS-over-HTTPS leaving the VPS (not query contents)
Quad9 sees: DNS queries without your ISP identity, but associated with your VPS IP
Websites see: a German Hetzner IP without prior association to your personal ISP identity
No single ordinary service provider holds the complete picture — but traffic analysis,
TLS metadata, and legal orders remain possible
Self-Hosted WireGuard vs Commercial VPN
| | Your WireGuard + Pi-hole | Commercial VPN (e.g., ProtonVPN) |
|---|---|---|
| Privacy from ISP | Full — ISP sees encrypted UDP to Germany | Full — ISP sees encrypted traffic to VPN server |
| Privacy from VPN provider | No provider — you control the server | Trust provider’s no-logs policy |
| Anonymity | None — VPS provider knows your identity, static IP is only yours | Low-Medium — account tied to email/payment, but shared IPs |
| Ad/tracker blocking | Full — Pi-hole blocks across all apps, custom blocklists | Partial — some offer DNS filtering but less configurable |
| DNS privacy | Full — Pi-hole → dnscrypt-proxy → Quad9, all encrypted, self-controlled | Provider handles DNS on their servers — you trust them |
| DNS encryption | Encrypted in transit to resolver — no plaintext DNS in typical operation, barring misconfiguration or fallback conditions | Encrypted within tunnel, but provider resolves on their end |
| Legal protection | Weak — VPS provider complies with court orders, all traffic is yours | Stronger — shared IPs, no-logs policies, privacy-friendly jurisdictions |
| Torrenting safety | Risky — static IP, host country copyright enforcement applies | Strong — shared IPs, dedicated P2P servers |
| Server locations | 1 (wherever your VPS is) | 60+ countries |
| Simultaneous devices | Unlimited | Plan-dependent (typically 5-10) |
| Logging | None — you control and disable all logging | None claimed — depends on provider’s policy and audits |
| Control | Full — you manage every component | None — provider makes all infrastructure decisions |
| Reliability | Single server — if VPS goes down, VPN is gone | Redundant infrastructure across hundreds of servers |
| Kill switch | Manual config required | Built into app |
| Cost | ₹0 additional (runs on existing VPS) | ₹300-800/month depending on provider and plan |
| Setup/maintenance | You manage updates, troubleshooting, Docker containers | Zero maintenance |
| Best used for | Daily browsing, ad blocking, DNS privacy, self-hosted infrastructure | Torrenting, geo-shifting, backup when VPS is down |
Recommendation: Run both. Use your self-hosted WireGuard as the default for daily use (stronger privacy, ad blocking, DNS control, zero cost). Switch to a commercial VPN for torrenting (shared IPs, legal protection) and geo-shifting (multiple countries).
What You Now Have
- All VPN-routed traffic encrypted — ISP sees only encrypted UDP to a German IP (subject to split tunneling)
- German exit IP — websites see a Hetzner IP, not your ISP
- Ads and trackers blocked — Pi-hole blocks often 20-40% of DNS queries across every app (varies by device and usage)
- DNS fully encrypted — Pi-hole → dnscrypt-proxy → Quad9 over DNS-over-HTTPS
- No publicly exposed web admin interfaces — admin panels closed to internet, accessible only via SSH tunnel
- Private mesh network — Tailscale connects all your devices to VPS services
- 1 TB remote storage — Hetzner Storage Box mounted as local filesystem
- 60,000+ book library — complete English-language Project Gutenberg via Kiwix
- Library search engine — advanced search with reading lists and Zotero/BibTeX export
- PDF toolkit — merge, split, compress, OCR, convert, sign — 50+ operations, self-hosted
- File sync across devices — Syncthing, no cloud storage needed
- 100 academic RSS feeds — Miniflux tracks STS, digital media, AI ethics, sociology
- Daily Telegram digest — Gemini Flash summarizes new articles as thematic analysis
- VPS health monitoring — hourly checks with Telegram alerts on issues
- Encrypted nightly backups — BorgBackup to Storage Box
- Minimal logging — volatile storage, 1-day retention, nothing on disk
- Automatic security updates — unattended-upgrades patches daily
- Total cost: ~$9–10/month — ~$5 VPS + ~$4 Storage Box
Part 3: Offline Library
What started as curiosity about Calibre-Web became something more ambitious: a self-hosted, searchable archive of the entire English-language Project Gutenberg catalog, accessible from any device on the Tailscale network. The library runs on two services — Kiwix for reading, and a custom search app for finding and exporting.
Kiwix — Reading Interface
Kiwix serves ZIM files (compressed, indexed web archives) through a browser. It was built to make Wikipedia available offline — it has since been deployed in refugee camps, schools across sub-Saharan Africa, and smuggling operations into North Korea. Here it serves 60,000+ public domain books.
Docker Setup
mkdir -p ~/kiwix
nano ~/kiwix/docker-compose.yml
services:
kiwix:
image: ghcr.io/kiwix/kiwix-serve
container_name: kiwix
volumes:
- /mnt/storagebox/kiwix:/data:ro
ports:
- "127.0.0.1:8888:8080"
- "YOUR_TAILSCALE_IP:8888:8080"
restart: unless-stopped
command: /data/*.zim
The *.zim glob serves every ZIM file in the directory. Adding files requires a restart to re-expand the glob.
cd ~/kiwix && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8888
The Collection
The complete English-language Gutenberg catalog, organized by Library of Congress Classification. 30 ZIM files, ~200 GB total, stored on the Storage Box.
| LCC | Subject | Size | LCC | Subject | Size |
|---|---|---|---|---|---|
| A | General Works | 9.1G | N | Fine Arts | 21G |
| B | Philosophy | 5.9G | P–PZ | Literature (all sub-codes) | ~60G |
| C | Aux. History | 1.2G | Q | Science | 16G |
| D | World History | 37G | R | Medicine | 1.8G |
| E | Americas History | 9.4G | S | Agriculture | 4.2G |
| F | Americas (Local) | 9.1G | T | Technology | 12G |
| G | Geography/Anthro | 7.5G | U | Military Science | 1.2G |
| H | Social Sciences | 4.2G | V | Naval Science | 1.2G |
| J | Political Science | 434M | Z | Bibliography | 2.5G |
| K | Law | 233M | | | |
| L | Education | 578M | | | |
| M | Music | 3.7G | | | |
All dated December 2025. Updates are infrequent — checking every two to three years is sufficient.
Downloading ZIM Files
Files are downloaded directly to the Storage Box from download.kiwix.org. For bulk downloads, use a script with wget -c (resume-capable) and nohup to survive SSH disconnects:
echo "y" | nohup bash ~/download_gutenberg_zims.sh > ~/gutenberg_download.log 2>&1 &
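The script itself is not reproduced here; its core is just a resume-capable download loop, roughly like this (filenames are hypothetical — check https://download.kiwix.org/zim/gutenberg/ for the current names before use):

```shell
# Sketch of the bulk-download loop in download_gutenberg_zims.sh
# (placeholder filenames — verify real ones on download.kiwix.org)
cd /mnt/storagebox/kiwix
for zim in gutenberg_en_category-a_YYYY-MM.zim gutenberg_en_category-b_YYYY-MM.zim; do
    wget -c "https://download.kiwix.org/zim/gutenberg/${zim}"
done
```

wget -c resumes partial files, so re-running the script after an interruption picks up where it left off.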
After downloading, restart Kiwix:
cd ~/kiwix && docker compose restart
Verify: ls -lh /mnt/storagebox/kiwix/gutenberg_*.zim | wc -l (should show 30).
Updating
The community kiwix-zim-updater script checks for newer versions and downloads only updated files:
git clone https://github.com/jojo2357/kiwix-zim-updater.git
./kiwix-zim-updater/kiwix-zim-updater.sh -d /mnt/storagebox/kiwix/
Note: no incremental updates exist for ZIM files. Each update is a full re-download of the changed file.
Gutenberg Search — Discovery Interface
A self-hosted search app that indexes Gutenberg’s catalog in SQLite FTS5 and serves a web UI. Solves a problem Kiwix doesn’t: searching across all 30 ZIM files by author, title, subject, LCC code, and language simultaneously.
What It Does
- Full-text search with BM25 relevance ranking
- Advanced search: author, title, subject, LCC, language — any combination
- “Read in Kiwix” links open books directly in the Kiwix instance
- EPUB download links for each book
- Export as RIS (for Zotero) or BibTeX (for LaTeX/Overleaf)
- Bulk export all results from a search in one click
- Named reading lists that persist across sessions
- Health check endpoint at /api/health
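For reference, the BibTeX export produces records along these lines. This sketch shows the shape of a minimal @book entry; the field names and key format are illustrative — the real exporters.py may emit more fields:

```python
def to_bibtex(book):
    """Render a catalog row as a minimal BibTeX @book entry.
    Illustrative only -- the actual exporters.py may differ."""
    key = "gutenberg" + str(book["id"])
    fields = {
        "title": book.get("title", ""),
        "author": book.get("author", ""),
        "url": "https://www.gutenberg.org/ebooks/" + str(book["id"]),
    }
    # One "  name = {value}" line per non-empty field
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@book{{{key},\n{body}\n}}"

print(to_bibtex({"id": 1342, "title": "Pride and Prejudice", "author": "Austen, Jane"}))
```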
Install
cd ~
tar -xzf gutenberg-search.tar.gz
cd gutenberg-search
docker compose up -d --build
First startup downloads the catalog (~14 MB) and builds the SQLite index (~1 minute). Check:
curl http://127.0.0.1:8585/api/health
# {"books_indexed":76645,"ready":true,"status":"healthy"}
Access: http://YOUR_TAILSCALE_IP:8585
File Structure
~/gutenberg-search/
├── Dockerfile # Python 3.12-slim + HEALTHCHECK
├── docker-compose.yml
├── requirements.txt # flask, gunicorn, flask-cors, flask-limiter
├── app.py # Routes, search, reading lists
├── exporters.py # RIS and BibTeX formatting
└── static/
└── index.html # Frontend
/data/ (Docker volume, persistent):
├── pg_catalog.csv # Cached catalog (auto-downloaded, refreshes after 30 days)
├── catalog.sqlite # FTS5 index (auto-built)
└── reading_lists.json # Saved reading lists
Refreshing the Catalog
docker exec gutenberg-search rm /data/pg_catalog.csv /data/catalog.sqlite
cd ~/gutenberg-search && docker compose restart
Note on Book Counts
The search indexes the full multilingual Gutenberg catalog (~76,000 items). The Kiwix ZIM files contain only English-language texts (~60,000). Some search results may not have corresponding books in Kiwix.
Health Monitoring
Both services are checked by the existing vps-health-monitor.py. Three additions were made:
- Docker stacks — kiwix, gutenberg-search, and stirling-pdf containers added to DOCKER_STACKS
- HTTP health check — queries /api/health on the search app
- Checks list — ("Gutenberg", check_gutenberg_search) added to main()
The daily Telegram summary now includes:
kiwix: running
gutenberg-search: running
stirling-pdf: running
Gutenberg Search: healthy (76645 books)
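The Gutenberg Search status line could be produced by a check along these lines — a sketch only; the function name comes from the checks list above, but the actual vps-health-monitor.py may be structured differently:

```python
import json
import urllib.request

def check_gutenberg_search(url="http://127.0.0.1:8585/api/health", timeout=10):
    """Query the search app's /api/health endpoint; return a one-line status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return format_health(json.loads(resp.read().decode()))
    except Exception as exc:
        return f"Gutenberg Search: DOWN ({exc})"

def format_health(data):
    """Turn the health JSON into the summary line shown above."""
    if data.get("status") == "healthy" and data.get("ready"):
        return f"Gutenberg Search: healthy ({data.get('books_indexed', '?')} books)"
    return f"Gutenberg Search: unhealthy ({data})"
```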
Part 4: Miniflux RSS Reader
A self-hosted RSS reader running alongside your existing Docker Compose VPN stack (WireGuard, Pi-hole, dnscrypt, Syncthing). Miniflux runs as a separate Docker Compose stack in ~/miniflux.
Storage Impact
| Component | Size | Notes |
|---|---|---|
| Docker images (miniflux + postgres) | ~100 MB | One-time |
| PostgreSQL database (year 1) | 200–400 MB | With cleanup policy below |
| Realistic total | ~300–500 MB | After a full year of 100 feeds |
The database grows at roughly 1–2 MB/day with 100 feeds. Miniflux’s built-in cleanup keeps it bounded. Negligible on a CX22 with 40 GB.
Step 1: Create the Miniflux directory
mkdir -p ~/miniflux
cd ~/miniflux
Step 2: Generate a strong database password
Use hex encoding to avoid special characters that break the PostgreSQL connection URL:
openssl rand -hex 24
Copy the output. Do NOT use openssl rand -base64 — characters like / and + cause URL parsing errors in the DATABASE_URL.
Step 3: Create docker-compose.yml
nano ~/miniflux/docker-compose.yml
Paste this (replace YOUR_DB_PASSWORD with the password from step 2):
services:
miniflux:
image: miniflux/miniflux:latest
container_name: miniflux
restart: unless-stopped
depends_on:
db:
condition: service_healthy
ports:
- "127.0.0.1:8090:8080"
- "YOUR_TAILSCALE_IP:8090:8080"
environment:
- DATABASE_URL=postgres://miniflux:YOUR_DB_PASSWORD@db/miniflux?sslmode=disable
- RUN_MIGRATIONS=1
- CREATE_ADMIN=1
- ADMIN_USERNAME=YOUR_USERNAME
- ADMIN_PASSWORD=PICK_A_STRONG_PASSWORD
- CLEANUP_ARCHIVE_READ_DAYS=120
- CLEANUP_ARCHIVE_UNREAD_DAYS=140
- POLLING_FREQUENCY=60
- BATCH_SIZE=25
- POLLING_PARSING_ERROR_LIMIT=0
- METRICS_COLLECTOR=false
db:
image: postgres:16-alpine
container_name: miniflux-db
restart: unless-stopped
environment:
- POSTGRES_USER=miniflux
- POSTGRES_PASSWORD=YOUR_DB_PASSWORD
- POSTGRES_DB=miniflux
volumes:
- miniflux-db:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "miniflux"]
interval: 10s
start_period: 30s
volumes:
miniflux-db:
Before saving, replace:
- YOUR_DB_PASSWORD (appears twice — in DATABASE_URL and POSTGRES_PASSWORD) → the hex password from step 2. Both values must be identical.
- PICK_A_STRONG_PASSWORD → your Miniflux login password
What the settings do
| Setting | Value | Meaning |
|---|---|---|
| 127.0.0.1:8090:8080 + YOUR_TAILSCALE_IP:8090:8080 | Binds to localhost and Tailscale | Accessible via SSH tunnel or Tailscale mesh |
| CLEANUP_ARCHIVE_READ_DAYS=120 | 120 | Read articles deleted after 120 days |
| CLEANUP_ARCHIVE_UNREAD_DAYS=140 | 140 | Unread articles deleted after ~5 months |
| POLLING_FREQUENCY=60 | 60 min | Checks each feed every 60 minutes |
| BATCH_SIZE=25 | 25 | Checks 25 feeds per polling cycle |
| POLLING_PARSING_ERROR_LIMIT=0 | 0 | Never stops checking a feed after errors |
Step 4: Start Miniflux
cd ~/miniflux
docker compose up -d
Wait ~30 seconds for PostgreSQL to initialize, then check:
docker compose ps
Both miniflux and miniflux-db should show Up (healthy).
Secure the Compose file (it contains your database password and admin credentials):
chmod 600 ~/miniflux/docker-compose.yml
If miniflux shows Restarting, check logs:
docker logs miniflux
Common issues:
- “password authentication failed” → the password doesn’t match between the two services
- “invalid port … after host” → your password contains special characters. Regenerate with openssl rand -hex 24, then: docker compose down, docker volume rm miniflux_miniflux-db, edit the password, docker compose up -d
- “role does not exist” → the db container hasn’t finished initializing. Wait 30 seconds.
Step 5: Access Miniflux via SSH tunnel
From your Mac terminal (not the VPS SSH session):
ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N
Leave that running. Open your browser to http://localhost:8090.
Log in with the username and password you configured.
Step 6: Disable auto-admin creation
After first login, edit the compose file:
nano ~/miniflux/docker-compose.yml
Change:
- CREATE_ADMIN=1
- ADMIN_USERNAME=YOUR_USERNAME
- ADMIN_PASSWORD=PICK_A_STRONG_PASSWORD
To:
- CREATE_ADMIN=0
Then: docker compose down && docker compose up -d
Step 7: Import feeds via OPML
The feed list is provided as a separate OPML file (feeds.opml) with 97 feeds across 11 categories. With the 3 Google Scholar alerts from Step 8, the total is 100 feeds.
- Transfer the OPML to your Mac: scp YOUR_USER@YOUR_VPS_IP:~/miniflux/feeds.opml ~/Downloads/feeds.opml
- Open Miniflux at http://localhost:8090
- Go to Settings → Import → upload the OPML file
All 97 feeds import with their categories intact.
Step 8: Add Google Scholar alerts
Set up 3 alerts separately:
- Go to https://scholar.google.com/scholar_alerts
- Create alerts (your name, key research terms, co-authors)
- In each alert’s settings, choose RSS feed (not email)
- Copy the feed URL → add in Miniflux via Feeds → Add Subscription
Step 9: Generate a Miniflux API key
Needed by the Telegram automation scripts.
- In Miniflux: Settings → API Keys
- Click Create a new API key, name it “scripts”
- Copy the key (shown only once)
Step 10: Add to BorgBackup
nano ~/backup.sh
Add a PostgreSQL dump before the borg create command:
docker exec miniflux-db pg_dump -U miniflux miniflux > ~/miniflux/db-backup.sql
Add ~/miniflux to the list of backed-up paths.
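Put together, the relevant fragment of backup.sh looks roughly like this (a sketch — merge into your existing borg create invocation; BORG_REPO and BORG_PASSPHRASE are assumed to be exported earlier in the script):

```shell
# Dump the Miniflux database to a plain SQL file inside ~/miniflux,
# so the nightly archive captures a restorable copy.
docker exec miniflux-db pg_dump -U miniflux miniflux > ~/miniflux/db-backup.sql

# Archive name matches the vps-YYYY-MM-DD-HHMM pattern used elsewhere in this guide
borg create --stats \
    ::vps-{now:%Y-%m-%d-%H%M} \
    ~/miniflux \
    ~/vpn   # ...plus the other paths already in your backup list
```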
Daily Workflow
Accessing Miniflux
ssh -L 8090:127.0.0.1:8090 YOUR_USER@YOUR_VPS_IP -N
Then open http://localhost:8090.
Keyboard shortcuts
| Key | Action |
|---|---|
| g u | Go to unread |
| g b | Go to bookmarks |
| j / k | Next / previous |
| v | Open original in new tab |
| d | Mark as read/unread |
| s | Star / bookmark |
| Shift+A | Mark all as read |
| f | Toggle full content fetch |
Morning routine
- Check your Telegram digest (arrives at 6:00 AM UTC — adjust to your timezone)
- Open Miniflux to read anything that caught your interest
- g u → scan unread, s to star items
- Shift+A → mark all read
- g b → read starred items
Automation
Two Telegram bot scripts run from ~/miniflux/scripts/:
- Daily digest — Gemini Flash summarizes new articles as thematic analysis, sent to Telegram every morning
- Health monitor — checks containers, disk, memory, load, backup, SSH failures; alerts on issues hourly, sends daily summary
See the Telegram Automation Setup Guide for configuration.
Maintenance
Updating Miniflux
cd ~/miniflux
docker compose pull
docker compose down
docker compose up -d
Checking disk usage
docker system df
docker exec miniflux-db psql -U miniflux -c "SELECT pg_size_pretty(pg_database_size('miniflux'));"
Adjusting cleanup
Edit docker-compose.yml to tighten retention, then restart:
- CLEANUP_ARCHIVE_READ_DAYS=60 # was 120
- CLEANUP_ARCHIVE_UNREAD_DAYS=90 # was 140
If a feed breaks
Check Feeds view for error counts. Common fixes:
- 403: set custom user agent in feed settings → Mozilla/5.0 (compatible; Miniflux)
- 404: URL changed, find current RSS link on journal’s site
- Parse errors: try atom variant instead of rss2 or vice versa
Full VPS Service Map
| Service | Stack | Container | Port | Access |
|---|---|---|---|---|
| WireGuard VPN | ~/vpn | wg-easy | 51820/UDP (public) | WireGuard client |
| Pi-hole | ~/vpn | pihole | localhost:8080 | SSH tunnel |
| DNS encryption | ~/vpn | dnscrypt | internal | Via Pi-hole |
| Syncthing | ~/vpn | syncthing | 10.8.1.5:8384 | SSH tunnel to Docker bridge IP |
| Kiwix | ~/kiwix | kiwix | YOUR_TAILSCALE_IP:8888 | Tailscale |
| Gutenberg Search | ~/gutenberg-search | gutenberg-search | YOUR_TAILSCALE_IP:8585 | Tailscale |
| Stirling PDF | ~/stirling-pdf | stirling-pdf | YOUR_TAILSCALE_IP:8484 | Tailscale |
| Miniflux | ~/miniflux | miniflux | YOUR_TAILSCALE_IP:8090 | Tailscale / SSH tunnel |
| PostgreSQL | ~/miniflux | miniflux-db | internal | Via Miniflux |
Cron jobs
| Job | Schedule | Description |
|---|---|---|
| BorgBackup | 3:00 AM UTC | Nightly backup to Hetzner Storage Box |
| RSS digest | 6:00 AM UTC | Telegram thematic digest |
| Health check | Every hour | Alerts only on problems |
| Health summary | 7:00 AM UTC | Daily all-clear report |
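As crontab entries, the schedule above looks roughly like this. The digest line is taken from the script's own docstring; the health-monitor invocations and the --summary flag are assumptions — check the Telegram Automation Setup Guide for the exact commands:

```shell
# 3 AM UTC: nightly BorgBackup
0 3 * * * ~/backup.sh >> ~/backup.log 2>&1
# 6 AM UTC: Telegram thematic digest (from the script docstring)
0 6 * * * cd ~/miniflux && ./scripts/run.sh miniflux-telegram-digest.py >> ~/miniflux/scripts/digest.log 2>&1
# Hourly: health check, alerts only on problems (invocation assumed)
0 * * * * cd ~/miniflux && ./scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
# 7 AM UTC: daily all-clear summary (--summary flag is a guess)
0 7 * * * cd ~/miniflux && ./scripts/run.sh vps-health-monitor.py --summary >> ~/miniflux/scripts/health.log 2>&1
```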
Part 5: Telegram Automation
Two scripts that use your Miniflux RSS reader and a Telegram bot to keep you informed:
- Daily Digest (miniflux-telegram-digest.py) — thematic analysis of new articles via Gemini Flash
- VPS Health Monitor (vps-health-monitor.py) — alerts on infrastructure issues
Both use only outbound connections. No ports opened, no domains needed, no new Docker containers.
Architecture
┌───────────┐
outbound │ Gemini │
┌────────────────►│ Flash │
┌─────────────┐ localhost │ │ (free) │
│ Miniflux │◄─────────┐ │ └───────────┘
│ (Docker) │ │ │
│ 100 feeds │ ┌────┴──┴───────┐ outbound ┌───────────┐
└─────────────┘ │ Cron scripts │────────────►│ Telegram │
│ │ │ Bot API │
┌─────────────┐ │ - digest.py │ └─────┬─────┘
│ System │◄────│ - health.py │ │
│ (disk/mem/ │ └───────────────┘ ┌─────▼─────┐
│ docker) │ │ Your │
└─────────────┘ │ iPhone │
└───────────┘
Prerequisites
- Miniflux running with feeds imported (see Miniflux Setup Guide)
- Miniflux API key (Settings → API Keys)
- Telegram account
Step 1: Create the Telegram Bot
- Open Telegram → search for @BotFather → start a chat
- Send /newbot
- Choose a display name (e.g., “VPS Bot”) and username (must end in bot)
- BotFather replies with your bot token — copy it
Step 2: Get Your Chat ID
- Open a chat with your new bot and send it any message (e.g., “hi”) — this is required before the bot can message you
- Either:
- Search for @userinfobot in Telegram and message it — it replies with your ID
  - Or open https://api.telegram.org/bot<YOUR_TOKEN>/getUpdates in a browser and find "chat":{"id":123456789}
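With token and chat ID in hand, verify the pair works before writing any scripts — a standard Bot API sendMessage call:

```shell
# Send yourself a test message — replace <YOUR_TOKEN> and <YOUR_CHAT_ID>
curl -s "https://api.telegram.org/bot<YOUR_TOKEN>/sendMessage" \
    -d "chat_id=<YOUR_CHAT_ID>" \
    -d "text=hello from the VPS"
```

If the message arrives on your phone, both scripts will be able to reach you.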
Step 3: Get Your Gemini API Key
- Go to https://aistudio.google.com/apikey
- Click Create API Key
- Copy it
Free tier limits change periodically — check current quotas at ai.google.dev. You’ll use 1 request per day, well within any reasonable free tier.
Step 4: Create the Scripts
On your VPS, create the scripts directory and both scripts:
mkdir -p ~/miniflux/scripts
Daily Digest Script
nano ~/miniflux/scripts/miniflux-telegram-digest.py
Paste the full script:
#!/usr/bin/env python3
"""
miniflux-telegram-digest.py
Daily digest: pulls new Miniflux entries, summarizes via LLM, sends via Telegram bot.
Usage: python3 miniflux-telegram-digest.py
Cron: 0 6 * * * cd ~/miniflux && ./scripts/run.sh miniflux-telegram-digest.py
Environment variables (set in ~/miniflux/scripts/.env):
MINIFLUX_URL - default http://127.0.0.1:8090
MINIFLUX_API_KEY - required
LLM_PROVIDER - "claude" | "gemini" | "openai" (default: gemini)
GEMINI_API_KEY - if using gemini
ANTHROPIC_API_KEY - if using claude
OPENAI_API_KEY - if using openai
DIGEST_DAYS_BACK - how far back to look (default: 1)
DIGEST_MAX_ENTRIES - max entries to summarize (default: 80)
TELEGRAM_BOT_TOKEN - from @BotFather
TELEGRAM_CHAT_ID - your personal chat ID
"""
import json
import os
import sys
import re
import urllib.request
import urllib.error
from datetime import datetime, timezone, timedelta
from collections import defaultdict
# ── Configuration ────────────────────────────────────────────────────────────
MINIFLUX_URL = os.environ.get("MINIFLUX_URL", "http://127.0.0.1:8090")
MINIFLUX_API_KEY = os.environ.get("MINIFLUX_API_KEY", "")
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "gemini")
DAYS_BACK = int(os.environ.get("DIGEST_DAYS_BACK", "1"))
MAX_ENTRIES = int(os.environ.get("DIGEST_MAX_ENTRIES", "80"))
TELEGRAM_BOT_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN", "")
TELEGRAM_CHAT_ID = os.environ.get("TELEGRAM_CHAT_ID", "")
STATE_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), ".digest-state.json")
TG_MAX_LEN = 4000
TARGET_LEN = 10000
# ─────────────────────────────────────────────────────────────────────────────
# ── Miniflux API ─────────────────────────────────────────────────────────────
def miniflux_api(endpoint):
url = f"{MINIFLUX_URL}/v1{endpoint}"
req = urllib.request.Request(url)
req.add_header("X-Auth-Token", MINIFLUX_API_KEY)
with urllib.request.urlopen(req, timeout=30) as resp:
return json.loads(resp.read().decode())
def get_new_entries():
cutoff = datetime.now(timezone.utc) - timedelta(days=DAYS_BACK)
cutoff_unix = int(cutoff.timestamp())
entries = []
offset = 0
limit = 100
while len(entries) < MAX_ENTRIES:
data = miniflux_api(
f"/entries?order=published_at&direction=desc"
f"&limit={limit}&offset={offset}"
f"&after={cutoff_unix}"
)
batch = data.get("entries", [])
if not batch:
break
entries.extend(batch)
offset += limit
if len(batch) < limit:
break
return entries[:MAX_ENTRIES]
def get_categories():
return {c["id"]: c["title"] for c in miniflux_api("/categories")}
def get_feeds():
return {f["id"]: {"title": f["title"], "category_id": f["category"]["id"]}
for f in miniflux_api("/feeds")}
# ── State Management ─────────────────────────────────────────────────────────
def load_state():
if os.path.exists(STATE_FILE):
with open(STATE_FILE) as f:
return json.load(f)
return {"last_entry_ids": []}
def save_state(entry_ids):
with open(STATE_FILE, "w") as f:
json.dump({"last_entry_ids": entry_ids}, f)
def filter_new(entries, state):
seen = set(state.get("last_entry_ids", []))
return [e for e in entries if e["id"] not in seen]
# ── LLM Providers ───────────────────────────────────────────────────────────
def strip_html(html):
text = re.sub(r'<[^>]+>', ' ', html or '')
text = re.sub(r'\s+', ' ', text).strip()
return text[:1500]
def build_prompt(entries_by_category, total_count):
today = datetime.now(timezone.utc).strftime("%A, %B %d, %Y")
lines = [
f"You are a research assistant for a scholar specializing in Science and Technology Studies (STS), "
f"digital infrastructure, algorithms, and emerging technologies. "
f"Today is {today}. Below are {total_count} new articles from academic RSS feeds, grouped by category.",
"",
"Write a THEMATIC ANALYSIS of what's happening in these feeds — not a list of articles. "
"Structure your analysis as follows:",
"",
"OPENING (2-3 sentences): What are the dominant themes or threads across today's articles? "
"What would an STS scholar find most interesting?",
"",
"THEMATIC SECTIONS (3-5 sections): Identify the key themes or conversations emerging "
"across the articles. Each section should:",
" - Have a descriptive thematic header (e.g., 'Algorithmic governance under scrutiny' "
" not 'STS Journals')",
" - Synthesize what 2-5 articles collectively tell us about that theme",
" - Name specific articles and their sources in parentheses when referencing them",
" - IMPORTANT: Immediately after mentioning each article, paste its full URL on the next line. "
"The URL is provided in the data below for each article. This is critical — the reader needs clickable links.",
" - Explain why this matters or what's at stake — connect to broader STS debates",
" - Be 3-5 sentences long",
"",
"QUICK MENTIONS (end): Briefly note any remaining articles that don't fit the themes "
"above — just title and source, 1 line each.",
"",
"STYLE RULES:",
"- Tone: an informed colleague who reads widely and thinks critically — not a news ticker",
"- Favor analysis over description: 'These three papers converge on...' not 'This paper is about...'",
"- Make connections between articles in different categories when relevant",
"- Plain text only — no markdown, no HTML, no asterisks for bold/italic",
"- Use line breaks and blank lines between sections for readability",
f"- Target length: {TARGET_LEN} characters (roughly 1500-2000 words). Use the space.",
"- Include URLs for articles you discuss in the thematic sections, but NOT for quick mentions",
"- Format each referenced article as: title (source) followed by its URL on the next line",
"",
"---",
"",
]
for cat_name, entries in entries_by_category.items():
if not entries:
continue
lines.append(f"### Category: {cat_name}")
for e in entries:
title = e.get("title", "Untitled")
url = e.get("url", "")
feed = e.get("_feed_title", "")
content = strip_html(e.get("content", ""))
lines.append(f"\nTitle: {title}")
lines.append(f"Source: {feed}")
lines.append(f"URL: {url}")
if content:
lines.append(f"Content excerpt: {content[:800]}")
lines.append("")
return "\n".join(lines)
def llm_claude(prompt):
api_key = os.environ.get("ANTHROPIC_API_KEY", "")
if not api_key:
raise RuntimeError("ANTHROPIC_API_KEY not set")
body = json.dumps({
"model": "claude-haiku-4-5-20251001",
"max_tokens": 4096,
"messages": [{"role": "user", "content": prompt}]
}).encode()
req = urllib.request.Request(
"https://api.anthropic.com/v1/messages",
data=body,
headers={
"Content-Type": "application/json",
"x-api-key": api_key,
"anthropic-version": "2023-06-01",
},
method="POST"
)
with urllib.request.urlopen(req, timeout=120) as resp:
data = json.loads(resp.read().decode())
return data["content"][0]["text"]
def llm_gemini(prompt):
api_key = os.environ.get("GEMINI_API_KEY", "")
if not api_key:
raise RuntimeError("GEMINI_API_KEY not set")
body = json.dumps({
"contents": [{"parts": [{"text": prompt}]}],
"generationConfig": {"maxOutputTokens": 4096}
}).encode()
url = (
f"https://generativelanguage.googleapis.com/v1beta/models/"
f"gemini-2.5-flash:generateContent?key={api_key}"
)
req = urllib.request.Request(
url, data=body,
headers={"Content-Type": "application/json"},
method="POST"
)
with urllib.request.urlopen(req, timeout=120) as resp:
data = json.loads(resp.read().decode())
return data["candidates"][0]["content"]["parts"][0]["text"]
def llm_openai(prompt):
api_key = os.environ.get("OPENAI_API_KEY", "")
if not api_key:
raise RuntimeError("OPENAI_API_KEY not set")
body = json.dumps({
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": prompt}],
"max_tokens": 4096,
}).encode()
req = urllib.request.Request(
"https://api.openai.com/v1/chat/completions",
data=body,
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}",
},
method="POST"
)
with urllib.request.urlopen(req, timeout=120) as resp:
data = json.loads(resp.read().decode())
return data["choices"][0]["message"]["content"]
LLM_DISPATCH = {
"claude": llm_claude,
"gemini": llm_gemini,
"openai": llm_openai,
}
# ── Telegram ─────────────────────────────────────────────────────────────────
def send_telegram(text):
"""Send message via Telegram Bot API. Splits if over 4096 chars."""
chunks = []
while len(text) > TG_MAX_LEN:
split_at = text.rfind("\n", 0, TG_MAX_LEN)
if split_at == -1:
split_at = TG_MAX_LEN
chunks.append(text[:split_at])
text = text[split_at:].lstrip("\n")
chunks.append(text)
for i, chunk in enumerate(chunks):
body = json.dumps({
"chat_id": TELEGRAM_CHAT_ID,
"text": chunk,
"disable_web_page_preview": True,
}).encode()
url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
req = urllib.request.Request(
url, data=body,
headers={"Content-Type": "application/json"},
method="POST"
)
with urllib.request.urlopen(req, timeout=30) as resp:
result = json.loads(resp.read().decode())
if not result.get("ok"):
raise RuntimeError(f"Telegram API error: {result}")
print(f"Sent message {i+1}/{len(chunks)} ({len(chunk)} chars)")
def send_error_notification(error_msg):
"""Send failure alert via Telegram."""
try:
text = (
f"⚠️ Digest failed\n\n"
f"Error: {error_msg}\n\n"
f"Check: tail -50 ~/miniflux/scripts/digest.log"
)
send_telegram(text)
except Exception:
pass
# ── Main ─────────────────────────────────────────────────────────────────────
def main():
if not MINIFLUX_API_KEY:
print("Error: Set MINIFLUX_API_KEY", file=sys.stderr)
sys.exit(1)
if not TELEGRAM_BOT_TOKEN or not TELEGRAM_CHAT_ID:
print("Error: Set TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID", file=sys.stderr)
sys.exit(1)
if LLM_PROVIDER not in LLM_DISPATCH:
print(f"Error: LLM_PROVIDER must be one of: {', '.join(LLM_DISPATCH.keys())}",
file=sys.stderr)
sys.exit(1)
try:
state = load_state()
print(f"Fetching entries from last {DAYS_BACK} day(s)...")
categories = get_categories()
feeds = get_feeds()
entries = get_new_entries()
print(f"Fetched {len(entries)} entries total")
new_entries = filter_new(entries, state)
if not new_entries:
print("No new entries since last run. Skipping.")
sys.exit(0)
print(f"{len(new_entries)} new entries to process")
# Group by category
entries_by_category = defaultdict(list)
for entry in new_entries:
feed_info = feeds.get(entry.get("feed_id"), {})
cat_id = feed_info.get("category_id", 0)
cat_name = categories.get(cat_id, "Uncategorized")
entry["_feed_title"] = feed_info.get("title", "Unknown")
entries_by_category[cat_name].append(entry)
# Build prompt and call LLM
prompt = build_prompt(entries_by_category, len(new_entries))
print(f"Prompt: {len(prompt)} chars, calling {LLM_PROVIDER}...")
llm_fn = LLM_DISPATCH[LLM_PROVIDER]
summary = llm_fn(prompt)
print(f"Got {len(summary)} char summary")
# Add footer
footer = (
f"\n\n—\n"
f"{len(new_entries)} articles · {len(entries_by_category)} categories · "
f"{LLM_PROVIDER} summary"
)
full_message = summary + footer
# Send via Telegram
send_telegram(full_message)
# Save state
all_ids = [e["id"] for e in entries]
save_state(all_ids)
print("State saved. Done.")
except Exception as e:
print(f"FATAL: {e}", file=sys.stderr)
import traceback
traceback.print_exc(file=sys.stderr)
send_error_notification(str(e))
sys.exit(1)
if __name__ == "__main__":
main()
Health Monitor Script
nano ~/miniflux/scripts/vps-health-monitor.py
Paste the full script:
#!/usr/bin/env python3
"""
vps-health-monitor.py
Checks VPS health and sends Telegram alerts on issues.
Runs via cron every hour. Only messages you when something is wrong,
plus an optional daily summary.
Usage: python3 vps-health-monitor.py # alert-only mode
python3 vps-health-monitor.py --daily # daily summary
Environment variables (from ~/miniflux/scripts/.env):
TELEGRAM_BOT_TOKEN - required
TELEGRAM_CHAT_ID - required
Cron:
0 * * * * ~/miniflux/scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
0 7 * * * ~/miniflux/scripts/run.sh vps-health-monitor.py --daily >> ~/miniflux/scripts/health.log 2>&1
"""
import json
import os
import sys
import subprocess
import urllib.request
from datetime import datetime, timezone, timedelta
from pathlib import Path
TELEGRAM_BOT_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN", "")
TELEGRAM_CHAT_ID = os.environ.get("TELEGRAM_CHAT_ID", "")
# ── Thresholds ───────────────────────────────────────────────────────────────
DISK_WARN_PERCENT = 80
DISK_CRIT_PERCENT = 90
MEMORY_WARN_PERCENT = 85
LOAD_WARN_MULTIPLIER = 2.0
BACKUP_MAX_AGE_HOURS = 36
# ─────────────────────────────────────────────────────────────────────────────
# Docker Compose stacks to check: (name, path, expected containers)
DOCKER_STACKS = [
("VPN stack", "~/vpn", ["wg-easy", "pihole", "dnscrypt", "syncthing"]),
("Miniflux stack", "~/miniflux", ["miniflux", "miniflux-db"]),
("Kiwix", "~/kiwix", ["kiwix"]),
("Gutenberg Search", "~/gutenberg-search", ["gutenberg-search"]),
("Stirling PDF", "~/stirling-pdf", ["stirling-pdf"]),
]
def run(cmd, timeout=10):
"""Run shell command, return stdout or None on failure."""
try:
result = subprocess.run(
cmd, shell=True, capture_output=True, text=True, timeout=timeout
)
return result.stdout.strip()
except Exception:
return None
def check_docker_containers():
"""Check that expected Docker containers are running."""
issues = []
info = []
running = run("docker ps --format '{{.Names}}'")
if running is None:
return ["Could not query Docker — is the daemon running?"], []
running_set = set(running.split("\n")) if running else set()
for stack_name, stack_path, expected in DOCKER_STACKS:
for container in expected:
if container in running_set:
info.append(f"{container}: running")
else:
issues.append(f"{container} ({stack_name}): NOT RUNNING")
return issues, info
def check_disk():
"""Check disk usage."""
issues = []
info = []
output = run("df -h / --output=pcent,size,used,avail | tail -1")
if not output:
return ["Could not check disk usage"], []
parts = output.split()
percent = int(parts[0].replace("%", ""))
size_output = run("df -h / --output=size,used,avail | tail -1")
size_parts = size_output.split() if size_output else ["?", "?", "?"]
info.append(f"Disk: {percent}% used ({size_parts[1]}B / {size_parts[0]}B, {size_parts[2]}B free)")
if percent >= DISK_CRIT_PERCENT:
issues.append(f"CRITICAL: Disk at {percent}% — only {size_parts[2]}B free")
elif percent >= DISK_WARN_PERCENT:
issues.append(f"WARNING: Disk at {percent}% — {size_parts[2]}B free")
return issues, info
def check_memory():
"""Check RAM usage."""
issues = []
info = []
output = run("free -m | grep Mem")
if not output:
return ["Could not check memory"], []
parts = output.split()
total = int(parts[1])
used = int(parts[2])
available = int(parts[6])
percent = round((used / total) * 100)
info.append(f"Memory: {percent}% used ({used}MB / {total}MB, {available}MB available)")
if percent >= MEMORY_WARN_PERCENT:
issues.append(f"WARNING: Memory at {percent}% — {available}MB available")
return issues, info
def check_load():
"""Check system load average."""
issues = []
info = []
load_str = run("cat /proc/loadavg")
cpu_str = run("nproc")
if not load_str or not cpu_str:
return ["Could not check load"], []
load_1, load_5, load_15 = [float(x) for x in load_str.split()[:3]]
cpus = int(cpu_str)
info.append(f"Load: {load_1:.1f} / {load_5:.1f} / {load_15:.1f} (1/5/15 min, {cpus} cores)")
if load_5 > cpus * LOAD_WARN_MULTIPLIER:
issues.append(f"WARNING: Load average {load_5:.1f} exceeds {cpus * LOAD_WARN_MULTIPLIER:.0f} (5 min)")
return issues, info
def check_backup():
"""Check BorgBackup recency via backup log modification time."""
issues = []
info = []
log_path = os.path.expanduser("~/backup.log")
if os.path.exists(log_path):
stat = os.stat(log_path)
mtime = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
age = datetime.now(timezone.utc) - mtime
hours_ago = age.total_seconds() / 3600
info.append(f"Backup log last modified: {hours_ago:.0f}h ago ({mtime.strftime('%b %d %H:%M UTC')})")
if hours_ago > BACKUP_MAX_AGE_HOURS:
issues.append(f"WARNING: Last backup log update was {hours_ago:.0f}h ago (threshold: {BACKUP_MAX_AGE_HOURS}h)")
else:
info.append("Backup: no backup.log found — has the backup script ever run?")
return issues, info
def check_ssh_failures():
"""Check for recent SSH brute force attempts."""
issues = []
info = []
count_str = run("journalctl -u ssh --since '24 hours ago' 2>/dev/null | grep -c 'Failed password' || echo 0")
if count_str and count_str.isdigit():
count = int(count_str)
info.append(f"Failed SSH logins (24h): {count}")
if count > 100:
issues.append(f"WARNING: {count} failed SSH attempts in 24h — check fail2ban")
else:
count_str = run("grep -c 'Failed password' /var/log/auth.log 2>/dev/null || echo 0")
if count_str and count_str.isdigit():
info.append(f"Failed SSH logins (auth.log): {count_str}")
return issues, info
def check_uptime():
"""Get system uptime."""
output = run("uptime -p")
return [], [f"Uptime: {output}"] if output else []
def send_telegram(text):
"""Send message via Telegram Bot API."""
TG_MAX = 4000
chunks = []
while len(text) > TG_MAX:
split_at = text.rfind("\n", 0, TG_MAX)
if split_at == -1:
split_at = TG_MAX
chunks.append(text[:split_at])
text = text[split_at:].lstrip("\n")
chunks.append(text)
for chunk in chunks:
body = json.dumps({
"chat_id": TELEGRAM_CHAT_ID,
"text": chunk,
"disable_web_page_preview": True,
}).encode()
url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
req = urllib.request.Request(
url, data=body,
headers={"Content-Type": "application/json"},
method="POST"
)
with urllib.request.urlopen(req, timeout=30) as resp:
result = json.loads(resp.read().decode())
if not result.get("ok"):
raise RuntimeError(f"Telegram error: {result}")
def main():
if not TELEGRAM_BOT_TOKEN or not TELEGRAM_CHAT_ID:
print("Error: Set TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID", file=sys.stderr)
sys.exit(1)
daily_mode = "--daily" in sys.argv
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
all_issues = []
all_info = []
checks = [
("Docker", check_docker_containers),
("Disk", check_disk),
("Memory", check_memory),
("Load", check_load),
("Backup", check_backup),
("SSH", check_ssh_failures),
("Uptime", check_uptime),
]
for name, check_fn in checks:
try:
issues, info = check_fn()
all_issues.extend(issues)
all_info.extend(info)
except Exception as e:
all_issues.append(f"{name} check failed: {e}")
has_issues = len(all_issues) > 0
if has_issues:
lines = [f"🚨 VPS Health Alert — {now}", ""]
for issue in all_issues:
lines.append(f" {issue}")
lines.append("")
lines.append("Full status:")
for info_line in all_info:
lines.append(f" {info_line}")
send_telegram("\n".join(lines))
print(f"[{now}] ALERT sent: {len(all_issues)} issue(s)")
elif daily_mode:
lines = [f"✅ VPS Health — {now}", ""]
for info_line in all_info:
lines.append(f" {info_line}")
send_telegram("\n".join(lines))
print(f"[{now}] Daily summary sent: all clear")
else:
print(f"[{now}] OK — no issues")
if __name__ == "__main__":
main()
Make both scripts executable:
chmod +x ~/miniflux/scripts/miniflux-telegram-digest.py
chmod +x ~/miniflux/scripts/vps-health-monitor.py
Step 5: Create the Environment File
On your VPS:
mkdir -p ~/miniflux/scripts
nano ~/miniflux/scripts/.env
# Miniflux
MINIFLUX_URL=http://127.0.0.1:8090
MINIFLUX_API_KEY=your-miniflux-api-key
# LLM
LLM_PROVIDER=gemini
GEMINI_API_KEY=your-gemini-api-key
# Telegram
TELEGRAM_BOT_TOKEN=your-bot-token
TELEGRAM_CHAT_ID=your-chat-id
# Digest settings
DIGEST_DAYS_BACK=1
DIGEST_MAX_ENTRIES=80
Lock it down:
chmod 600 ~/miniflux/scripts/.env
Step 6: Create the Wrapper Script
cat > ~/miniflux/scripts/run.sh << 'EOF'
#!/bin/bash
set -a
source "$(dirname "$0")/.env"
set +a
script="$1"
shift
python3 "$(dirname "$0")/$script" "$@"
EOF
chmod +x ~/miniflux/scripts/run.sh
Step 7: Test Both Scripts
Test the digest
~/miniflux/scripts/run.sh miniflux-telegram-digest.py
Expected output:
Fetching entries from last 1 day(s)...
Fetched 80 entries total
80 new entries to process
Prompt: 37132 chars, calling gemini...
Got 8234 char summary
Sent message 1/3 (3842 chars)
Sent message 2/3 (3911 chars)
Sent message 3/3 (1204 chars)
State saved. Done.
You should receive 2-3 Telegram messages with a thematic analysis.
Test the health monitor
~/miniflux/scripts/run.sh vps-health-monitor.py --daily
You should receive a ✅ status summary showing all containers, disk, memory, load, and backup status.
Step 8: Set Up Cron
crontab -e
Add these lines:
# Daily reading digest at 6:00 AM UTC
0 6 * * * ~/miniflux/scripts/run.sh miniflux-telegram-digest.py >> ~/miniflux/scripts/digest.log 2>&1
# Hourly health check — alerts only on problems
0 * * * * ~/miniflux/scripts/run.sh vps-health-monitor.py >> ~/miniflux/scripts/health.log 2>&1
# Daily health summary at 7:00 AM UTC
0 7 * * * ~/miniflux/scripts/run.sh vps-health-monitor.py --daily >> ~/miniflux/scripts/health.log 2>&1
# Weekly log rotation — keep last 500 lines of health, 200 of digest and backup
0 0 * * 0 tail -500 ~/miniflux/scripts/health.log > ~/miniflux/scripts/health.log.tmp && mv ~/miniflux/scripts/health.log.tmp ~/miniflux/scripts/health.log
0 0 * * 0 tail -200 ~/miniflux/scripts/digest.log > ~/miniflux/scripts/digest.log.tmp && mv ~/miniflux/scripts/digest.log.tmp ~/miniflux/scripts/digest.log
0 0 * * 0 tail -200 ~/backup.log > ~/backup.log.tmp && mv ~/backup.log.tmp ~/backup.log
What You’ll Get
Daily Digest
A 2-3 message thematic analysis, not an article list. Example (hypothetical — the papers, titles, and DOIs below are fabricated to illustrate the format):
Across today's 47 new articles, three threads stand out: a growing
conversation about algorithmic accountability in public institutions,
renewed attention to infrastructure breakdowns in the Global South,
and a methodological debate about ethnographic access in corporate
AI labs.
ALGORITHMIC GOVERNANCE UNDER PRESSURE
Two papers converge on the gap between accountability frameworks
and actual practice. "Auditing Automated Decisions in Welfare"
(Big Data & Society) traces how Dutch municipalities adopted
algorithmic risk scoring while systematically avoiding the oversight
mechanisms meant to accompany it.
https://journals.sagepub.com/doi/full/10.1177/...
This resonates with "The Transparency Trap" (Science, Technology,
& Human Values), which argues that mandated explainability
requirements often produce legibility for regulators rather than
meaningful accountability for affected populations.
https://journals.sagepub.com/doi/full/10.1177/...
...
QUICK MENTIONS
"Viral Misinformation in Marathi-language WhatsApp Groups" (EPW)
"Optimizing Transformer Architectures for Low-Resource NLP" (cs.CL)
"Urban Drone Logistics in Southeast Asia" (Frontiers in Sustainable Cities)
—
47 articles · 8 categories · gemini summary
Health Monitor
Hourly (silent unless problems): No message if everything is fine.
Alert (when something breaks):
🚨 VPS Health Alert — 2026-02-23 14:00 UTC
miniflux (Miniflux stack): NOT RUNNING
WARNING: Disk at 82% — 7.2GB free
Full status:
wg-easy: running
pihole: running
dnscrypt: running
syncthing: running
miniflux-db: running
Disk: 82% used (32.8GB / 40GB, 7.2GB free)
Memory: 61% used (2441MB / 4000MB, 1559MB available)
Load: 0.3 / 0.2 / 0.1 (1/5/15 min, 2 cores)
Failed SSH logins (24h): 14
Uptime: up 42 days, 3 hours, 12 minutes
Daily summary:
✅ VPS Health — 2026-02-23 07:00 UTC
wg-easy: running
pihole: running
dnscrypt: running
syncthing: running
miniflux: running
miniflux-db: running
Disk: 34% used (13.6GB / 40GB, 26.4GB free)
Memory: 58% used (2320MB / 4000MB, 1680MB available)
Load: 0.1 / 0.2 / 0.1 (1/5/15 min, 2 cores)
Failed SSH logins (24h): 7
Uptime: up 43 days, 3 hours, 12 minutes
Maintenance
Check logs
tail -30 ~/miniflux/scripts/digest.log
tail -30 ~/miniflux/scripts/health.log
Re-run today’s digest
rm ~/miniflux/scripts/.digest-state.json
~/miniflux/scripts/run.sh miniflux-telegram-digest.py
Change LLM provider
Edit ~/miniflux/scripts/.env:
LLM_PROVIDER=claude
ANTHROPIC_API_KEY=sk-ant-...
No code changes needed. Options: gemini (free), claude (~$5/year at Haiku pricing), openai (~$5/year at GPT-4o-mini pricing). Cost estimates assume 1 request/day with a lightweight model — using larger models (Sonnet, GPT-4o) would cost more.
Adjust digest timing
Edit crontab. 0 6 = 6:00 AM UTC — adjust to your timezone.
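If you want the digest at a fixed local hour, a short snippet can compute the matching UTC hour for the crontab's hour field (a sketch; the timezone name is an example — substitute your own):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

def cron_hour_utc(local_hour, tz_name):
    """Return the UTC hour matching local_hour in tz_name, using today's offset.

    Note: cron keeps a fixed clock, so a DST transition will shift the local
    delivery time by an hour until the crontab is updated."""
    local = datetime.now(ZoneInfo(tz_name)).replace(
        hour=local_hour, minute=0, second=0, microsecond=0
    )
    return local.astimezone(ZoneInfo("UTC")).hour

print(cron_hour_utc(6, "Europe/Berlin"))  # use this as the hour field in the crontab
```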
Test the bot manually
source ~/miniflux/scripts/.env
curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-H "Content-Type: application/json" \
-d "{\"chat_id\": \"${TELEGRAM_CHAT_ID}\", \"text\": \"Test from VPS\"}"
Troubleshooting
Telegram returns 400 “chat not found” — You haven’t messaged the bot yet. Open the bot in Telegram, send any message, then retry.
Telegram returns 401 Unauthorized — Bot token is wrong. Check with @BotFather.
“No new entries since last run” — Normal if feeds haven’t published. To force: rm ~/miniflux/scripts/.digest-state.json
Digest has no links — The LLM sometimes ignores URL instructions. Re-run; it’s usually intermittent.
Health monitor says container not running — Check the container name matches exactly. Run docker ps --format '{{.Names}}' and compare with the DOCKER_STACKS list in vps-health-monitor.py.
Gemini returns 429 — Rate limited (unlikely at 1 req/day). Wait or switch to claude in .env.
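The "digest has no links" failure above can also be caught automatically: count the URLs in the summary before sending and retry if there are none. A minimal sketch, not part of the original script — `llm_fn` and `prompt` stand in for the objects the digest script already has:

```python
import re

def count_links(summary):
    """Count http(s) URLs in the LLM's summary text."""
    return len(re.findall(r'https?://\S+', summary))

def summarize_with_links(llm_fn, prompt, min_links=1, retries=1):
    """Call the LLM; retry up to `retries` times if the reply has too few URLs."""
    for _ in range(retries + 1):
        summary = llm_fn(prompt)
        if count_links(summary) >= min_links:
            return summary
    return summary  # give up and send the last attempt anyway
```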
Security Notes
- No ports opened — all connections are outbound from your VPS
- No domain or TLS needed — uses Telegram’s infrastructure for delivery
- Credentials on disk — .env is chmod 600, readable only by your user
- Telegram bot token — if leaked, someone can send messages as your bot but cannot read your messages or access your VPS. Revoke via @BotFather → /revoke
- Miniflux API key — full read/write access to your Miniflux instance, but only used over localhost. If leaked, someone would need VPS access to exploit it
- Data passes through third parties — article titles, excerpts, and summaries go to Google (Gemini API) and Telegram. For public academic feeds this is low sensitivity. Avoid adding private or sensitive feeds without considering this
- Your VPS IP is visible to Telegram and Google via the outbound API calls
- Gemini API key in URL — Google’s API passes the key as a query parameter (?key=...). The connection is HTTPS (encrypted in transit), but the key appears in Google’s server logs associated with your VPS IP. This is Google’s documented API design, not a misconfiguration — but be aware that any HTTP-level debugging or logging you add to the VPS could also capture the key
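If the key-in-URL design bothers you, the Generative Language API also accepts the key as an x-goog-api-key request header, which keeps it out of URL-based logs. A hedged sketch of how llm_gemini could build its request that way (verify against Google's current API documentation before relying on it):

```python
import json
import urllib.request

GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.5-flash:generateContent"
)

def gemini_request(prompt, api_key):
    """Build the Gemini request with the key in a header instead of ?key=..."""
    body = json.dumps({
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"maxOutputTokens": 4096},
    }).encode()
    return urllib.request.Request(
        GEMINI_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )
```

llm_gemini would then urlopen this request exactly as before; the response parsing is unchanged.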
Files Reference
~/miniflux/scripts/
├── .env # API keys and config (chmod 600)
├── run.sh # Wrapper that loads .env
├── miniflux-telegram-digest.py # Daily digest script
├── vps-health-monitor.py # Health monitor script
├── .digest-state.json # Tracks processed entries (auto-generated)
├── digest.log # Digest cron output
└── health.log # Health monitor cron output
Browser Hardening: Your Device, Not Just Your Server
Everything above protects your traffic at the network and server level. But your browser itself leaks data — through cookies, fingerprinting, telemetry, and default search engines. This section addresses the device. No server configuration required; these are changes you make on your laptop and phone.
Switch to Firefox. Chrome is built by Google and integrated into Google’s data infrastructure. Firefox is open-source, maintained by a nonprofit (Mozilla), and designed to be configurable. This switch costs nothing and takes five minutes.
Install uBlock Origin. A browser extension that blocks ads and trackers at the page level — catching what Pi-hole cannot (notably YouTube ads and Facebook sponsored posts). It is free, open-source, and the single most effective privacy tool available in a browser.
Apply Arkenfox settings. Arkenfox is a community-maintained configuration file for Firefox that disables telemetry, hardens privacy defaults, and closes data leaks that Firefox leaves open out of the box. You download one file and place it in your Firefox profile directory. It is not an extension; it is a set of preferences. See: github.com/arkenfox/user.js
Change your default search engine to DuckDuckGo. DuckDuckGo does not track your searches or build a profile of your interests. For specialised academic searching, you will still use Google Scholar or field-specific databases — but your routine searches no longer feed a profile.
Test your setup. The Electronic Frontier Foundation’s “Cover Your Tracks” tool (coveryourtracks.eff.org) analyses your browser’s fingerprint and tracking exposure. Run it before and after these changes to see the difference.
What This Costs
| Component | Provider | Monthly cost |
|---|---|---|
| VPS (2 cores, 4 GB RAM) | Hetzner Cloud (CX22) | ~€4.50 / ~$5 |
| Backup storage (1 TB) | Hetzner Storage Box (BX11) | ~€3.80 / ~$4 |
| Domain name (optional) | Any registrar | ~$1/month (billed annually) |
| Total | | ~$9–10/month |
Everything else — Docker, WireGuard, Pi-hole, dnscrypt-proxy, Syncthing, Miniflux, Kiwix, Stirling-PDF, Firefox, uBlock Origin, Arkenfox, Tailscale — is free and open-source software.
For comparison: iCloud (200 GB) is $2.99/month, Squarespace is $16/month, Dropbox Plus is $11.99/month, Adobe Acrobat is $12.99/month. The platform equivalent of this stack runs $40–50/month, with none of the privacy or control benefits.
What This Will Take
Time to build: If you are starting from zero, expect the foundation (Part 1) and Tailscale/storage (Part 2) to take a weekend of focused work. The library (Part 3) takes an afternoon. The RSS reader (Part 4) takes a few hours. Telegram automation (Part 5) takes an evening. Browser hardening takes an hour.
Ongoing maintenance: A few hours per month. Containers occasionally need updating. RSS feeds break when journals change their URLs. Backup logs should be checked periodically. The AI digest bot sometimes needs its prompts adjusted. None of this is urgent or difficult, but it is real. You are committing to an ongoing maintenance relationship with your infrastructure — and that relationship is the point, not a side effect.
Technical skill required: You do not need to be a programmer. You need to be comfortable with a terminal (text-based command line), willing to read documentation, and patient with error messages.
What This Will Not Do
This will not make you anonymous. Your VPS provider knows your identity and billing information. If served with a legal order, they can associate your server with your identity. This is private infrastructure, not clandestine infrastructure.
This will not protect you from a determined state-level adversary. It protects against commercial surveillance, ISP logging, and the ambient data extraction of platform capitalism. If your threat model involves government surveillance, you need additional tools (Tor, Tails) and additional expertise beyond the scope of this guide.
This will not replace collaborative platforms. You still need email, video conferencing, learning management systems, and institutional tools. What changes is the proportion of your digital life that passes through platforms you do not control. The goal is not total exit. It is a reduction in the surface area of platform dependency, and an increase in your understanding of the dependencies that remain.
This requires maintenance. It is not a product you purchase and forget. It is a practice you maintain. If that sounds like a cost, consider: the alternative is paying someone else to maintain it for you, on terms you cannot inspect, with your data as part of the payment.
A Suggested Order of Operations
If you want to start small and build gradually:
Week 1: Rent a VPS. Install Docker. Set up Tailscale. Get comfortable with SSH. (Part 1, Steps 1–2; Part 2, Tailscale section.) Note: the guide builds the VPN stack before Tailscale, but installing Tailscale early gives you a private mesh from the start — useful for accessing services you’ll add later without relying on SSH tunnels.
Week 2: Deploy Pi-hole. This is visible and immediately satisfying — you will see tracking requests being blocked in real time. (Part 1, Steps 3–6.)
Week 3: Add WireGuard. Route your devices through the VPN. Configure split-tunneling exceptions for banking and government portals. (Part 1, Steps 7–9, Split Tunneling.)
Week 4: Deploy Miniflux. Subscribe to 20–30 RSS feeds from journals and blogs in your field. (Part 4.)
Week 5: Add Syncthing. Move your most-used files off iCloud or Google Drive. Set up Stirling-PDF. (Part 1, Syncthing section; Part 2, Stirling PDF section.)
After that: Add dnscrypt-proxy for DNS encryption. Set up automated backups (Part 2, BorgBackup). Configure volatile logging (Part 1, Step 11). Harden your browser. Build the library (Part 3) if you want it. Set up the Telegram digest (Part 5). Each addition takes hours, not days, because the foundation is already in place.
Appendix: Optional Services
Additional self-hosted tools that complement the core infrastructure. Each is a single Docker container on the existing VPS, accessible via Tailscale. Install whichever ones are useful — none depend on each other.
For each service below: add it to DOCKER_STACKS in ~/miniflux/scripts/vps-health-monitor.py, add its directory to the borg create paths in ~/backup.sh, and test with ~/miniflux/scripts/run.sh vps-health-monitor.py --daily.
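The backup addition follows the same pattern each time. As an illustration only — the actual borg create invocation in your ~/backup.sh will differ — appending a new service directory looks like:

```shell
# In ~/backup.sh — add the new directory to the existing path list
borg create ::'{hostname}-{now}' \
    ~/vpn ~/miniflux ~/kiwix ~/gutenberg-search ~/stirling-pdf \
    ~/uptime-kuma
```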
Uptime Kuma — Status Page & Uptime Monitor
A prettier, more capable alternative to the custom health monitor script. Checks HTTP endpoints, TCP ports, DNS, Docker containers, and sends alerts to Telegram, email, Slack, or Ntfy. Includes a public or private status page.
mkdir -p ~/uptime-kuma
nano ~/uptime-kuma/docker-compose.yml
services:
uptime-kuma:
image: louislam/uptime-kuma:latest
container_name: uptime-kuma
volumes:
- uptime-kuma-data:/app/data
ports:
- "127.0.0.1:3001:3001"
- "YOUR_TAILSCALE_IP:3001:3001"
restart: unless-stopped
volumes:
uptime-kuma-data:
cd ~/uptime-kuma && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:3001
First visit: create an admin account. Then add monitors for each service:
| Monitor Type | Target | Interval |
|---|---|---|
| HTTP | http://127.0.0.1:8585/api/health | 60s |
| HTTP | http://127.0.0.1:8888 | 60s |
| HTTP | http://127.0.0.1:8484 | 60s |
| HTTP | http://127.0.0.1:8090 | 60s |
| TCP | 127.0.0.1:51820 | 60s |
(Caveat: WireGuard itself listens on UDP, which Uptime Kuma’s TCP check cannot probe; if this monitor reports down while the VPN works fine, drop it or point it at wg-easy’s web UI port instead.)
Configure Telegram notifications: Settings → Notifications → Add → Telegram → enter your bot token and chat ID (same ones from ~/miniflux/scripts/.env).
Health monitor addition:
("Uptime Kuma", "~/uptime-kuma", ["uptime-kuma"]),
~30 MB RAM. Can coexist with your Python health monitor or eventually replace it.
Docker socket warning: Some Uptime Kuma tutorials recommend mounting /var/run/docker.sock into the container for direct Docker monitoring. Do not do this. Access to the Docker socket is equivalent to root access on the host — a compromised container with socket access can control every other container and the host OS. The HTTP health checks listed above achieve the same monitoring without this risk.
Ntfy — Self-Hosted Push Notifications
Push notifications directly to your phone via your own server. Replaces Telegram as the notification channel if you want to remove that dependency. Works on Android (native app) and iOS (via web push).
mkdir -p ~/ntfy
nano ~/ntfy/docker-compose.yml
services:
ntfy:
image: binwiederhier/ntfy:latest
container_name: ntfy
command: serve
volumes:
- ntfy-cache:/var/cache/ntfy
- ntfy-data:/etc/ntfy
ports:
- "127.0.0.1:2586:80"
- "YOUR_TAILSCALE_IP:2586:80"
environment:
- NTFY_BASE_URL=http://YOUR_TAILSCALE_IP:2586
restart: unless-stopped
volumes:
ntfy-cache:
ntfy-data:
cd ~/ntfy && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:2586
Send a Test Notification
curl -d "Test from VPS" http://YOUR_TAILSCALE_IP:2586/vps-alerts
Subscribe on Your Phone
Install the Ntfy app (Android: Play Store, iOS: App Store). Add a subscription to http://YOUR_TAILSCALE_IP:2586/vps-alerts.
Use in Scripts
Replace Telegram API calls with:
import urllib.request
req = urllib.request.Request(
"http://YOUR_TAILSCALE_IP:2586/vps-alerts",
data=b"Backup completed successfully",
)
urllib.request.urlopen(req, timeout=30)
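A fuller drop-in replacement for send_telegram could look like this (a sketch; the Title and Priority headers are optional ntfy features — check the ntfy documentation for the full set, and note that ntfy caps message size, so very long digests may still need splitting):

```python
import urllib.request

NTFY_URL = "http://YOUR_TAILSCALE_IP:2586/vps-alerts"  # your Tailscale IP and topic

def ntfy_request(text, title=None, priority=None):
    """Build an ntfy publish request; the message is simply the request body."""
    headers = {}
    if title:
        headers["Title"] = title
    if priority:
        headers["Priority"] = str(priority)
    return urllib.request.Request(NTFY_URL, data=text.encode(), headers=headers)

def send_ntfy(text, **kwargs):
    """POST the message to the ntfy topic."""
    with urllib.request.urlopen(ntfy_request(text, **kwargs), timeout=30) as resp:
        resp.read()
```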
Health monitor addition:
("Ntfy", "~/ntfy", ["ntfy"]),
~10 MB RAM.
Gitea — Self-Hosted Git
Version control for your Hugo sites, scripts, configs, and the Gutenberg search app. A private GitHub without the platform dependency. Lightweight — uses SQLite by default, no separate database needed.
mkdir -p ~/gitea
nano ~/gitea/docker-compose.yml
services:
  gitea:
    image: gitea/gitea:latest
    container_name: gitea
    volumes:
      - gitea-data:/data
    ports:
      - "127.0.0.1:3300:3000"
      - "YOUR_TAILSCALE_IP:3300:3000"
    environment:
      - GITEA__database__DB_TYPE=sqlite3
      - GITEA__server__ROOT_URL=http://YOUR_TAILSCALE_IP:3300/
      - GITEA__server__DOMAIN=YOUR_TAILSCALE_IP
      - GITEA__service__DISABLE_REGISTRATION=true
    restart: unless-stopped

volumes:
  gitea-data:
cd ~/gitea && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:3300
First visit: complete the setup wizard (accept SQLite defaults). Create your admin account. Registration is disabled — you create accounts manually.
Add Your First Repository
First, create an empty repository named gutenberg-search in the Gitea web UI (New Repository, leave initialisation options unchecked). Then, on the VPS:
cd ~/gutenberg-search
git init
git add -A
git commit -m "Initial commit"
git branch -M main
git remote add origin http://YOUR_TAILSCALE_IP:3300/YOUR_USERNAME/gutenberg-search.git
git push -u origin main
Repeat for ~/kiwix, ~/stirling-pdf, your Hugo source, etc. Now every config change is tracked with history.
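One caution before pushing: some of these directories contain secrets (the Telegram token in ~/miniflux/scripts/.env, for instance). A .gitignore along these lines, adjusted to what each directory actually holds, keeps tokens and runtime data out of version control:

```
# Secrets and runtime state: keep these out of the repository
.env
*.env
*.log
__pycache__/
data/
```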
Clone on Your Mac
git clone http://YOUR_TAILSCALE_IP:3300/YOUR_USERNAME/gutenberg-search.git
Works from any device on your Tailscale mesh.
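Pushing over HTTP will prompt for your Gitea username and password on every push. Git's built-in credential helper can remember them; the "store" helper saves them in plain text in your home directory, a reasonable trade-off here since the server is only reachable over your Tailscale mesh:

```shell
# Cache Gitea credentials so pushes don't prompt every time.
# Note: "store" writes them unencrypted to ~/.git-credentials.
git config --global credential.helper store
```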
Health monitor addition:
("Gitea", "~/gitea", ["gitea"]),
~100 MB RAM with SQLite.
Excalidraw — Self-Hosted Whiteboard
A collaborative whiteboard and diagramming tool. Useful for teaching prep, conference presentation diagrams, research sketches. Saves drawings as JSON files.
mkdir -p ~/excalidraw
nano ~/excalidraw/docker-compose.yml
services:
  excalidraw:
    image: excalidraw/excalidraw:latest
    container_name: excalidraw
    ports:
      - "127.0.0.1:5000:80"
      - "YOUR_TAILSCALE_IP:5000:80"
    restart: unless-stopped
cd ~/excalidraw && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:5000
No account needed — it’s a static app that runs in your browser. Drawings are saved locally in the browser or exported as PNG/SVG. For persistent storage, export drawings and keep them in your Syncthing-synced folder.
Health monitor addition:
("Excalidraw", "~/excalidraw", ["excalidraw"]),
~20 MB RAM.
PrivateBin — Encrypted Pastebin
Share text snippets, code, interview excerpts, or draft paragraphs with collaborators. Everything is encrypted client-side — the server never sees plaintext. Links auto-expire. Replaces Google Docs for quick, disposable sharing.
mkdir -p ~/privatebin
nano ~/privatebin/docker-compose.yml
services:
  privatebin:
    image: privatebin/nginx-fpm-alpine:latest
    container_name: privatebin
    volumes:
      - privatebin-data:/srv/data
    ports:
      - "127.0.0.1:8443:8080"
      - "YOUR_TAILSCALE_IP:8443:8080"
    restart: unless-stopped

volumes:
  privatebin-data:
cd ~/privatebin && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8443
Paste text, set an expiration (5 minutes to never), optionally set a password, click Send. Share the URL with your collaborator — they need Tailscale access to reach it, which limits sharing to people on your mesh. For external sharing, you’d need to expose it through a reverse proxy with a public domain.
Health monitor addition:
("PrivateBin", "~/privatebin", ["privatebin"]),
~15 MB RAM.
CyberChef — Data Transformation Toolkit
A browser-based toolbox for encoding, decoding, hashing, parsing, formatting, compressing, and hundreds of other data operations. Occasionally indispensable for data cleaning, format conversion, or inspecting encoded text. Developed and open-sourced by GCHQ.
mkdir -p ~/cyberchef
nano ~/cyberchef/docker-compose.yml
services:
  cyberchef:
    image: ghcr.io/gchq/cyberchef:latest
    container_name: cyberchef
    ports:
      - "127.0.0.1:8817:80"
      - "YOUR_TAILSCALE_IP:8817:80"
    restart: unless-stopped
cd ~/cyberchef && docker compose up -d
Access: http://YOUR_TAILSCALE_IP:8817
No account, no state — it’s a static web app. Drag operations into the recipe pane, paste input, get output. Everything runs in your browser; the server just hosts the static files.
Health monitor addition:
("CyberChef", "~/cyberchef", ["cyberchef"]),
~10 MB RAM.
Full Port Summary (All Services)
| Port | Service | Status |
|---|---|---|
| 51820/udp | WireGuard | Core |
| 51821/tcp | wg-easy admin | Core (SSH tunnel) |
| 80/tcp | Pi-hole dashboard | Core (SSH tunnel) |
| 8090/tcp | Miniflux | Core (Tailscale) |
| 8888/tcp | Kiwix | Core (Tailscale) |
| 8585/tcp | Gutenberg Search | Core (Tailscale) |
| 8484/tcp | Stirling PDF | Core (Tailscale) |
| 3001/tcp | Uptime Kuma | Optional (Tailscale) |
| 2586/tcp | Ntfy | Optional (Tailscale) |
| 3300/tcp | Gitea | Optional (Tailscale) |
| 5000/tcp | Excalidraw | Optional (Tailscale) |
| 8443/tcp | PrivateBin | Optional (Tailscale) |
| 8817/tcp | CyberChef | Optional (Tailscale) |
All optional services combined add roughly 185 MB of RAM. On a 4 GB VPS this fits comfortably alongside the core stack, but check actual usage (e.g. with docker stats --no-stream) if you install several of them.