ClamAV on Steroids: 35,000 YARA Rules and a Lot of Attitude
You can test it here: av.sandkiste.io
Introduction
If you’re anything like me, you’ve probably had one of those random late-night thoughts:
What if I built a scalable cluster of ClamAV instances, loaded it up with 35,000 YARA rules, and used it to really figure out what a file is capable of, whether it’s actually a virus or just acting suspicious?
It’s the kind of idea that starts as a “wouldn’t it be cool” moment and then slowly turns into “well… now I have to build it.”
And if that thought has never crossed your mind, that’s fine – because I’m going to walk you through it anyway.
How it Started
Like many of my projects, this one was born out of pure anger.
I was told, with a straight face, that scaling our ClamAV cluster into something actually usable would take multiple people, several days, extra resources, and probably outside help.
I told them I would do this in an afternoon, fully working, with a REST API and a frontend.
They laughed.
That same afternoon, I shipped the app.
How It’s Going
Step one: You upload a file.
The scanner gets to work and you wait for it to finish:
Once it’s done, you can dive straight into the results:
That first result was pretty boring.
So, I decided to spice things up by testing the Windows 11 Download Helper tool, straight from Microsoft’s own website.
You can see it’s clean, but it does have a few “invasive” features.
Most of these are perfectly normal for installer tools.
This isn’t a sandbox in the traditional sense. YARA rules simply match string and byte patterns inside files, alone or in combination, and infer possible capabilities from the hits. A lot of the time, that’s enough to give you interesting insights, but it’s not a replacement for a full sandbox if you really want to see what the file does in action.
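To make the idea of "patterns in, capabilities out" concrete, here is a toy sketch in plain Python. It is not YARA and not what Malcontent does internally. The capability names and patterns are made up for illustration, and real rules are far richer.

```python
import re

# Hypothetical capability patterns for illustration only; real YARA rules are far richer.
CAPABILITY_PATTERNS = {
    "network/http": re.compile(rb"https?://", re.IGNORECASE),
    "process/create": re.compile(rb"CreateProcess[AW]?"),
    "registry/write": re.compile(rb"RegSetValue(Ex)?[AW]?"),
}


def infer_capabilities(data: bytes) -> list[str]:
    """Return the names of all capabilities whose pattern occurs in the data."""
    return sorted(
        name for name, pattern in CAPABILITY_PATTERNS.items() if pattern.search(data)
    )


# Nothing is executed: the bytes are only searched for known patterns.
sample = b"MZ\x90\x00...CreateProcessW...http://example.com/setup.exe"
print(infer_capabilities(sample))
```

A hit only says the pattern is present in the file, which is why a clean installer can still show up with "invasive" features.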
The Setup
Here’s what you need to get this running:
- HAProxy: for TCP load balancing across the ClamAV instances
- 2 ClamAV instances: plus a third dedicated to updating definitions
- Malcontent: YARA Scanner
- Database: to store scan results
You’ll also need a frontend and an API… but we’ll get to that part soon.
YAML
```
services:

  haproxy:
    image: haproxy:latest
    restart: unless-stopped
    ports:
      - "127.0.0.1:3310:3310"
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    networks:
      - clam-net
    depends_on:
      - clamd1
      - clamd2

  clamd1:
    image: clamav/clamav-debian:latest
    restart: unless-stopped
    networks:
      - clam-net
    volumes:
      - ./tmp/uploads:/scandir
      - clamav-db:/var/lib/clamav
    command: ["clamd", "--foreground=true"]

  clamd2:
    image: clamav/clamav-debian:latest
    restart: unless-stopped
    networks:
      - clam-net
    volumes:
      - ./tmp/uploads:/scandir
      - clamav-db:/var/lib/clamav
    command: ["clamd", "--foreground=true"]

  freshclam:
    image: clamav/clamav-debian:latest
    restart: unless-stopped
    networks:
      - clam-net
    volumes:
      - clamav-db:/var/lib/clamav
    command: ["freshclam", "-d", "--foreground=true", "--checks=24"]

  mariadb:
    image: mariadb:latest
    restart: unless-stopped
    environment:
      MARIADB_ROOT_PASSWORD: SECREEEEEEEET
      MARIADB_DATABASE: avscanner
      MARIADB_USER: avuser
      MARIADB_PASSWORD: SECREEEEEEEET2
    volumes:
      - mariadb-data:/var/lib/mysql
    ports:
      - "127.0.0.1:3306:3306"

volumes:
  mariadb-data:
  clamav-db:

networks:
  clam-net:
```
Here’s my haproxy.cfg:
haproxy.cfg
```
global
    daemon
    maxconn 256

defaults
    mode tcp
    timeout connect 5s
    timeout client  50s
    timeout server  50s

frontend clamscan
    bind *:3310
    default_backend clamd_pool

backend clamd_pool
    balance roundrobin
    server clamd1 clamd1:3310 check
    server clamd2 clamd2:3310 check
```
Now you’ve got yourself a fully functioning ClamAV cluster, yay 🦄🎉!
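A quick way to check that the cluster answers is to send clamd's `PING` command through HAProxy. The sketch below assumes the port mapping from the compose file above (127.0.0.1:3310). The function name is my own, and it uses a raw socket rather than any client library.

```python
import socket


def ping_clamd(host: str = "127.0.0.1", port: int = 3310, timeout: float = 5.0) -> bool:
    """Send clamd's PING command and report whether it answered PONG."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The "z" prefix tells clamd that the command is NUL-terminated.
        sock.sendall(b"zPING\0")
        reply = sock.recv(64)
    # clamd terminates the reply the same way the command was terminated.
    return reply.rstrip(b"\0\n") == b"PONG"
```

Calling `ping_clamd()` a few times in a row should succeed no matter which backend the round-robin picks. A connection error here usually means HAProxy or both clamd containers are still starting.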
FastAPI
I’m not going to dive deep into setting up an API with FastAPI (their docs cover that really well), but here’s the code I use:
Python
```
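import shutil
import uuid
from pathlib import Path
from typing import List

from fastapi import FastAPI, File, UploadFile

app = FastAPI()
# Assumption: the host directory mounted into the clamd containers as /scandir.
UPLOAD_DIR = Path("./tmp/uploads")
# scan_and_store_file() is defined elsewhere in the project.


@app.post("/upload")
async def upload_and_scan(files: List[UploadFile] = File(...)):
    results = []

    for file in files:
        # A UUID prefix keeps parallel uploads with the same name apart.
        upload_id = str(uuid.uuid4())
        # Keep only the base name so path separators in the client-supplied name cannot break the path.
        safe_name = Path(file.filename or "upload").name
        temp_path = UPLOAD_DIR / f"{upload_id}_{safe_name}"

        with temp_path.open("wb") as f_out:
            shutil.copyfileobj(file.file, f_out)

        try:
            result = scan_and_store_file(
                file_path=temp_path,
                original_filename=file.filename,
            )
            results.append(result)
        finally:
            # Always remove the upload, even if scanning fails.
            temp_path.unlink(missing_ok=True)

    return {"success": True, "data": {"result": results}}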
```
There’s a lot more functionality in other functions, but here’s the core flow:
1. Save the uploaded file to a temporary path
2. Check if the file’s hash is already in the database (if yes, return cached results)
3. Use pyclamd to submit the file to our ClamAV cluster
4. Run Malcontent as the YARA scanner
5. Store the results in the database
6. Delete the file
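Step 2 is what keeps repeat uploads cheap. Here is a minimal sketch of that idea, assuming SHA-256 as the hash and a plain dict standing in for the database table. The function names are hypothetical and this is not the code running behind the app.

```python
import hashlib
from pathlib import Path
from typing import Any, Callable


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so large uploads never sit in memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_with_cache(
    path: Path,
    cache: dict[str, Any],
    scan: Callable[[Path], Any],
) -> Any:
    """Return the cached result for this file's hash, scanning only on a miss."""
    file_hash = sha256_of_file(path)
    if file_hash in cache:
        return cache[file_hash]  # seen before: skip ClamAV and the YARA scan
    result = scan(path)
    cache[file_hash] = result  # stand-in for the INSERT into the database
    return result
```

In the real flow, `scan` would be whatever runs ClamAV and Malcontent, and `cache` would be a lookup against the results table keyed by the hash.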
Here’s how I use Malcontent in my MVP:
Python
```
import json
import subprocess
from pathlib import Path
from typing import Any


def analyze_capabilities(filepath: Path) -> dict[str, Any]:
    path = Path(filepath).resolve()
    if not path.exists() or not path.is_file():
        raise FileNotFoundError(f"File not found: {filepath}")

    # Mount the file's directory into a throwaway container and analyze it there.
    cmd = [
        "docker",
        "run",
        "--rm",
        "-v",
        f"{path.parent}:/scan",
        "cgr.dev/chainguard/malcontent:latest",
        "--format=json",
        "analyze",
        f"/scan/{path.name}",
    ]

    try:
        # check=True turns a non-zero exit code into CalledProcessError.
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return json.loads(result.stdout)
    except subprocess.CalledProcessError as e:
        raise RuntimeError(f"malcontent failed: {e.stderr.strip()}") from e
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid JSON output from malcontent: {e}") from e
```
I’m not going to get into the whole frontend. It just talks to the API and makes things look nice.
For status updates, I use long polling instead of WebSockets. Other than that, it’s all pretty straightforward.
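For anyone who has not used long polling: the server holds the request open until the status changes or a timeout hits, and the client immediately asks again. Below is a rough asyncio sketch of the server-side wait. The `JobStatus` class and its methods are hypothetical, it assumes one poller per job, and it is not the app's actual implementation.

```python
import asyncio


class JobStatus:
    """Holds the latest state of one scan job and wakes up a waiting poller."""

    def __init__(self) -> None:
        self.state = "queued"
        self._changed = asyncio.Event()

    def update(self, state: str) -> None:
        self.state = state
        self._changed.set()

    async def wait_for_change(self, timeout: float) -> str:
        """Return as soon as the state changes, or after `timeout` seconds."""
        try:
            await asyncio.wait_for(self._changed.wait(), timeout)
        except asyncio.TimeoutError:
            pass  # no news: the client simply polls again
        self._changed.clear()
        return self.state


async def demo() -> list[str]:
    job = JobStatus()
    first = await job.wait_for_change(timeout=0.01)  # nothing happened yet
    asyncio.get_running_loop().call_later(0.05, job.update, "done")
    second = await job.wait_for_change(timeout=1.0)  # returns early on the update
    return [first, second]
```

An API endpoint would await `wait_for_change()` and return the state as JSON, so the browser gets an answer the moment the scan finishes without needing a WebSocket.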
Final Thoughts
I wanted something that could handle large files too, and so far this setup delivers, since files are saved locally. For a production deployment, I’d recommend using something like Kata Containers, which is my go-to for running sketchy, untrusted workloads safely.
Always handle malicious files with caution. In this setup, you’re not executing anything, so you should mostly be safe, but remember, AV systems themselves can be exploited, so stay careful.
As for detection, I don’t think ClamAV alone is enough for solid malware protection. It’s better than nothing, but its signatures aren’t updated as frequently as I’d like. For a truly production-grade solution, I’d probably buy a personal AV product, build my own cluster and CLI tool for it, and plug that in. Most licenses let you use multiple devices, so you could easily scale to 10 workers for about €1.50 a month (just grab a license from your preferred software key site).
Of course, this probably violates license terms. I’m not a lawyer 😬
Anyway, I just wanted to show you something I built, so I built it, and now I’m showing it.
One day, this will be part of my Sandkiste tool suite. I’m also working on a post about another piece of Sandkiste I call “Data Loss Containment”, but that one’s long and technical, so it might take a while.
Love ya, thanks for reading, byeeeeeeee ❤️
2025-08-11
Forget ChatGPT, I Built My Own Local AI with AMD RX 7900 XTX, Ollama & DeepSeek-R1
The journey to bringing you this guide was paved with rage and hardship. Before we go any further, let me be clear: local AI is nowhere near as good as ChatGPT or similar online tools. Without solid prompt engineering, you’ll mostly get weird, useless responses.
That said, DeepSeek-R1 (32B) is hands down the best local model I’ve ever used—but even then, it’s nowhere near the level of ChatGPT-4o in the cloud. To match that, you’d need the DeepSeek-R1 671B model, which is a mind-blowing 404GB. Running that locally? Yeah, that would be absolute madness.
Disclaimer: This post has some strong opinions about Linux distributions and hardware that some people may find disturbing or hurtful. Please don’t take it too seriously.
Rant about AMD
Skip it, or read my raw unfiltered anger.
The image of this post perfectly reflects my mood.
A while ago, I decided to build an AI server at home to run models locally. My plan was to get an NVIDIA 4090, which at the time cost around 2000€. But then, my friend—who runs Arch as his daily driver (I should’ve seen the red flag)—was using an AMD RX 7900 XTX, which was only 900€ at the time. He hyped it up, saying, “Oh yeah, get this one! Same VRAM, super easy to set up, everything works flawlessly!”
I was intrigued.
As fate would have it, another friend echoed the same thing, insisting that for 24GB of VRAM, I wouldn’t find anything cheaper. And, well, that was actually true.
However, everything I read online told me that AMD GPUs lag far behind NVIDIA in every way, and worst of all, you’d always have to hack things together just to make them work. Still, on Black Friday, I caved and bought the AMD GPU.
I regret it every single day since putting it in. I hate it. It absolutely sucks.
So far, it has worked on Windows 11—but even there, it was a pain. And seriously, how do you even mess up Windows 11 support??
Then I switched to Ubuntu as my main OS (☹️). After two days of struggle (and reinstalling the entire OS three times), I somehow got it to work. I still don’t know what I did. Every guide on the internet gives different commands, different settings, and different advice. Most are for older AMD GPUs, almost none work for the newer models, and—just for fun—most of the essential tools don’t support the “new” AMD cards either.
I hate it. I hate it so much.
My mood
I will never buy an AMD GPU ever again. Even if they came with 100GB of VRAM and cost just 5€, I do not care.
Looking back, I would rather pay 2000€ for a GPU that just works than spend endless hours hacking together the most basic functionality. The sheer frustration of dealing with this mess infuriates me beyond words.
This post serves as both a rant and a personal reminder: Never. Ever. Ever. Buy. AMD. Hardware. Again.
To be honest, I’m just as disappointed in AMD CPUs. Their hardware transcoding is absolute trash.
From now on, it’s Intel and NVIDIA, forever and always.

Prerequisites
- 32GB RAM (with ComfyUI, bump that up to 40GB)
- 250GB SSD Storage
- Debian 12 LXC
If you are curious about my exact setup, there is a detailed list you can compare yours against here: My Home Server: “PrettyLittleKitten” – A Personal Tech Haven. At the very least, your GPU should match (AMD RX 7900 XTX) to follow the tutorial step by step. If it doesn’t, chances are it’ll fail.
You need to install the kernel drivers on the host for passthrough to an LXC:
```
apt update
apt install pve-headers dkms
apt install amdgpu-dkms
```
You need to reboot Proxmox after that.
```
reboot
```
Setting up LXC
It’s important to note that the LXC must be privileged. I know there are guides for setting up an unprivileged one, but literally none of them worked—which only fueled my rage to unbearable levels.
So yeah… I just went with privileged.
Easy mode, using the Proxmox VE Helper-Scripts:
```
bash -c "$(wget -qLO - https://github.com/community-scripts/ProxmoxVE/raw/main/ct/debian.sh)"
```
Or, just manually add a Debian 12 LXC in the Proxmox GUI. Make sure to enable nesting=1, as this is required for Docker to work later.
I did try to get this working on Ubuntu 24 for hours… and failed miserably. Don’t ask me why—I have no idea.
Now, let’s install some dependencies that we’ll need later—or at the very least, ones that will make our lives easier:
```
apt update && apt upgrade -y
apt install sudo curl jq -y
cd /tmp
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh ./get-docker.sh 
```
You are free to install docker without using the convenience script.
GPU Passthrough
This heavily depends on your hardware and software.
If, like me, you have an AMD RX 7900 XTX and Proxmox 8.3.3, then you can just follow along. Otherwise—based on my own painful experience—you’ll likely need to find another guide.
Inside the LXC, run:
```
grep -w 'render\|video' /etc/group
```
This will display the GIDs you need for passthrough in a second—so make sure to note them down:
```
video:x:44:root # <- /dev/kfd
render:x:993:root # <- /dev/dri/render*
```
The 44 will be used for “video” and the 993 for “render”. Yours may differ, so use the values from your own output.
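If you would rather not read the GIDs off by eye, the lookup is easy to script. This is a small sketch that parses `/etc/group`-style text. The function name is my own, and the sample string just mirrors the output above.

```python
def group_gids(group_file_text: str, wanted: set[str]) -> dict[str, int]:
    """Map group names to GIDs from /etc/group-style text (name:password:gid:members)."""
    gids = {}
    for line in group_file_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and fields[0] in wanted:
            gids[fields[0]] = int(fields[2])
    return gids


# Sample data modelled on the output shown above.
sample = "root:x:0:\nvideo:x:44:root\nrender:x:993:root\n"
print(group_gids(sample, {"video", "render"}))
```

Inside the LXC you would feed it the real file, for example `group_gids(open("/etc/group").read(), {"video", "render"})`.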
On the Proxmox host, run the following commands:
```
lspci | grep VGA
ls -l /sys/class/drm/renderD*/device
```
The output should be something like this:
```
lspci | grep VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX] (rev c8)
11:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Raphael (rev cb)

ls -l /sys/class/drm/renderD*/device
lrwxrwxrwx 1 root root 0 Feb  2 13:33 /sys/class/drm/renderD128/device -> ../../../0000:03:00.0
lrwxrwxrwx 1 root root 0 Feb  2 13:33 /sys/class/drm/renderD129/device -> ../../../0000:11:00.0
```
In my case, “renderD128” is the part I need. To find yours, match the ID from the first command (e.g., 03:00.0) with the ID from the second command (e.g., 0000:03:00.0). Once they match, you’ll know which renderD* device corresponds to your GPU (the other one is the iGPU of the CPU, don’t use that).
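The matching step can be sketched in a few lines of Python. This assumes you pass in the `lspci` text and a mapping of render nodes to their symlink targets. The function name and the `"Navi 31"` search string are illustrative choices, not part of any tool.

```python
def render_node_for_gpu(lspci_output: str, drm_links: dict[str, str], gpu_name: str) -> str:
    """Return the renderD* node whose PCI address belongs to the named GPU."""
    for line in lspci_output.splitlines():
        if gpu_name in line:
            pci_address = line.split()[0]  # e.g. "03:00.0"
            break
    else:
        raise LookupError(f"{gpu_name!r} not found in lspci output")

    for node, target in drm_links.items():
        # The symlink target ends with the domain-prefixed address, e.g. "0000:03:00.0".
        if target.endswith(pci_address):
            return node
    raise LookupError(f"no render node points at {pci_address}")


# Sample data copied from the output above.
lspci = (
    "03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] "
    "Navi 31 [Radeon RX 7900 XT/7900 XTX] (rev c8)\n"
    "11:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] "
    "Raphael (rev cb)"
)
links = {
    "renderD128": "../../../0000:03:00.0",
    "renderD129": "../../../0000:11:00.0",
}
print(render_node_for_gpu(lspci, links, "Navi 31"))
```

On a real host, `links` could be built with `os.readlink()` over `/sys/class/drm/renderD*/device`.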
In the Proxmox GUI, go to your LXC container’s “Resources” tab and click “Add” → “Device Passthrough”. Now, add the “video” and “render” devices using the GIDs you noted earlier:
- Render device: Use the path for your graphics card and the GID from the LXC output.
- Video device: Use /dev/kfd and the GID for “video” from the LXC output.
This is what your settings should look like (you may need to restart the LXC first).
After restarting the container, check that the permissions are correct:
```
ls -lah /dev/kfd /dev/dri/renderD128 # <- edit the ID
```
The output should look like this:
```
crw-rw---- 1 root render 226, 128 Feb  2 18:01 /dev/dri/renderD128
crw-rw---- 1 root video  236,   0 Feb  2 18:01 /dev/kfd
```
Make sure that “root render” is the GPU and “root video” the Kernel Fusion Driver (kfd).
Kernel Fusion Driver
The Kernel Fusion Driver (KFD) is required if you want to run ROCm-based GPU compute workloads, such as machine learning, OpenCL, or scientific computing, on your AMD GPU within Proxmox. It acts as the interface between the AMD GPU driver and user-space applications, enabling GPU acceleration for parallel computing tasks.
– ChatGPT-4o
Install AMD Software
We need to install some tools inside of our Debian LXC:
```
apt update && apt upgrade -y
cd /tmp
wget https://repo.radeon.com/amdgpu-install/6.2.4/ubuntu/noble/amdgpu-install_6.2.60204-1_all.deb
sudo apt install ./amdgpu-install_6.2.60204-1_all.deb
amdgpu-install --usecase=rocm --no-dkms
```
You can also refer to the official guide: Quick Start Installation Guide – ROCm (at the very least, check if the links are still valid by the time you’re reading this).
The download might take a while. Since I have a new AMD RX 7900 XTX, I need to use:
```
export HSA_OVERRIDE_GFX_VERSION=11.0.0
echo 'export HSA_OVERRIDE_GFX_VERSION=11.0.0' >> ~/.bashrc
source ~/.bashrc
```
The next step is to test if everything is working. We’ll use amdgpu_top for this (you can skip this, but I wouldn’t):
```
sudo apt install libdrm-dev
cd /tmp
wget https://github.com/Umio-Yasuno/amdgpu_top/releases/download/v0.10.1/amdgpu-top_without_gui_0.10.1-1_amd64.deb
dpkg -i amdgpu-top_without_gui_0.10.1-1_amd64.deb 
amdgpu_top
```
You should see the name of your GPU and see some values below.
Run this command inside of the LXC to see if everything worked correctly:
```
/opt/rocm/bin/rocminfo
```
Somewhere in that “rocminfo” output you should see your GPU (and a bunch of other nerd stuff):
```
  Marketing Name:          Radeon RX 7900 XTX
  Vendor Name:             AMD
```
🤩 Sweet!
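If you want to automate that check, for example in a provisioning script, the relevant lines are easy to pull out. This sketch only assumes the `Key: Value` layout shown above. The function name is hypothetical.

```python
def marketing_names(rocminfo_output: str) -> list[str]:
    """Collect every "Marketing Name" value from rocminfo-style output."""
    names = []
    for line in rocminfo_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Marketing Name":
            names.append(value.strip())
    return names


# Sample lines copied from the output above.
sample = """\
  Marketing Name:          Radeon RX 7900 XTX
  Vendor Name:             AMD
"""
print(marketing_names(sample))
```

Feeding it the real output of `/opt/rocm/bin/rocminfo` (for example via `subprocess.run`) should list your CPU and GPU agents. If the GPU is missing, the passthrough is the first thing to re-check.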
Installing Ollama
This will also take a while. Just follow the guide on the Ollama website. Here’s all you need to do:
```
apt install curl 
cd /tmp
curl -fsSL https://ollama.com/install.sh | sh
```
Since curl is not installed by default in the Debian LXC we’re using, we’ll need to install it first (if you followed this guide, you have it already). Then, we’ll run the install script from the Ollama website. Be patient: the download takes a while since it pulls about 30GB of data.
By the way, I love the Ollama website. The simple black-and-white design with rounded borders? 🤌 I’m a sucker for minimalistic aesthetics. (I hope you like my blog’s design too! 🕺)
Next Step: Testing with a Smaller Model
Before downloading the DeepSeek 32B model, we’ll first test with a smaller DeepSeek version. If your GPU matches mine, the larger model should work fine.
You can check the available models on the Ollama Models website.
Now, run this command—it’s going to download a 1.5GB file, so the wait time depends on your internet speed:
```
ollama pull deepseek-r1:1.5b
```
You can then test:
```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Tell me a funny story about my best friend Karl. 300 characters maximum.",
  "stream": false
}' | jq .
```
Once upon a time, in the quiet town of Pawsley, there was Karl, a beloved kind cat who adored his three feline friends: Sam, Max, and Lily. Karl always had a warm smile and a habit of aiding others, often finding humor in unexpected places.
One crisp autumn afternoon, Karl spotted a bag of marbles on the park’s bench. Curious, he decided to play with them. To everyone’s surprise, a man walked by, holding a jar full of coins instead of marbles. “Hey, it’s you!” Karl exclaimed. The man, initially unimpressed, asked for his change. Karl suggested taking one marble in exchange and gave him the coins.
“Thank you,” the man thanked. Karl offered him a glass of water, knowing the jar was empty. “That’ll keep me happy,” he said. But as the water filled the glass, a strange ripple appeared on the floor. The man ran over to check but found nothing—no coins, just a tiny, pink spot.
“Wait, is it my pet?!” Karl pointed. The man laughed, looking suspiciously at Karl. “You saw that marble in there.” He handed Karl a piece of paper and said,”I need your help.”
Karl rolled on his back and rolled into the street. From behind him, he saw Sam, Max, and Lily with their usual spotlights on. Karl placed a paw on Sam’s paw, suggesting they might be related.
“Hey!” Max excitedly exclaimed. “My little sister is getting sick! Can you help me find her?”
Karl hesitated but followed Max to the neighbor’s yard. His friend found Lily, who was hiding somewhere. Karl helped Lily bring her back home, while Sam and Max watched in laughter. Karl’s kind words were met with cheers.
In the end, Karl became a beloved companion among Pawsley’s residents, not just for his love of marbles but also for his warmth and kindness.
Never interrupt an AI artist in the middle of their masterpiece. 🎨🤖 But yeah, the 300-Character-Limit was ignored.
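The same request works from Python with nothing but the standard library. This sketch assumes the default Ollama address from the curl example above. The helper names are my own.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "deepseek-r1:1.5b",
    base_url: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build the POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return only the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=timeout) as response:
        # With "stream": false the whole answer arrives as one JSON object.
        return json.loads(response.read())["response"]
```

`generate("Tell me a funny story about my best friend Karl. 300 characters maximum.")` should return the story text. The generous timeout is there because a cold start can take a while.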
Cleanup
Let’s get rid of the old files. But first, double-check before running these commands—they’ll wipe out all .sh and .deb files in /tmp.
```
ls -lh /tmp/*.sh /tmp/*.deb
```
The command above shows you what the command below will delete. Proceed if the displayed files are trash. The deleted files are irreversibly gone after this command:
```
rm -f /tmp/*.sh /tmp/*.deb
apt autoremove
```
Testing OpenWebUI with DeepSeek-R1 32B
The prompt I gave DeepSeek was “Give me a 10 word super short story about a cat named Paul.” – which is the name of my actual cat.
Well, not quite 10 words, but close enough. This was a cold start, meaning the model wasn’t loaded into VRAM yet—so yeah, it took a bit longer.
DeepSeek really put in some serious effort here, like it was deep in thought, sweating over every word. 😂 Considering all that, the result is absolutely hilarious:
Okay, so I need to come up with a 10-word super short story about a cat named Paul. Hmm, let’s see… The user gave an example: “Paul the cat climbed the fence, chasing a bird, then napped in the sun.” That’s pretty straightforward but conveys a little action and a bit of character.
I should think about what elements make a good 10-word story. It needs to have a clear subject (Paul), some action or activity, maybe an obstacle or goal, and perhaps a resolution or consequence. Since it’s so short, each word counts, so I need to be concise but still vivid.
Let me brainstorm some ideas about Paul the cat. Maybe he’s curious, playful, adventurous, or lazy. The example has him chasing a bird and then napping. Perhaps another activity? Maybe exploring somewhere, finding something, or having an interaction with another animal or person.
I should think of a simple structure: subject (Paul), action verb, some context, maybe an obstacle or additional detail, and a resolution. For example, “Paul pounced on the mouse but it escaped.” That’s 8 words. Maybe add more details to make it 10.
Or perhaps something like “Paul sat by the window watching birds chirp.” That’s also 8 words. Maybe include movement: “Paul raced through the garden chasing a butterfly.”
Wait, the user provided an example that starts with Paul doing an action (climbed), then another action (chasing), and ends with resting (napped). So perhaps following a similar structure would work well.
Let me try: “Paul explored the attic, found a toy, and took a nap.” That’s 9 words. Maybe add more detail or action. Or “Paul climbed the tree, met another cat, then curled up to sleep.”
Wait, that’s 10 words. Hmm, but it feels a bit flat. Maybe make the interaction more engaging: “Paul chased a squirrel around the yard and finally rested.” That’s 9 words. Could add something like “in the shade” at the end for 10.
Alternatively, maybe include some emotion or consequence: “Paul pounced on the ball, rolled over, and fell asleep.”
Wait, let me check word count. Each idea needs to be exactly 10 words. Let’s see:
- Paul the cat climbed the fence, chasing a bird, then napped in the sun. (10 words)
So that’s the example given. I can try something similar but different.
Maybe focus on a different activity: “Paul discovered a sunny spot, curled up, and purred softly.” That’s 9 words. Maybe add more action or detail to make it 10.
Alternatively: “Paul leaped onto the counter, knocked over a bowl, then ate the leftovers.” That’s 10 words but maybe a bit long in description.
Another angle: “Paul wandered through the park, met a dog, and they played together.” That’s 9 words. Add something like “all day” at the end for 10.
Wait, perhaps include some obstacle or challenge: “Paul tried to catch a fish in the pond but got wet.” That’s 10 words. It includes an attempt, action, consequence.
Or maybe something more peaceful: “Paul stretched lazily on the rug, yawned, and drifted off.” 9 words. Add “to sleep” for 10.
I think I need to ensure each word adds value without being redundant. Let me try another approach:
- Paul chased a butterfly in the garden until it flew away. (That’s 8 words. Maybe add “then rested” at the end: 10 words.)
Wait, but that might be too similar to the example.
Alternatively, focus on something unique about Paul: “Paul, the curious cat, explored every nook and cranny.” That’s 9 words. Maybe add “all day long” for 12; that’s too many.
Perhaps include an interaction with another animal or a person: “Paul meowed at the door, demanding attention, then curled up.” 9 words. Add “by the fire” for 10.
Wait, perhaps combine actions and consequences more effectively. Let me think of a scenario where Paul does something that leads to a natural conclusion in just 10 words.
How about: “Paul sneaked into the kitchen, stole a treat, then hid under the bed.” That’s 9 words. Maybe add “to digest” for 10.
Or maybe something more adventurous: “Paul climbed a tall tree, got stuck, and miaowed for help.” That’s 9 words. Add “from his owner” for 11; that’s too many.
Alternatively: “Paul caught a moth in the dark, played with it, then slept soundly.” 9 words. Maybe add “through the night” for 10.
I think I’m overcomplicating it. Let me go back to the example and see if I can structure something similar but different. The example has two actions (climbed, chasing) followed by a rest. So maybe another set of actions leading to a consequence or resolution.
Perhaps: “Paul chased his tail in circles until he panted.” That’s 8 words. Maybe add “then stopped to rest” for 10.
Alternatively: “Paul jumped on the sofa, knocked over a lamp, then slept.” But that’s 9 words; maybe add “peacefully” at the end for 10.
Wait, perhaps focus on Paul’s interaction with an object or another animal. Let me think of something unique but still within the 10-word limit.
Another idea: “Paul found a cozy spot under the stairs and napped.” That’s 8 words. Maybe add “after exploring all day” for 10.
Wait, that would be “Paul found a cozy spot under the stairs after exploring all day.” That’s 9 words. Hmm.
Alternatively: “Paul purred contentedly while basking in sunlight.” 7 words. Add some activity before that to make it 10.
I think I need to stick to a structure where Paul does something, then another action or consequence, and maybe ends with a state or rest.
Let me try again: “Paul climbed the fence, chased a butterfly, then napped in shade.” That’s 9 words. Maybe change “then” to “and” for flow: “climbed, chased, and napped” but that might not add up correctly.
Wait, let’s count the example given: “Paul the cat climbed the fence, chasing a bird, then napped in the sun.” That’s 10 words. So it includes his name, his action (climbed), context (the fence), another action (chasing), and the consequence (napped in the sun). I need to mirror that structure but with different actions.
So maybe: “Paul the cat explored the garden, found a mouse, then relaxed by the pond.” That’s 10 words. It follows a similar flow—action, context, additional action, consequence.
Alternatively: “Paul the cat pounced on a toy, rolled around, and fell asleep.” 9 words. Add “on the rug” for 10.
Wait, maybe make it more dynamic: “Paul the cat raced through the park, met a friend, then rested under a tree.” That’s 10 words. It shows activity, interaction, and rest.
I think that’s a good structure—action, context, another action or interaction, consequence. So putting it all together, I can create a concise story within the word limit.
– DeepSeek-R1:32B thoughts
Here you can see the electricity usage: a solid 450W for two minutes until things calmed down. Usually the server “idles” at around 50-53W. Since a local SIEM, this website, and a lot of other stuff run on it, it is never really idle.
Both hardware and software went all in, pushing their limits… all for my short, 10-word cat story:
This does not prove that DeepSeek-R1 is a waste of electricity that produces subpar results. It shows that you probably do not need an advanced thinking model for such a super simple task.
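For a sense of scale, here is the back-of-the-envelope math for that run, assuming a constant 450W for the full two minutes and roughly 50W of idle baseline (both rounded from the numbers above):

```python
def energy_wh(power_watts: float, seconds: float) -> float:
    """Energy in watt-hours for a constant power draw over a duration."""
    return power_watts * seconds / 3600


total = energy_wh(450, 120)       # whole server during the two-minute run
extra = energy_wh(450 - 50, 120)  # portion above the ~50 W idle baseline
print(f"{total:.1f} Wh total, {extra:.1f} Wh above idle")
```

That works out to about 15 Wh for one cat story, which is tiny in absolute terms but a lot compared to a model that answers in a few seconds.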
Check out dolphin-mistral: even on a cold start it only took 4 seconds, and the second message did not even take a second:
Bonus: ComfyUI
The Official Installation Guide: How to Install ComfyUI on Linux
Alright, OpenWebUI now supports ComfyUI for image generation (still in beta at the time of writing). Here is the super quick start for ComfyUI. Run these commands in your LXC:
```
apt update && apt upgrade -y && apt install -y git python3.11-venv
mkdir ~/comfy && cd ~/comfy
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ~/comfy/ComfyUI
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
pip install -r requirements.txt
```
Make sure to double-check the versions and links against your system and the official guide I linked at the top.
Once everything is set up, give it a test run to confirm it’s working as expected:
```
python main.py
```
If your output confirms that everything is running smoothly, go ahead and turn it into a system service:
```
nano /etc/systemd/system/comfyui.service
```
Paste the following into the service file:
```
[Unit]
Description=ComfyUI Service
After=network.target

[Service]
User=root
Group=root
WorkingDirectory=/root/comfy/ComfyUI
ExecStart=/root/comfy/ComfyUI/venv/bin/python /root/comfy/ComfyUI/main.py --listen 0.0.0.0
Restart=always

[Install]
WantedBy=multi-user.target
```
Now reload and start the comfyui.service:
```
sudo systemctl daemon-reload
sudo systemctl enable comfyui.service
sudo systemctl start comfyui.service
sudo systemctl status comfyui.service
```
ComfyUI Manager
ComfyUI Manager is the ultimate quality-of-life add-on for ComfyUI, making model and plugin installation a breeze. Just get it—you’ll thank me later:
```
cd ~/comfy/ComfyUI/custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager comfyui-manager
sudo systemctl restart comfyui.service
```
Restart takes a little while since it downloads some dependencies, but no worries—give it a minute or two, and everything should run smoothly.
Conclusion
You should now have a working Debian LXC for local AI tasks, equipped with:
- Ollama
- OpenWebUI
- ComfyUI (optional)
Setting this up, especially with newer AMD GPUs, used to be a bit of a headache. I’m honestly glad I waited a bit—getting this to work wasn’t exactly painless, and I had to do a lot of digging through forums. But hey, it’s running now!
Next up, get some models from the Ollama page. If you have an AMD RX 7900 XTX, you should be able to run the 32B version of DeepSeek-R1 effortlessly. Technically, you can load models larger than your VRAM, but be warned—it’ll slow things down.
Also, don’t forget to secure your AI server and add valid SSL certificates. Check out my post about it:
How to Get Real Trusted SSL Certificates with ACME-DNS in Nginx Proxy Manager
I set up firewall rules centrally on my Dream Machine, so my AI server can only communicate with the reverse proxy.
If your setup is different, you might want to handle this locally using UFW.
These Goodbye Messages Are Brought to You by AI
Aww, yaaaay! I totally loooove all your amazing readers <3 Wishing you guys the biiiggest luck with everything you dooove, okay? Dayyyyyybeeee~ 💕✨
– qwen2.5:32b
Love y’all, keep slaying in everything you do <3 Can’t wait to see what awesome things you have coming up. Bye for now! 👋🏼😉
– dolphin-mistral
Goodbye image I made with Flux Schnell and ComfyUI
EDIT 1 (04.02.2025)
I have downloaded and tested almost all popular models now and the only actually usable one for daily business like rewriting German emails or asking for expertise in German is qwen2.5 so far.
The uncensored Dolphin models are a lot of fun, but also kind of stink with German, which is of course because their underlying models aren’t good at German either.
2025-02-03

Squidward: Continuous Observation and Monitoring

The name Squidward comes from TAD → Threat Modelling, Attack Surface and Data. “Tadl” is the German nickname for Squidward from SpongeBob, so I figured—since it’s kind of a data kraken—why not use that name?

It’s a continuous observation and monitoring script that notifies you about changes in your internet-facing infrastructure. Think Shodan Monitor, but self-hosted.

Technology Stack

- certspotter: Keeps an eye on targets for new certificates and sneaky subdomains.
- Discord: The command center for controlling the bot, adding targets, and getting real-time alerts.
- dnsx: Grabs DNS records.
- subfinder: The initial scout, hunting down subdomains.
- rustscan: Blazing-fast port scanner for newly found endpoints.
- httpx: Checks ports for web UI and detects underlying technologies.
- nuclei: Runs a quick vulnerability scan to spot weak spots.
- anew: Really handy deduplication tool.

At this point, I gotta give a massive shoutout to ProjectDiscovery for open-sourcing some of the best recon tools out there—completely free! Seriously, a huge chunk of my projects rely on these tools. Go check them out, contribute, and support them. They deserve it!

(Not getting paid to say this—just genuinely impressed.)

How it works

I had to rewrite certspotter a little bit to accommodate a different input and output scheme; the rest is fairly simple.

Setting Up Directories

The script ensures required directories exist before running:

$HOME/squidward/data for storing results.
Subdirectories for logs: onlynew, allfound, alldedupe, backlog.
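
To make the three log folders less abstract, here is a rough Python sketch of what the `tee -a … | anew … | tee …` chain in the script does for each tool. It is an illustration only, not part of the original script: the function name `record` and the file names in the usage note are made up.

```python
from pathlib import Path
import tempfile


def record(lines, all_found: Path, all_dedupe: Path, only_new: Path):
    """Mimic `tee -a all_found | anew all_dedupe | tee only_new`."""
    lines = list(lines)

    # allfound: append everything, duplicates included
    with all_found.open("a") as f:
        f.writelines(line + "\n" for line in lines)

    # alldedupe: append only lines that were never seen before (the anew step)
    seen = set(all_dedupe.read_text().splitlines()) if all_dedupe.exists() else set()
    fresh = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            fresh.append(line)
    with all_dedupe.open("a") as f:
        f.writelines(line + "\n" for line in fresh)

    # onlynew: overwritten on every run with just the fresh lines
    only_new.write_text("".join(line + "\n" for line in fresh))
    return fresh
```

Calling `record(["a.example", "b.example"], ...)` twice with overlapping input returns only the entries that were not in the dedupe file yet, which is what ends up in the Discord notification.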

Running Subdomain Enumeration

squidward (certspotter) fetches SSL certificates to discover new subdomains.
subfinder further identifies subdomains from multiple sources.
Results are stored in logs and sent as notifications (to a Discord webhook).

DNS Resolution

dnsx takes the discovered subdomains and resolves:

A/AAAA (IPv4/IPv6 records)
CNAME (Canonical names)
NS (Name servers)
TXT, PTR, MX, SOA records

HTTP Probing

httpx analyzes the discovered subdomains by sending HTTP requests, extracting:

Status codes, content lengths, content types.
Hash values (SHA256).
Headers like server, title, location, etc.
Probing for WebSocket, CDN, and methods.

Vulnerability Scanning

nuclei scans for known vulnerabilities on discovered targets.
The scan focuses on high, critical, and unknown severity issues.

Port Scanning

rustscan finds open ports for each discovered subdomain.
If open ports exist, additional HTTP probing and vulnerability scanning are performed.
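
The hand-off from port scan to HTTP probing is mostly string juggling. The sketch below shows the idea in Python. It assumes rustscan's greppable output looks like `ip -> [port,port]`, which is inferred from the bracket-matching grep in the script rather than taken from the rustscan docs; the function names are hypothetical.

```python
import re

# Assumed greppable format: "203.0.113.7 -> [22,80,443]"
GREPPABLE = re.compile(r"^(?P<ip>\S+)\s*->\s*\[(?P<ports>[\d,\s]*)\]")


def parse_rustscan_line(line: str):
    """Return (ip, [ports]) for one greppable line, or None if it does not match."""
    m = GREPPABLE.match(line.strip())
    if not m:
        return None
    ports = [int(p) for p in m.group("ports").split(",") if p.strip()]
    return m.group("ip"), ports


def httpx_targets(dns_name: str, ports):
    """Build the host:port strings that get piped into httpx."""
    return [f"{dns_name}:{port}" for port in ports]
```

Each `host:port` string is then probed, and only the responsive ones go on to the vulnerability scan.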

Automation and Notifications

Discord notifications are sent after each stage.
The script prevents multiple simultaneous runs by checking if another instance is active (ps -ef | grep "squiddy.sh").
Randomization (shuf) is used to shuffle the scan order.
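
Counting processes with ps and grep works, but it depends on the script name and on how many helper processes happen to match. A common alternative is an exclusive lock file. The following is a minimal Python sketch of that idea, not what the original script does; it is Unix-only (`fcntl`), and the lock path is whatever you pass in.

```python
import fcntl
import os
import tempfile


def acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the lock
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # keep this object alive for as long as the run lasts
```

If `acquire_lock(...)` returns `None`, another run is still active and the new one can bail out. The lock disappears automatically when the process exits, so there is no stale lock file to clean up.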

Main Execution

If another squiddy.sh instance is running, the script posts a notice to Discord and skips the run instead of starting.

If no duplicate instance exists:
Squidward (certspotter) runs first.
The main scanning pipeline (what_i_want_what_i_really_really_want()) executes in a structured sequence:

The Code

I wrote this about six years ago and just laid eyes on it again for the first time. I have absolutely no clue what past me was thinking 😂, but hey—here you go:

#!/bin/bash

#############################################
#
# Single script usage:
# echo "test.karl.fail" | ./httpx -sc -cl -ct -location -hash sha256 -rt -lc -wc -title -server -td -method -websocket -ip -cname -cdn -probe -x GET -silent
# echo "test.karl.fail" | ./dnsx -a -aaaa -cname -ns -txt -ptr -mx -soa -resp -silent
# echo "test.karl.fail" | ./subfinder -silent
# echo "test.karl.fail" | ./nuclei -ni
#
#
#
#
#############################################

# -----> globals <-----
workdir="squidward"
script_path=$HOME/$workdir
data_path=$HOME/$workdir/data

only_new=$data_path/onlynew
all_found=$data_path/allfound
all_dedupe=$data_path/alldedupe
backlog=$data_path/backlog
# -----------------------

# -----> dir-setup <-----
setup() {
    # mkdir -p creates missing parent directories and is a no-op for existing ones,
    # so the creation order no longer matters
    mkdir -p "$script_path" "$data_path" "$backlog" "$only_new" "$all_found" "$all_dedupe"
}
# -----------------------

# -----> subfinder <-----
write_subfinder_log() {
    tee -a $all_found/subfinder.txt | $script_path/anew $all_dedupe/subfinder.txt | tee $only_new/subfinder.txt
}
run_subfinder() {
    $script_path/subfinder -dL $only_new/certspotter.txt -silent | write_subfinder_log;
    $script_path/notify -data $only_new/subfinder.txt -bulk -provider discord -id crawl -silent
    sleep 5
}
# -----------------------

# -----> dnsx <-----
write_dnsx_log() {
    tee -a $all_found/dnsx.txt | $script_path/anew $all_dedupe/dnsx.txt | tee $only_new/dnsx.txt
}
run_dnsx() {
    $script_path/dnsx -l $only_new/subfinder.txt -a -aaaa -cname -ns -txt -ptr -mx -soa -resp -silent | write_dnsx_log;
    $script_path/notify -data $only_new/dnsx.txt -bulk -provider discord -id crawl -silent
    sleep 5
}
# -----------------------

# -----> httpx <-----
write_httpx_log() {
    tee -a $all_found/httpx.txt | $script_path/anew $all_dedupe/httpx.txt | tee $only_new/httpx.txt
}
run_httpx() {
    $script_path/httpx -l $only_new/subfinder.txt -sc -cl -ct -location -hash sha256 -rt -lc -wc -title \
    -server -td -method -websocket -ip -cname -cdn -probe -x GET -silent | write_httpx_log;
    $script_path/notify -data $only_new/httpx.txt -bulk -provider discord -id crawl -silent
    sleep 5
}
# -----------------------

# -----> nuclei <-----
write_nuclei_log() {
    tee -a $all_found/nuclei.txt | $script_path/anew $all_dedupe/nuclei.txt | tee $only_new/nuclei.txt
}
run_nuclei() {
    $script_path/nuclei -ni -l $only_new/httpx.txt -s high,critical,unknown -rl 5 -silent \
    | write_nuclei_log | $script_path/notify -provider discord -id vuln -silent
}
# -----------------------

# -----> squidward <-----
write_squidward_log() {
    tee -a $all_found/certspotter.txt | $script_path/anew $all_dedupe/certspotter.txt | tee -a $only_new/forscans.txt
}
run_squidward() {
    rm -f $script_path/config/certspotter/lock
    $script_path/squidward | write_squidward_log | $script_path/notify -provider discord -id cert -silent
    sleep 3
}
# -----------------------

send_certspotted() {
    $script_path/notify -data $only_new/certspotter.txt -bulk -provider discord -id crawl -silent
    sleep 5
}

send_starting() {
    echo "Hi! I am Squiddy!" | $script_path/notify  -provider discord -id crawl -silent
    echo "I am gonna start searching for new targets now :)" | $script_path/notify  -provider discord -id crawl -silent
}

dns_to_ip() {
    # TODO: give txt file of subdomains to get IPs from file 
    $script_path/dnsx -a -l $1 -resp -silent \
    | grep -oE "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b" \
    | sort --unique 
}

run_rustcan() {
    local input=""

    if [[ -p /dev/stdin ]]; then
        input="$(cat -)"
    else
        input="${@}"
    fi

    if [[ -z "${input}" ]]; then
        return 1
    fi

    # ${input// /,} -> join space to comma (double slash replaces every space, a single slash only the first)
    # -> loop because otherwise rustscan will take forever to scan all IPs and only save results at the end
    # we could do this to scan all at once instead: $script_path/rustscan -b 100 -g --scan-order random -a ${input// /,}
    for ip in ${input}
    do
        $script_path/rustscan -b 500 -g --scan-order random -a $ip
    done

}

write_rustscan_log() {
    tee -a $all_found/rustscan.txt | $script_path/anew $all_dedupe/rustscan.txt | tee $only_new/rustscan.txt
}
what_i_want_what_i_really_really_want() {
    # shuffle certspotter file cause why not
    cat $only_new/forscans.txt | shuf -o $only_new/forscans.txt 

    $script_path/subfinder -silent -dL $only_new/forscans.txt | write_subfinder_log
    $script_path/notify -silent -data $only_new/subfinder.txt -bulk -provider discord -id subfinder

    # -> empty forscans.txt
    > $only_new/forscans.txt

    # shuffle subfinder file cause why not
    cat $only_new/subfinder.txt | shuf -o $only_new/subfinder.txt

    $script_path/dnsx -l $only_new/subfinder.txt -silent -a -aaaa -cname -ns -txt -ptr -mx -soa -resp | write_dnsx_log
    $script_path/notify -data $only_new/dnsx.txt -bulk -provider discord -id dnsx -silent
    
    # shuffle dns file before iter to randomize scans a little bit
    cat $only_new/dnsx.txt | shuf -o $only_new/dnsx.txt
    sleep 1
    cat $only_new/dnsx.txt | shuf -o $only_new/dnsx.txt

    while IFS= read -r line
    do
        # quote the line so brackets and wildcards in DNS answers are not glob-expanded
        dns_name=$(echo "$line" | cut -d ' ' -f1)
        ip=$(echo "${line}" \
        | grep -E "\[(\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)\]" \
        | grep -oE "(\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)")
        match=$(echo $ip | run_rustcan)

        if [ ! -z "$match" ]
        then
            ports_unformat=$(echo ${match} | grep -Po '\[\K[^]]*')
            ports=${ports_unformat//,/ }

            echo "$dns_name - $ip - $ports" | write_rustscan_log
            $script_path/notify -silent -data $only_new/rustscan.txt -bulk -provider discord -id portscan
        
            for port in ${ports}
            do
                echo "$dns_name:$port" | $script_path/httpx -silent -sc -cl -ct -location \
                -hash sha256 -rt -lc -wc -title -server -td -method -websocket \
                -ip -cname -cdn -probe -x GET | write_httpx_log | grep "\[SUCCESS\]" | cut -d ' ' -f1 \
                | $script_path/nuclei -silent -ni -s high,critical,unknown -rl 10 \
                | write_nuclei_log | $script_path/notify -provider discord -id nuclei -silent

                $script_path/notify -silent -data $only_new/httpx.txt -bulk -provider discord -id httpx
            done
        fi 
    done < "$only_new/dnsx.txt"
}

main() {
    dupe_script=$(ps -ef | grep "squiddy.sh" | grep -v grep | wc -l | xargs)

    if [ ${dupe_script} -gt 2 ]; then
        echo "Hey friends! Squiddy is already running, I am gonna try again later." | $script_path/notify  -provider discord -id crawl -silent
    else 
        send_starting

        echo "Running Squidward"
        run_squidward

        echo "Running the entire rest"
        what_i_want_what_i_really_really_want

        # -> leaving it in for now but replace with above function
        #echo "Running Subfinder"
        #run_subfinder

        #echo "Running DNSX"
        #run_dnsx

        #echo "Running HTTPX"
        #run_httpx

        #echo "Running Nuclei"
        #run_nuclei
    fi
}

setup

dupe_script=$(ps -ef | grep "squiddy.sh" | grep -v grep | wc -l | xargs)
if [ ${dupe_script} -gt 2 ]; then
    echo "Hey friends! Squiddy is already running, I am gonna try again later." | $script_path/notify  -provider discord -id crawl -silent
else 
    #send_starting
    echo "Running Squidward"
    run_squidward
fi

There’s also a Python-based Discord bot that goes with this, but I’ll spare you that code—it did work back in the day 😬.

Conclusion

Back when I was a Red Teamer, this setup was a game-changer—not just during engagements, but even before them. Sometimes, during client sales calls, they’d expect you to be some kind of all-knowing security wizard who already understands their infrastructure better than they do.

So, I’d sit in these calls, quietly feeding their possible targets into Squidward and within seconds, I’d have real-time recon data. Then, I’d casually drop something like, “Well, how about I start with server XYZ? I can already see it’s vulnerable to CVE-Blah.” Most customers loved that level of preparedness.

I haven’t touched this setup in ages, and honestly, I have no clue how I’d even get it running again. I would probably go about it using Node-RED like in this post.

These days, I work for big corporate, using commercial tools for the same tasks. But writing about this definitely brought back some good memories.

Anyway, time for bed! It’s late, and you’ve got work tomorrow. Sweet dreams! 🥰😴

Have another scary squid man monster that didn’t make featured, buh-byeee 👋

2025-01-31

From Typos to Treason: The Dangerous Fun of Government Domain Squatting
Hey there 👋 Since you’re reading this, chances are you’ve got some chaos brewing in your brain—I love it.
For legal reasons I must kindly ask you to read and actually understand my disclaimer.
Disclaimer:
The information provided on this blog is for educational purposes only. The use of hacking tools discussed here is at your own risk.
For the full disclaimer, please click here.
Full full disclosure: I did have written permission to do this. And anything I didn’t have written permission for is wildly exaggerated fiction—pure imagination, no receipts, no logs, nothing but brain static.
Now, another fair warning: this post is about to get particularly hairy. So seriously, do not try this without proper written consent, unless you have an unshakable desire to land yourself in a world of trouble.
Intro
I get bored really easily 😪. And when boredom strikes, I usually start a new project. Honestly, the fact that I’m still sticking with this blog is nothing short of a miracle. Could this be my forever project? Who knows—place your bets.
Anyway, purely by accident, I stumbled across a tool that I immediately recognized as easy mode for typo squatting and bit squatting. The tool itself was kinda trash, but it did spark a deliciously questionable thought in my brain:
“Can I intercept sensitive emails from government organizations and snatch session tokens and API keys?”
To keep you on the edge of your seat (and slightly concerned), the answer is: Yes. Yes, I can. And trust me, it’s way worse than you think.
It’s always the stupidly simple ideas that end up working the best.
Typosquatting
Typosquatting, also called URL hijacking, a sting site, a cousin domain, or a fake URL, is a form of cybersquatting, and possibly brandjacking which relies on mistakes such as typos made by Internet users when inputting a website address into a web browser. A user accidentally entering an incorrect website address may be led to any URL, including an alternative website owned by a cybersquatter.
— Wikipedia
Basically, you register kark.fail, kick back, and wait for people to fat-finger karl.fail — and trust me, they will. Congratulations, you just hijacked some of my traffic without lifting a finger. It’s like phishing, but lazier.
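If you want a feel for how many plausible fat-finger variants a single name has, here is a small Python sketch. It is only an illustration: the QWERTY adjacency map is a simplified assumption of mine (letters only, no number row), the function name is made up, and it mutates just the first label of the domain.

```python
# Simplified QWERTY neighbours (assumption: letters only, horizontal and diagonal keys)
QWERTY_NEIGHBOURS = {
    "q": "wa", "w": "qes", "e": "wrd", "r": "etf", "t": "ryg", "y": "tuh",
    "u": "yij", "i": "uok", "o": "ipl", "p": "ol",
    "a": "qsz", "s": "awdx", "d": "sefc", "f": "drgv", "g": "fthb",
    "h": "gyjn", "j": "hukm", "k": "jil", "l": "kop",
    "z": "asx", "x": "zsdc", "c": "xdfv", "v": "cfgb", "b": "vghn",
    "n": "bhjm", "m": "njk",
}


def typo_candidates(domain: str) -> set:
    """Generate simple typo variants of the first label of a domain."""
    name, _, tld = domain.partition(".")
    out = set()
    for i, ch in enumerate(name):
        for near in QWERTY_NEIGHBOURS.get(ch, ""):  # adjacent-key substitution
            out.add(name[:i] + near + name[i + 1:])
        out.add(name[:i] + name[i + 1:])            # dropped character
        out.add(name[:i] + ch + name[i:])           # doubled character
    for i in range(len(name) - 1):                  # two neighbouring characters swapped
        out.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    out.discard(name)  # the original is not a typo
    out.discard("")    # happens for single-character names
    return {f"{candidate}.{tld}" for candidate in out}
```

For `karl.fail` the result includes `kark.fail` from the example above, plus swaps like `kalr.fail` and omissions like `kar.fail`.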
Bitsquatting
Bitsquatting is a form of cybersquatting which relies on bit-flip errors that occur during the process of making a DNS request. These bit-flips may occur due to factors such as faulty hardware or cosmic rays. When such an error occurs, the user requesting the domain may be directed to a website registered under a domain name similar to a legitimate domain, except with one bit flipped in their respective binary representations.
— Wikipedia
You register a domain that is a single bit off your target. For my site, you could register “oarl.fail”:
- ASCII of “k” = 01101011
- Flipping the third-to-last bit:
- 01101111 → This corresponds to “o”
- This changes “karl” → “oarl”
Personally, I have had zero success with this, but apparently it still works.
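The bit arithmetic is easy to check yourself. Below is a minimal Python sketch that flips every bit of every character and keeps only results that are still valid hostname characters. The function name is hypothetical, and the filter (lowercase letters, digits, hyphen, no leading or trailing hyphen) is a simplification of the real hostname rules.

```python
import string

# Characters allowed in a hostname label (simplified)
VALID = set(string.ascii_lowercase + string.digits + "-")


def bitflips(label: str):
    """Return all labels that differ from `label` by exactly one flipped bit."""
    out = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))  # flip a single bit of this character
            if flipped not in VALID:
                continue
            candidate = label[:i] + flipped + label[i + 1:]
            if candidate.startswith("-") or candidate.endswith("-"):
                continue
            out.add(candidate)
    return sorted(out)
```

`bitflips("karl")` contains `oarl`, matching the manual calculation above: `k` is `01101011`, and XOR with `00000100` gives `01101111`, which is `o`.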
The Setup
Now that you know the basics, you’re officially armed with enough knowledge to cause some mild chaos 🎉.
Here’s what we need to get started:
- Money – Because sadly, domains don’t buy themselves.
- A domain registrar account – I use Namecheap
- Cloudflare account (optional, but highly recommended)
- A server connected to the internet – I use Hetzner (optional but also recommended)
Getting a Domain
You should probably know this if you’re planning to hack the government (or, you know, just theoretically explore some questionable cyberspace).
Step one:
Follow all the steps on Namecheap—or whichever registrar you fancy. You can probably find one that takes Bitcoin or Monero, if you want.
For generating typo domains effortlessly, I use ChatGPT:
```
Give me the top 5 most common typos english speaking people make for the domain "karl.fail" on a qwerty keyboard.
```
ChatGPT does not know .fail is a valid TLD, but you get the point.
Step two:
Add your domain to Cloudflare—unless, of course, you’re feeling extra ambitious and want to host your own Mailserver and Nameserver. But let’s be real, why suffer?
Edit the “Nameservers” setting on Namecheap
Mailserver
I highly recommend Mailcow, though it might be complete overkill for this—unless your job involves hacking governments. In that case, totally worth it.
Nameserver
This is the best tutorial I could find for you—he’s using CoreDNS.
In my tests, I used Certainly, which built a small authoritative DNS server with this Go library.
The big perk of running your own nameserver is that you get to log every DNS query to your domain. As many pentesters know, DNS is passive recon—it doesn’t hit the target directly. That’s why you can get away with otherwise noisy tasks, like brute-forcing subdomains via DNS. But if your target runs their own nameserver, they’ll see you poking around.
I went with a different setup because DNS logs are a mess—super noisy and, honestly, boring. Everyone and their mom ends up enumerating your domain until kingdom come.
Beware! Different top-level domain organizations have different expectations for name servers. I ran into some trouble with the .de registry, DENIC—they insisted I set up two separate nameservers on two different IPs in two different networks. Oh, and they also wanted pretty SOA records before they’d even consider my .de domains.
Save yourself the headache—double-check the requirements before you spend hours wrecking yourself.
Hetzner Server
Any server, anywhere, will do—the goal is to host a web server of your choice and capture all the weblogs. I’ll be using Debian and Caddy for this.
The cheapest server on Hetzner
We’ll be building our own Caddy with the Cloudflare plugin because I couldn’t get wildcard certificates to work without it. Plus, I always use Cloudflare (❤️ you guys).
Installation of Go (current guide):
```
sudo apt update && sudo apt upgrade -y
wget https://go.dev/dl/go1.23.5.linux-amd64.tar.gz
sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.23.5.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.profile
source ~/.profile
```
Build Caddy with Cloudflare-DNS
The official guide is here.
```
go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest
sudo mv ~/go/bin/xcaddy /usr/local/bin/
xcaddy build --with github.com/caddy-dns/cloudflare
sudo mv caddy /usr/local/bin/
caddy version
```
Getting a Cloudflare API Key
To get the API key, just follow the Cloudflare docs. I set mine with these permissions:
```
All zones - Zone:Read, SSL and Certificates:Edit, DNS:Edit
```
Here is also the official page for the Cloudflare-DNS Plugin.
```
export CLOUDFLARE_API_TOKEN="your_cloudflare_api_token"
echo 'CLOUDFLARE_API_TOKEN="your_cloudflare_api_token"' | sudo tee /etc/default/caddy > /dev/null
```
Caddyfile
I am using example domains!
```
(log_requests) {
	log {
		output file /var/log/caddy/access.log
		format json
	}
}

karlkarlkarl.de, *.karlkarlkarl.de {
	import log_requests

	tls {
		dns cloudflare {env.CLOUDFLARE_API_TOKEN}
	}

	header Content-Type "text/html"
	respond "Wrong!" 200
}

karlkarl.de, *.karlkarl.de {
	import log_requests

	tls {
		dns cloudflare {env.CLOUDFLARE_API_TOKEN}
	}

	header Content-Type "text/html"
	respond "Wrong!" 200
}
```
Running Caddy as a service
```
nano /etc/systemd/system/caddy.service
```
```
[Unit]
Description=Caddy Web Server
After=network.target

[Service]
User=caddy
Group=caddy
ExecStart=/usr/local/bin/caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
EnvironmentFile=/etc/default/caddy
AmbientCapabilities=CAP_NET_BIND_SERVICE
Restart=always
RestartSec=5s
LimitNOFILE=1048576

[Install]
WantedBy=multi-user.target
```
```
systemctl start caddy
systemctl enable caddy
systemctl status caddy
```
Everything should work if you closely followed the steps up until now. If not, check the caddy.service and Caddyfile. To check logs use:
```
journalctl -u caddy --no-pager -n 50 -f
```
Just a heads-up—Caddy automatically redacts credentials in its logs, and getting it to not do that is kind of a pain.
```
{"level":"info","ts":1738162687.1416154,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_ip":"1.0.0.1","remote_port":"62128","client_ip":"1.0.0.1","proto":"HTTP/1.1","method":"GET","host":"api.karlkarlkarl.de","uri":"/api/resource","headers":{"User-Agent":["curl/8.7.1"],"Authorization":["REDACTED"],"Accept":["application/json"]}},"bytes_read":0,"user_id":"","duration":0.000052096,"size":0,"status":308,"resp_headers":{"Connection":["close"],"Location":["https://api.karlkarlkarl.de/login"],"Content-Type":[],"Server":["Caddy"]}}
```
```
"Authorization":["REDACTED"]
```
Lame for us 😒. If you want more control over logging, you can use any other server or even build your own. One day I might add this as a feature to my Node-RED-Team stack, including automatic Cloudflare settings via API, just add domain and go.
As I mentioned earlier, I had permission for this, and my scope didn’t allow me to grab actual credentials since they belonged to third parties using the service.
The most interesting things in these logs:
- Credentials
- IP addresses
- Paths
- Subdomains
- Cookies and tokens
That should be more than enough to hijack a session and dig up even more data—or at the very least, get some freebies.
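Scrolling through raw JSON gets old fast, so a tiny summarizer helps. This is a sketch in Python that only relies on the fields visible in the log line above (`request.host`, `request.method`, `request.uri`); the function name is made up and everything else about the log schema is left alone.

```python
import json
from collections import Counter


def summarize(log_lines):
    """Count requested hosts and (method, uri) pairs in Caddy JSON access log lines."""
    hosts, requests = Counter(), Counter()
    for raw in log_lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON line
        if not isinstance(entry, dict):
            continue
        req = entry.get("request")
        if not isinstance(req, dict):
            continue  # not an access log entry
        hosts[req.get("host", "")] += 1
        requests[(req.get("method", ""), req.get("uri", ""))] += 1
    return hosts, requests
```

Usage would look roughly like `hosts, requests = summarize(open("/var/log/caddy/access.log"))`, followed by `hosts.most_common(10)` to see which subdomains people mistype the most.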
Cloudflare – DNS & Mail
DNS
We’ll add some wildcard DNS records so that all subdomains get routed to our server—because let’s be real, we don’t know all the subdomains of our target.
Example of wildcard DNS: it is best to set both a normal A record and a wildcard A record, each pointing to your server’s IP.
It’s almost as good as having your own nameserver. Plus, Cloudflare gives you a ton of DNS logs. Sure, you won’t get all of them like you would with your own setup, but honestly… I don’t really care that much about DNS logs anyway.
SSL/TLS Settings in Cloudflare
Make sure your SSL/TLS setting in Cloudflare is “Full (strict)”. Otherwise Caddy and Cloudflare will get stuck in a redirect loop, and it is gonna take you forever to figure out that this is the issue, which will annoy you quite a bit.
Email
Set up email routing through Cloudflare—it’s easy, just two clicks. Then, you’ll need a catch-all email rule and a destination address.
This will forward all emails sent to the typo domain straight to your chosen domain.
Catch-All Email rule in Cloudflare Email Settings
You could set up your own mail server to do the same thing, which gives you more control over how emails are handled. But for my POC, I didn’t need the extra hassle.
I should mention that I set up an email flow to notify people that they sent their mail to the wrong address and that it was not delivered using n8n:
This post is already getting pretty long, so I might do a separate one about n8n another time. For now, just know that people were notified when they sent mail to the wrong address, and their important messages were delivered into the void.
Profit
By “profit,” I’m, of course, making a joke about the classic Step 1 → Step 2 → Step 3 → Profit meme—not actual profit. That would be illegal under American law, so let’s keep things legal and fun. Just thought I’d clarify 🫡.
Now, you wait. Check the logs now and then, peek at the emails occasionally. Like a fisherman (or fisherwoman), you sit back and see what bites.
How long does it take? Well, that depends on how good your typo is and how popular your target is—could be minutes, could be days.
For me, I was getting around 10-15 emails per day. The weblogs are mostly just people scanning the crap out of my server.
Email stats of the first 2 days for one of the domains (I hold 14)
Conclusion
I bought 14 domains with the most common typos for my target and ended up catching around 400 emails in a month, containing some of the most devastating info you could imagine.
I’m talking government documents, filled-out contracts, filed reports. I got people’s birth certificates, death certificates, addresses, signatures—you name it.
Think about it—when you email a government office, they already know everything about you, so you don’t think twice about sending them paperwork, right? Well… better triple-check that email address before you hit send, or guess what? It’s mine now.
As for weblogs, their real value comes in when a developer is testing a tool and mistypes a public domain. I didn’t manage to snag any API keys, but I guarantee that if your target has public APIs or a sprawling IT infrastructure, credentials will slip through eventually.
Defense
The only real defense is to buy all the typo domains before the bad guys do. There are services that specialize in this—if you’ve got the budget, use them.
If you can’t buy them, monitor them. Plenty of commercial tools can do this, or you can build your own. The easiest DIY approach would be to use dnstwist to generate typo variations and check WHOIS records or dig to see if anyone has registered them.
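As a starting point for the DIY route, here is a minimal Python sketch that takes a list of lookalike domains (for example, generated with dnstwist) and reports which ones currently resolve. Treat it as a rough first filter under stated assumptions: the function names are hypothetical, and a domain can be registered without having any address records, so a WHOIS check is still needed for the rest.

```python
import socket


def resolves(domain: str) -> bool:
    """True if the system resolver returns at least one address for the domain."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False


def registered_lookalikes(candidates, resolver=resolves):
    """Return the candidates that resolve, i.e. are very likely registered by someone."""
    return sorted(domain for domain in candidates if resolver(domain))
```

Run it from cron, diff the result against yesterday's list, and alert on anything new. The `resolver` parameter is only there so the lookup can be swapped out, for example in tests.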
Typo domains aren’t just used for passive logging; people also host malicious content and phishing campaigns on them. That said, those methods get caught pretty fast. The approach I showed you is much quieter and, in my opinion, more dangerous. It doesn’t set off alarms right away.
Also, don’t bother scanning for typo domains with MX records—most registrars have catch-all rules, so that’s a dead end.
Domains are dirt cheap compared to the damage I could do if I decided to leak this to the press, extort people, or trick them into giving me money. You instantly gain trust because the emails you receive usually say things like “As we just discussed over the phone…” or contain entire ongoing conversations.
This whole setup takes about an hour and costs maybe 50 bucks for some domains.
Anyway, thanks for reading. Good night, sleep tight, and don’t let the bed bugs bite.
Love you 😘
2025-01-31

The Privacy-Friendly Mail Parser You’ve Been Waiting For

As you may or may not know (but now totally do), I have another beloved website, Exploit.to. It’s where I let my inner coder run wild and build all sorts of web-only tools. I’ll save those goodies for another project post, but today, we’re talking about my Mail Parser—a little labor of love born from frustration and an overdose of caffeine.

See, as a Security Analyst and incident responder, emails are my bread and butter. Or maybe my curse. Parsing email headers manually? It’s a one-way ticket to losing your sanity. And if you’ve ever dealt with email headers, you know they’re basically the Wild West—nobody follows the rules, everyone’s just slapping on whatever they feel like, and chaos reigns supreme.

The real kicker? Every single EML parser out there at the time was server-side. Let me paint you a picture: you, in good faith, upload that super-sensitive email from your mom (the one where she tells you your laundry’s done and ready for pick-up) to some rando’s sketchy server. Who knows what they’re doing with your mom’s loving words? Selling them? Training an AI to perfect the art of passive-aggressive reminders? The horror!

So, I thought, “Hey, wouldn’t it be nice if we had a front-end-only EML parser? One that doesn’t send your personal business to anyone else’s server?” Easy peasy, right? Wrong. Oh, how wrong I was. But I did it anyway.

You can find the Mail Parser here and finally parse those rogue headers in peace. You’re welcome.

Technologies

React: Handles the user interface and dynamic interactions.
Astro.js: Used to generate the static website efficiently. (technically not needed for this project)
TailwindCSS: For modern and responsive design.
ProtonMail’s jsmimeparser: The core library for parsing email headers.

When I first approached this project, I tried handling email header parsing manually with regular expressions. It didn’t take long to realize how complex email headers have become, with an almost infinite variety of formats, edge cases, and inconsistencies. Regex simply wasn’t cutting it.

That’s when I discovered ProtonMail’s jsmimeparser, a library purpose-built for handling email parsing. It saved me from drowning in parsing logic and ensured the project met its functional goals.

Sharing the output of this tool without accidentally spilling personal info all over the place is kinda tricky. But hey, I gave it a shot with a simple empty email I sent to myself:

The Code

As tradition dictates, the code isn’t on GitHub but shared right here in a blog post 😁.

Kidding (sort of). The repo is private, but no gatekeeping here—here’s the code:

mailparse.tsx

import React, { useState } from "react";
import { parseMail } from "@protontech/jsmimeparser";

type Headers = {
  [key: string]: string[];
};

const MailParse: React.FC = () => {
  const [headerData, setHeaderData] = useState<Headers>({});
  const [ioc, setIoc] = useState<any>({});

  function extractEntitiesFromEml(emlContent: string) {
    const ipRegex =
      /\b(?:\d{1,3}\.){3}\d{1,3}\b|\b(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}\b/g;
    const emailRegex = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g;
    const urlRegex = /(?:https?|ftp):\/\/[^\s/$.?#].[^\s]*\b/g;
    const htmlTagsRegex = /<[^>]*>/g; // Regex to match HTML tags

    // Match IPs, emails, and URLs
    const ips = Array.from(new Set(emlContent.match(ipRegex) || []));
    const emails = Array.from(new Set(emlContent.match(emailRegex) || []));
    const urls = Array.from(new Set(emlContent.match(urlRegex) || []));

    // Remove HTML tags from emails and URLs
    const cleanEmails = emails.map((email) => email.replace(htmlTagsRegex, ""));
    const cleanUrls = urls.map((url) => url.replace(htmlTagsRegex, ""));

    return {
      ips,
      emails: cleanEmails,
      urls: cleanUrls,
    };
  }

  function parseDKIMSignature(signature: string): Record<string, string> {
    const parsedSignature: Record<string, string> = {};

    for (const part of signature.split(";")) {
      // Split on the first "=" only: base64 values (b=, bh=) can contain "="
      const separator = part.indexOf("=");
      if (separator === -1) continue; // skip empty parts, e.g. after a trailing ";"
      parsedSignature[part.slice(0, separator).trim()] = part
        .slice(separator + 1)
        .trim();
    }

    return parsedSignature;
  }

  const handleFileChange = async (
    event: React.ChangeEvent<HTMLInputElement>
  ) => {
    const file = event.target.files?.[0];
    if (!file) return;

    const reader = new FileReader();
    reader.onload = async (e) => {
      const buffer = e.target?.result as ArrayBuffer;

      // Convert the buffer to a binary string one byte at a time;
      // String.fromCharCode.apply() can exceed the argument limit on large files
      const bufferString = new Uint8Array(buffer).reduce(
        (text, byte) => text + String.fromCharCode(byte),
        ""
      );

      const { attachments, body, subject, from, to, date, headers, ...rest } =
        parseMail(bufferString);

      setIoc(extractEntitiesFromEml(bufferString));
      setHeaderData(headers);
    };

    reader.readAsArrayBuffer(file);
  };

  return (
    <>
      <div className="p-4">
        <h1>Front End Only Mailparser</h1>
        <p className="my-6">
          Have you ever felt uneasy about uploading your emails to a server you
          don't fully trust? I sure did. It's like handing over your private
          correspondence to a stranger. That's why I decided to take matters
          into my own hands.
        </p>
        <p className="mb-8">
          With this frontend-only mail parser, there's no need to worry about
          your privacy. Thanks to{" "}
          <a
            href="https://proton.me/"
            className="text-pink-500 underline dark:visited:text-gray-400 visited:text-gray-500 hover:font-bold after:content-['_↗']"
          >
            ProtonMail's
          </a>{" "}
          <a
            className="text-pink-500 underline dark:visited:text-gray-400 visited:text-gray-500 hover:font-bold after:content-['_↗']"
            href="https://github.com/ProtonMail/jsmimeparser"
          >
            jsmimeparser
          </a>
          , you can enjoy the same email parsing experience right in your
          browser. No more sending your sensitive data to external servers.
          Everything stays safe and secure, right on your own system.
        </p>

        <input
          type="file"
          onChange={handleFileChange}
          className="block w-full text-sm text-slate-500
      file:mr-4 file:py-2 file:px-4
      file:rounded-full file:border-0
      file:text-sm file:font-semibold
      file:bg-violet-50 file:text-violet-700
      hover:file:bg-violet-100
    "
        />

        {Object.keys(headerData).length !== 0 && (
          <table className="mt-8">
            <thead>
              <tr className="border dark:border-white border-black">
                <th>Header</th>
                <th>Value</th>
              </tr>
            </thead>
            <tbody>
              {Object.entries(headerData).map(([key, value]) => (
                <tr key={key} className="border dark:border-white border-black">
                  <td>{key}</td>
                  <td>{value}</td>
                </tr>
              ))}
            </tbody>
          </table>
        )}
      </div>

      {Object.keys(ioc).length > 0 && (
        <div className="mt-8">
          <h2>IPs:</h2>
          <ul>
            {ioc.ips && ioc.ips.map((ip, index) => <li key={index}>{ip}</li>)}
          </ul>
          <h2>Emails:</h2>
          <ul>
            {ioc.emails &&
              ioc.emails.map((email, index) => <li key={index}>{email}</li>)}
          </ul>
          <h2>URLs:</h2>
          <ul>
            {ioc.urls &&
              ioc.urls.map((url, index) => <li key={index}>{url}</li>)}
          </ul>
        </div>
      )}
    </>
  );
};

export default MailParse;

Yeah, I know, it looks kinda ugly as-is—but hey, slap it into VSCode and let the prettifier work its magic.

Most of the heavy lifting here is courtesy of the library I used. The rest is just some plain ol’ regex doing its thing—filtering for indicators in the email header and body to make life easier for further investigation.
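To give a feel for the regex part, here is a minimal sketch, written in Python rather than the component's TypeScript to keep it short. The patterns and the name `extract_iocs` are illustrative assumptions, not the ones behind `extractEntitiesFromEml`; they are deliberately simple and will miss edge cases such as IPv6 addresses.

```python
import re

# Deliberately simple patterns (assumptions, not the exact ones used on the site).
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def _valid_ipv4(candidate: str) -> bool:
    """Reject matches such as 999.1.1.1 that only look like an IP."""
    return all(0 <= int(part) <= 255 for part in candidate.split("."))


def extract_iocs(raw_eml: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted indicators found in raw email text."""
    ips = {ip for ip in IP_RE.findall(raw_eml) if _valid_ipv4(ip)}
    emails = set(EMAIL_RE.findall(raw_eml))
    urls = set(URL_RE.findall(raw_eml))
    return {"ips": sorted(ips), "emails": sorted(emails), "urls": sorted(urls)}
```

The returned dictionary uses the same three keys the component renders (`ips`, `emails`, `urls`), so the shape should feel familiar.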

Conclusion

Short and sweet—that’s the vibe here. Sometimes, less is more, right? Feel free to use this tool wherever you like—internally, on the internet, or even on a spaceship. You can also try it out anytime directly on my website.

Don’t trust me? Totally fair. Open the website, yank out your internet connection, and voilà—it still works offline. No sneaky data sent to my servers, pinky promise.

As for my Astro.js setup, I include the “mailparse.tsx” like this:

---
import BaseLayout from "../../layouts/BaseLayout.astro";
import Mailparse from "../../components/mailparse";
---

<BaseLayout>
  <Mailparse client:only="react" />
</BaseLayout>

See you on the next one. Love you, byeeeee ✌️😘

2025-01-15

KarlGPT – My Push to Freedom
KarlGPT represents my pursuit of true freedom, through AI. I’ve realized that my ultimate life goal is to do absolutely nothing. Unfortunately, my strong work ethic prevents me from simply slacking off or quietly quitting.
This led me to the conclusion that I need to maintain, or even surpass, my current level of productivity while still achieving my dream of doing nothing. Given the advancements in artificial intelligence, this seemed like a solvable problem.
I began by developing APIs to gather all the necessary data from my work accounts and tools. Then, I started working on a local AI model and server to ensure a secure environment for my data.
Now, I just need to fine-tune the entire system, and soon, I’ll be able to automate my work life entirely, allowing me to finally live my dream: doing absolutely nothing.
This is gonna be a highly censored post, as it involves certain details about my work that I cannot legally disclose.
Technologies
Django and Django REST Framework (DRF)
Django served as the backbone for the server-side logic, offering a robust, scalable, and secure foundation for building web applications. The Django REST Framework (DRF) made it simple to expose APIs with fine-grained control over permissions, serialization, and views. DRF’s ability to handle both function-based and class-based views allowed for a clean, modular design, ensuring the APIs could scale as the project evolved.
- Django Documentation: https://docs.djangoproject.com
- Django REST Framework Docs: https://www.django-rest-framework.org
Celery Task Queue
To handle asynchronous tasks such as sending emails, performing background computations, and integrating external services (AI APIs), I implemented Celery. Celery provided a reliable and efficient way to manage long-running tasks without blocking the main application. This was critical for tasks like scheduling periodic jobs and processing user-intensive data without interrupting the API’s responsiveness.
- Celery Official Documentation: https://docs.celeryproject.org
React with TypeScript and TailwindCSS
For the frontend, I utilized React with TypeScript for type safety and scalability. TypeScript ensured the codebase remained maintainable as the project grew. Meanwhile, TailwindCSS enabled rapid UI development with its utility-first approach, significantly reducing the need for writing custom CSS. Tailwind’s integration with React made it seamless to create responsive and accessible components.
This is my usual front end stack, usually also paired with Astrojs. I use regular React, no extra framework.
- React Official Docs: https://reactjs.org
- TypeScript Docs: https://www.typescriptlang.org
- TailwindCSS Documentation: https://tailwindcss.com
Vanilla Python
Due to restrictions that prohibited the use of external libraries in local API wrappers, I had to rely on pure Python to implement APIs and related tools. This presented unique challenges, such as managing HTTP requests, data serialization, and error handling manually. Below is an example of a minimal API written without external dependencies:
```
import re
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


items = {"test": "mewo"}


class ControlKarlGPT(BaseHTTPRequestHandler):
    """Serve the in-memory `items` dict as JSON on a single endpoint."""

    def do_GET(self):
        # fullmatch avoids matching unrelated paths that merely contain the route;
        # an optional trailing slash and query string are still accepted.
        if re.fullmatch(r"/api/helloworld/?(\?.*)?", self.path):
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            response = json.dumps(items).encode()
            self.wfile.write(response)
        else:
            self.send_response(404)
            self.end_headers()


def run(server_class=HTTPServer, handler_class=ControlKarlGPT, port=8000):
    # An empty host string binds to all interfaces.
    server_address = ("", port)
    httpd = server_class(server_address, handler_class)
    print(f"Starting server on http://127.0.0.1:{port}")
    httpd.serve_forever()


if __name__ == "__main__":
    run()
```
- Python Official Documentation: https://docs.python.org
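The client side of such a wrapper has the same constraint: no `requests`, so `urllib` and `json` have to do the work. The following is a rough sketch under that assumption. The helper name `get_json` and its `(status, payload)` return shape are hypothetical and not taken from the actual project.

```python
import json
import urllib.error
import urllib.request


def get_json(url: str, timeout: float = 5.0) -> tuple[int, object]:
    """Fetch a URL and decode its JSON body using only the standard library.

    Returns (status_code, payload). On network failure the status is 0 and
    the payload carries an error message, so callers need no try/except.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return exc.code, {"error": str(exc.reason)}
    except (OSError, ValueError) as exc:
        # Refused connection, DNS failure, timeout, or a body that is not JSON.
        return 0, {"error": str(exc)}
```

Pointed at the server above, `get_json("http://127.0.0.1:8000/api/helloworld")` should hand back the `items` dictionary together with status 200.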
By weaving these technologies together, I was able to build a robust, scalable system that adhered to the project’s constraints while still delivering a polished user experience. Each tool played a crucial role in overcoming specific challenges, from frontend performance to backend scalability and compliance with restrictions.
File based Cache
To minimize system load, I developed a lightweight caching framework based on a simple JSON file-based cache. Essentially, this required creating a “mini-framework” akin to Flask but with built-in caching capabilities tailored to the project’s needs. While a pull-based architecture—where workers continuously poll the server for new tasks—was an option, it wasn’t suitable here. The local APIs were designed as standalone programs, independent of a central server.
This approach was crucial because some of the tools we integrate lack native APIs or straightforward automation options. By building these custom APIs, I not only solved the immediate challenges of this project (e.g., powering KarlGPT) but also created reusable components for other tasks. These standalone APIs provide a solid foundation for automation and flexibility beyond the scope of this specific system.
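The caching idea can be sketched in a few lines. This is not the framework described here, only a minimal illustration of a JSON file used as a cache with a time-to-live; the class name `JsonFileCache`, the `ttl_seconds` parameter, and the file layout are all assumptions.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable


class JsonFileCache:
    """Tiny TTL cache persisted as a single JSON file (illustrative only)."""

    def __init__(self, path: str, ttl_seconds: float = 300.0) -> None:
        self.path = Path(path)
        self.ttl = ttl_seconds

    def _load(self) -> dict[str, Any]:
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return {}  # Missing or corrupt file: start with an empty cache.

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, recomputing it once it expires."""
        entries = self._load()
        entry = entries.get(key)
        if entry and time.time() - entry["stored_at"] < self.ttl:
            return entry["value"]
        value = compute()  # Must return something JSON-serialisable.
        entries[key] = {"value": value, "stored_at": time.time()}
        # Write to a temp file first so a crash cannot leave half a cache.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(entries), encoding="utf-8")
        tmp.replace(self.path)
        return value
```

A handler would wrap its expensive call, for example `cache.get_or_compute("tickets", fetch_tickets)`, where `fetch_tickets` stands in for whatever slow tool is being queried.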
How it works
The first step was to identify which tasks I perform day to day and the tools I use for each of them. To automate anything effectively, I needed to abstract these tasks into programmable actions. For example:
- Read Emails
- Respond to Invitations
- Check Tickets
Next, I broke these actions down further to understand the decision-making process behind each. For instance, when do I respond to certain emails, and how do I determine which ones fall under my responsibilities? This analysis led to a detailed matrix that mapped out every task, decision point, and tool I use.
The result? A comprehensive, structured overview of my workflow. Not only did this help me build the automation framework, but it also provided a handy reference for explaining my role. If my boss ever asks, “What exactly do you do here?” I can present the matrix and confidently say, “This is everything.”
As you can see, automating work can be a lot of work upfront—an investment in reducing effort in the future. Ironically, not working requires quite a bit of work to set up! 😂
The payoff is a system where tasks are handled automatically, and I have a dashboard to monitor, test, and intervene as needed. It provides a clear overview of all ongoing processes and ensures everything runs smoothly:
AI Magic: Behind the Scenes
The AI processing happens locally using Llama 3, which plays a critical role in removing all personally identifiable information (PII) from emails and text. This is achieved using a carefully crafted system prompt fine-tuned for my specific job and company needs. Ensuring sensitive information stays private is paramount, and by keeping AI processing local, we maintain control over data security.
In most cases, the local AI is fully capable of handling the workload. However, for edge cases where additional computational power or advanced language understanding is required, Claude or ChatGPT serve as backup systems. When using cloud-based AI, it is absolutely mandatory to ensure that no sensitive company information is disclosed. For this reason, the system does not operate in full-auto mode. Every prompt is reviewed and can be edited before being sent to the cloud, adding an essential layer of human oversight.
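The review step can be pictured as a gate that every cloud-bound prompt has to pass. The sketch below is only a stand-in: the real PII removal is done by the local Llama 3 model, while here a single regex masks email addresses so the example stays self-contained. `prepare_cloud_prompt` and the `review` callback are hypothetical names.

```python
import re
from typing import Callable, Optional

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def prepare_cloud_prompt(
    draft: str,
    review: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Mask obvious email addresses, then let a human approve or edit the text.

    `review` returns the (possibly edited) prompt to send, or None to veto it.
    """
    masked = EMAIL_RE.sub("[EMAIL]", draft)
    approved = review(masked)
    if approved is None:
        return None  # Reviewer rejected it: nothing leaves the machine.
    # Re-mask in case the reviewer pasted an address back in while editing.
    return EMAIL_RE.sub("[EMAIL]", approved)
```

In a dashboard, `review` would be the edit-and-confirm dialog; returning `None` corresponds to closing it without sending.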
To manage memory and task tracking, I use mem0 in conjunction with a PostgreSQL database, which acts as the system’s primary “brain” 🧠. This database, structured using Django REST Framework, handles everything from polling for new tasks to storing results. This robust architecture ensures that all tasks are processed efficiently while maintaining data integrity and security.
Conclusion
Unfortunately, I had to skip over many of the intricate details and creative solutions that went into making this system work. One of the biggest challenges was building APIs around legacy tools that lack native automation capabilities. Bringing these tools into the AI age required innovative thinking and a lot of trial and error.
The preparation phase was equally demanding. Breaking down my daily work into a finely detailed matrix took time and effort. If you have a demanding role, such as being a CEO, it’s crucial to take a step back and ask yourself: What exactly do I do? A vague answer like “represent the company” won’t cut it. To truly understand and automate your role, you need to break it down into detailed, actionable components.
Crafting advanced prompts tailored to specific tasks and scenarios was another key part of the process. To structure these workflows, I relied heavily on frameworks like CO-STAR and AUTOMAT (stay tuned for an upcoming blog post about these).
I even created AI personas for the people I interact with regularly and designed test loops to ensure the responses generated by the AI were accurate and contextually appropriate. While I drew inspiration from CrewAI, I ultimately chose LangChain for most of the complex workflows because its extensive documentation made development easier. For simpler tasks, I used lightweight local AI calls via Ollama.
This project has been an incredible journey of challenges, learning, and innovation. It is currently in an early alpha stage, requiring significant manual intervention. Full automation will only proceed once I receive explicit legal approval from my employer to ensure compliance with all applicable laws, company policies, and data protection regulations.
Legal Disclaimer: The implementation of any automation or AI-based system in a workplace must comply with applicable laws, organizational policies, and industry standards. Before deploying such systems, consult with legal counsel, relevant regulatory bodies, and your employer to confirm that all requirements are met. Unauthorized use of automation or AI may result in legal consequences or breach of employment contracts. Always prioritize transparency, data security, and ethical considerations when working with sensitive information.
2025-01-15
Locking Down the Web: How I Secure My WordPress and Other Self-Hosted Public Sites
Securing a WordPress hosting setup requires more than just the basics—it’s about creating a layered defense to protect your server and adapt to emerging threats. Today I am going to show you what I do to keep Karlcom hosted systems secure from outside attackers.
Firewall Restriction
To minimize exposure, my server only accepts traffic from Cloudflare’s IP ranges and only on port 443. This ensures that attackers cannot directly access my server’s IP address, significantly reducing the attack surface.
On my Firewall it looks like this:
- One rule to allow Cloudflare
- One to allow my server to come back in from the internet
- One block all rule for anything else
This works pretty well so far.
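The allow rule boils down to a membership test against Cloudflare's networks plus a port check. As a hedged sketch of that logic (the two ranges below are only examples from Cloudflare's published list and may be out of date, and the rule for the server's own return traffic is left out):

```python
import ipaddress

# Two ranges for illustration only; load the full, current list from
# Cloudflare's published IP ranges in any real setup.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("104.16.0.0/13"),
]


def is_allowed(source_ip: str, dest_port: int) -> bool:
    """Mirror the rule set: Cloudflare sources on port 443, block the rest."""
    if dest_port != 443:
        return False
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in CLOUDFLARE_RANGES)
```

A script like this is handy for sanity-checking a list of source IPs from the logs against the allowlist before touching the actual firewall.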
Cloudflare’s Web Application Firewall (WAF)
I leverage Cloudflare’s free WAF to filter out malicious traffic before it reaches my server. It’s an effective first line of defense that helps block known attack patterns and suspicious behavior.
Here you can find some more information about it.
I felt kind of weird sharing my WAF rules here, since people reading this can use them to build scans that get around them, but I figured I am up for the challenge, so let's go:
```
(starts_with(http.request.full_uri, "https://karl.fail//xmlrpc.php")) or (starts_with(http.request.full_uri, "https://karl.fail/xmlrpc.php")) or (ends_with(http.request.uri, "xmlrpc.php")) or (http.request.full_uri contains "setup-config.php") or (http.request.full_uri contains "wp-admin/install.php") or (http.request.uri.path wildcard r"//*")
```
This is pretty WordPress-specific. I know you can set these on your reverse proxy or on the WordPress server itself, but I figured letting Cloudflare handle it with their admittedly much more powerful servers, and taking some load off mine, would be a good thing to do.
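To make it easier to see what the expression catches, here is an approximation of it in Python. It is a reading aid, not how Cloudflare evaluates rules: `is_blocked` is a made-up helper, `http.request.uri` is modelled as path plus query string, and the `wildcard r"//*"` clause is modelled as "path starts with two slashes".

```python
from urllib.parse import urlsplit

XMLRPC_PREFIXES = (
    "https://karl.fail//xmlrpc.php",
    "https://karl.fail/xmlrpc.php",
)


def is_blocked(full_uri: str) -> bool:
    """Approximate the Cloudflare expression above for a single request URL."""
    parts = urlsplit(full_uri)
    path = parts.path
    uri = path + ("?" + parts.query if parts.query else "")
    return (
        full_uri.startswith(XMLRPC_PREFIXES)   # starts_with(full_uri, ...)
        or uri.endswith("xmlrpc.php")          # ends_with(uri, "xmlrpc.php")
        or "setup-config.php" in full_uri      # full_uri contains ...
        or "wp-admin/install.php" in full_uri
        or path.startswith("//")               # uri.path wildcard r"//*"
    )
```

Running a handful of real request URLs from the logs through a function like this is a cheap way to check a rule change before deploying it.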
EDIT:
While writing this post attacks changed a little and I got some really annoying scans from some IP ranges that all came from Russia, so I ended up Rick Rolling all Russian IPs trying to get through to my home network. Nothing personal.
Continuous Monitoring with Grafana Labs Loki
Despite these measures, some scanners and attackers still manage to slip through. To address this, I use Grafana Labs Loki to analyze server logs. By identifying suspicious activity or unusual access paths, I can create new Cloudflare WAF rules to block emerging threats proactively.
Here you can see some scans from the outside that made it through. I have since updated the WAF rules to block them as well.
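The kind of question I ask the logs is simple: which paths keep producing client errors? In Loki that is a LogQL query; the sketch below shows the same idea offline in Python. The log layout (the common "combined" access-log format) and the helper name `suspicious_paths` are assumptions, so adjust the pattern to whatever your proxy actually writes.

```python
import re
from collections import Counter

# Assumes the common "combined" access-log layout; adjust for your own logs.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def suspicious_paths(lines: list[str], min_hits: int = 2) -> list[tuple[str, int]]:
    """Return paths that produced at least `min_hits` 4xx responses."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match["status"].startswith("4"):
            hits[match["path"]] += 1
    return [(path, count) for path, count in hits.most_common() if count >= min_hits]
```

Paths that show up here repeatedly are good candidates for a new WAF rule.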
Updates
As I mentioned in my post about backing up data, I automate the updates for all my LXCs, VMs, and container images. While this approach does carry the risk of introducing breaking changes, the time and effort saved by automating these updates outweigh the potential downsides for me at this stage. Manual maintenance just isn’t practical for my setup right now.
Since I do daily backups I can recover real fast.
The Cycle of Security
This process of monitoring, analyzing, and refining creates an ongoing cycle of security improvements. It’s a proactive and dynamic approach that keeps my server well-protected against evolving threats.
If you’re using a similar setup or have additional tips for securing WordPress hosting, I’d love to hear your thoughts. Sharing strategies and experiences is one of the best ways to stay ahead of attackers.
That said, I’m genuinely curious if any attackers reading this will now take it as a challenge to get around my defenses. For that very reason, I stay vigilant, regularly auditing my Grafana logs at home. Security is a constant effort, and in my case, we have SIEM at home, son!
2025-01-13
My Home Server: “PrettyLittleKitten” – A Personal Tech Haven
Hardware
RAM
My server is equipped with a total of 128GB of DDR5 RAM, made up of two Kingston FURY Beast kits (each consisting of 2 x 32GB, 6000 MHz, DDR5-RAM, DIMM).
The RAM operates at around 3600 MHz and consistently maintains 32GB in active usage:
Cooling
I kept the fan coolers as they were and opted for an all-in-one liquid cooling solution: the Arctic Liquid Freezer III – 280. No particular reason, really—I just thought it was a cool choice (pun intended).
PC Case
This setup was originally intended to be a gaming-only PC, so I chose a sleek and clean-looking case: the Fractal Design North XL. While it’s an aesthetically pleasing choice, the one downside for use as a server is its limited storage capacity.
CPU
I chose the AMD Ryzen 7 7800X3D (AM5, 4.20 GHz, 8-Core), which is fantastic for gaming. However, as a server for my needs, I regret that it doesn’t have a better built-in GPU. Intel’s iGPUs are far superior for media transcoding, and using an integrated GPU instead of a dedicated one would save a significant amount of energy.
GPU
I do have a dedicated GPU, the ASUS TUF Gaming AMD Radeon RX 7900 XTX OC Edition 24GB, which I chose primarily for its massive VRAM. This allows me to run larger models locally without any issues. However, when it comes to media transcoding, AMD GPUs fall short compared to other options, as highlighted in the Jellyfin – Selecting Appropriate Hardware guide.
Mainboard
I chose the MSI MAG B650 TOMAHAWK WIFI (AM5, AMD B650, ATX) as it seemed like a great match for the CPU. However, my GPU is quite large and ends up covering the only other PCI-E x16 slot. This limits my ability to install a decent hardware RAID card or other large expansion cards.
Storage
For the main OS, I selected the WD_BLACK SN770 NVMe SSD 2 TB, providing fast and reliable performance. To handle backups and media storage, I added a Seagate IronWolf 12 TB (3.5”, CMR) drive.
For fast and redundant storage, I set up a ZFS mirror using two Intenso Internal 2.5” SSD SATA III Top, 1 TB drives. This setup ensures that critical data remains safe and accessible.
Additionally, I included an external Samsung Portable SSD T7, 1 TB, USB 3.2 Gen.2 for extra media storage, rounding out the setup.
Software
For my main OS, I stick to what I know best—Proxmox. It’s absolutely perfect for home or small business servers, offering flexibility and reliability in a single package.
I run a variety of services on it, and the list tends to evolve weekly. Here’s what I’m currently hosting:
- Nginx Proxy Manager: For managing reverse proxies.
- n8n: Automation tool for workflows.
- Bearbot: A production-grade Django app.
- Vaultwarden: A lightweight password manager alternative.
- MySpeed: Network speed monitoring.
- Another Nginx Proxy Manager: Dedicated to managing public-facing apps.
- Code-Server: A browser-based IDE for developing smaller scripts.
- Authentik: Single-Sign-On (SSO) solution for all local apps.
- WordPress: This blog is hosted here.
- Logs: A comprehensive logging stack including Grafana, Loki, Rsyslog, Promtail, and InfluxDB.
- Home Assistant OS: Smart home management made easy.
- Windows 11 Gaming VM: For gaming and other desktop needs.
- Karlflix: A Jellyfin media server paired with additional tools to keep my media library organized.
And this list is far from complete—there’s always something new to add or improve!
Performance
The core allocation may be displayed incorrectly, but otherwise, this is how my setup looks:
Here’s the 8-core CPU usage over the last 7 days. As you can see, there’s plenty of headroom, ensuring the system runs smoothly even with all the services I have running:
Energy costs for my server typically range between 20-25€ per month, but during the summer months, I can run it at 100% capacity during the day using the solar energy generated by my panels. My battery also helps offset some of the power usage during this time.
Here’s a solid representation of the server’s power consumption:
I track everything in my home using Home Assistant, which allows me to precisely calculate the energy consumption of each device, including my server. This level of monitoring ensures I have a clear understanding of where my energy is going and helps me optimize usage effectively.
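If you want to estimate a monthly figure for your own machine, the arithmetic is short: average power draw times hours times price per kWh. The numbers in the usage note are placeholders I picked for illustration, not measurements from my server or my actual tariff.

```python
def monthly_energy_cost(avg_watts: float, price_per_kwh: float, days: int = 30) -> float:
    """Cost of running a constant load for `days` days, in the currency of the price."""
    kwh = avg_watts / 1000 * 24 * days
    return round(kwh * price_per_kwh, 2)
```

With an assumed average draw of 100 W and an assumed price of 0.30 € per kWh, `monthly_energy_cost(100, 0.30)` works out to 72 kWh and about 21.60 € for a 30-day month.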
Conclusion
Hosting a server locally is a significant investment, both in terms of hardware and running costs. My setup cost €2405, and I spend about €40 per month in total: energy plus domain and backup services. While my solar panels make running the server almost free during summer, winter energy costs can be a challenge.
That said, hosting locally has its advantages. It provides complete control over my data, excellent performance, and the flexibility to upgrade or downgrade hardware as needed. These benefits outweigh the trade-offs for me, even though the energy consumption is higher compared to a Raspberry Pi or Mini-PC.
I could have gone a different route. A cloud server, or even an alternative like the Apple Mac Mini M4, might have been more efficient in terms of cost and power usage. However, I value upgradability and privacy too much to make those sacrifices.
This setup wasn’t meticulously planned as a server from the start—it evolved from a gaming PC that was sitting unused. Instead of building a dedicated server from scratch or relying on a Mini-PC and NAS combination, I decided to repurpose what I already had.
Sure, there are drawbacks. The fans are loud, energy costs add up, and it’s far from the most efficient setup. But for me, the flexibility, control, and performance make it worthwhile. While hosting locally might not be the perfect solution for everyone, it’s the right choice for my needs—and I think that’s what really matters.
2025-01-13
Inception-Level Data Safety: Backing Up Your Proxmox Backups with Borg on Hetzner
Today, I want to walk you through how I handle backups for my home server. My primary method is using Proxmox’s built-in backup functionality, which I then sync to a Hetzner Storage Box for added security.
When it comes to updates, I like to live on the edge. I enable automatic (security) updates on nearly all of my systems at home using UnattendedUpgrades. For containers, I usually deploy a Watchtower instance to keep them updated automatically. While this approach might make some people nervous—fearing a broken system after an update—I don’t sweat it. I back up daily and don’t run any mission-critical systems at home (except for this blog, of course 😉).
For specific files or directories, like Vaultwarden, I take an extra layer of precaution by creating additional backups within the LXC container itself. These backups are synced to a Nextcloud instance I also manage through Hetzner, but in a different datacenter. Hetzner’s “Storage Shares” offer a great deal—€5 gets you 1TB of managed Nextcloud storage. While not the fastest, they’re reliable enough for my needs.
I won’t dive into the details here, but my approach for these backups is pretty straightforward: I use compressed tar archives and rclone to upload everything to Nextcloud.
Here is my script, maybe it helps you in some way:
```
#!/bin/bash
# Abort on errors, unset variables, and failed pipeline stages so the
# success message below is only printed when every step worked.
set -euo pipefail

# Variables
BITWARDEN_DIR="/root/bitwarden"
BACKUP_DIR="/root/bitwarden-backup"
NEXTCLOUD_REMOTE="nextcloud:Vaultwarden"
TIMESTAMP=$(date '+%Y%m%d-%H%M')

# Ensure backup directory exists
mkdir -p "$BACKUP_DIR"

# Create a single tarball of the entire Vaultwarden directory
echo "Creating a full backup of the Vaultwarden directory..."
tar -czvf "$BACKUP_DIR/vaultwarden_full_backup-${TIMESTAMP}.tar.gz" -C "$BITWARDEN_DIR" .

# Sync the backup to Nextcloud
echo "Uploading backup to Nextcloud..."
rclone copy "$BACKUP_DIR" "$NEXTCLOUD_REMOTE"

# Clean up local backup directory
echo "Cleaning up local backups..."
rm -rf "$BACKUP_DIR"

echo "Backup completed successfully!"
```
Basically, all you need to do is create an App Password and follow the Rclone guide for setting up with WebDAV. It’s straightforward and works seamlessly for this kind of setup.
Backups in Proxmox
Proxmox makes backups incredibly simple with its intuitive functionality. I back up pretty much everything except my Gaming VM. It’s a Windows 11 experiment where I’ve passed through my AMD RX 7900 XTX for gaming. Ironically, instead of gaming, I end up spending more time tweaking backups and writing about them. Let’s just say that gaming setup hasn’t exactly gone as planned.
I rely on Snapshot mode for my backups, and you can explore all its features and settings right here. As I mentioned earlier, I tend to restore backups more frequently than most people, and I’ve never faced any issues with inconsistencies. It’s been consistently reliable for me!
For retention, I keep it straightforward by saving only the last two backups. Since I also back up my backups (as you’ll see later), this minimalist approach is more than sufficient for my needs and saves me some space.
I left the rest of the settings as they are. The note templates are useful if you’re managing a large or multiple instances, but for my setup, I don’t use them.
Trigger warning: For now, I’m storing these backups on a single internal Seagate IronWolf (12 TB). I know, not ideal. These drives are pretty pricey, but one day I plan to add another and set up a ZFS mirror or RAID for better redundancy. For now, I’m relying on this one drive—fingers crossed, it’s been rock solid so far!
Borg(Backup)
The first thing I heard when I proudly told my friends that I was finally taking the golden 3-2-1 backup rule seriously was: “Why not restic?”
The simple answer? I Googled “backups to Hetzner Storage Box,” and the first result was an article explaining exactly what I wanted to do—using Borg 🤷‍♂️. Before I even considered trying restic, I had already set up encrypted incremental backups with Borg. Feel free to share what you use and why you might have switched, but for now, this setup works perfectly for me!
Hetzner Storage Box
Just to clarify, I’m not talking about Hetzner Storage Share 😁. I’m using their 5TB Storage Box and opted for Finland 🇫🇮 as the location since I already have other Karlcom-related stuff in their German datacenter. It helps keep things spread out a bit!
Essentially, it’s a big, affordable storage backend with multiple options for uploading data. You could mount it using the “Samba/CIFS” option, but I decided against that. Instead, I went with a more secure SSH connection to send my backups there.
Setup
First, you’ll need to upload your SSH key to the Hetzner Storage Box. You can follow this step by step guide.
Once that’s done, the next step is to install and configure BorgBackup, which is also covered in the guide I linked to.
I know, it seems like you came here just to find links to set this up somewhere else. But don’t worry—I’ve got some cool stuff to share with you next. Here’s my backup script:
/usr/local/bin/proxmox_borg_backup.sh
```
#!/bin/bash

# Variables
BORG_REPO="ssh://u123456@u123456.your-storagebox.de:23/home/backups/central"

BORG_PASSPHRASE=''
BACKUP_SOURCE="/mnt/pve/wd_hdd_internal/dump"                               
LOG_FILE="/var/log/proxmox_borg_backup.log"                                 
MAX_LOG_SIZE=10485760
RID=$(uuidgen)
CHECK_ID="ggshfo8-9ca6-1234-1234-326571681"

# start
curl -fsS -m 10 --retry 5 "https://ping.yourdomain.de/ping/$CHECK_ID/start?rid=$RID"

# Export Borg passphrase
export BORG_PASSPHRASE

# Rotate log file if it exceeds MAX_LOG_SIZE
if [ -f "$LOG_FILE" ] && [ $(stat -c%s "$LOG_FILE") -gt $MAX_LOG_SIZE ]; then
    mv "$LOG_FILE" "${LOG_FILE}_$(date +"%Y-%m-%d_%H-%M-%S")"
    touch "$LOG_FILE"
fi

# Check for BorgBackup installation
if ! command -v borg &> /dev/null; then
    echo "ERROR: BorgBackup is not installed or not in PATH." >> "$LOG_FILE"
    exit 1
fi

# Check for SSH connection
if ! ssh -q -o BatchMode=yes -o ConnectTimeout=5 -p 23 -i ~/.ssh/backup u123456@u123456.your-storagebox.de exit; then
    echo "ERROR: Unable to connect to Borg repository." >> "$LOG_FILE"
    exit 1
fi

# Logging start time
{
  echo "==== $(date +"%Y-%m-%d %H:%M:%S") Starting Proxmox Backup ===="

  # Check if the backup source exists
  if [ ! -d "$BACKUP_SOURCE" ]; then
      echo "ERROR: Backup source directory $BACKUP_SOURCE does not exist!"
      exit 1
  fi

  # Create a new Borg backup
  echo "Creating Borg backup..."
  borg create --stats --compression zstd \
      "$BORG_REPO::backup-{now:%Y-%m-%d}" \
      "$BACKUP_SOURCE" >> "$LOG_FILE" 2>&1


  if [ $? -ne 0 ]; then
      echo "ERROR: Borg backup failed!"
      exit 1
  fi

  # Prune old backups to save space
  echo "Pruning old backups..."
  borg prune --stats \
      --keep-daily=7 \
      --keep-weekly=4 \
      --keep-monthly=6 \
      "$BORG_REPO"

  if [ $? -ne 0 ]; then
      echo "ERROR: Borg prune failed!"
      exit 1
  fi

  echo "==== $(date +"%Y-%m-%d %H:%M:%S") Proxmox Backup Completed ===="
} >> "$LOG_FILE" 2>&1

# finished
curl -fsS -m 10 --retry 5 "https://ping.yourdomain.de/ping/$CHECK_ID?rid=$RID"
```
The curl requests at the top and bottom of the script are for my Healthchecks.io instance—I even wrote a blog post about it here.
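For reference, the two URLs follow a simple pattern: `/ping/<check-id>/start` when the job begins and `/ping/<check-id>` when it succeeds, both carrying the same `rid` so the two events are matched up. Healthchecks also accepts a `/fail` suffix, which the script above does not send. The helper below is a small sketch of that pattern; `ping_url` is a made-up name and the base URL is the placeholder from the script.

```python
from typing import Optional
from urllib.parse import urlencode


def ping_url(base: str, check_id: str, run_id: str, signal: Optional[str] = None) -> str:
    """Build a Healthchecks ping URL; signal is None (success), "start" or "fail"."""
    if signal not in (None, "start", "fail"):
        raise ValueError(f"unsupported signal: {signal!r}")
    suffix = f"/{signal}" if signal else ""
    query = urlencode({"rid": run_id})
    return f"{base.rstrip('/')}/ping/{check_id}{suffix}?{query}"
```

Calling it with `signal="fail"` right before each `exit 1` would be one way to report failures immediately instead of waiting for the check to time out.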
Before moving on, you should definitely test this script. Depending on the size of your setup, the initial backup could take several hours. However, if it doesn’t fail within the first 10 seconds, that’s usually a good sign. To be sure it’s running smoothly, check the log file to confirm it started correctly:
/var/log/proxmox_borg_backup.log
```
==== 2025-01-10 01:39:07 Starting Proxmox Backup ====
Creating Borg backup...
------------------------------------------------------------------------------
Repository: ssh://u123456@u123456.your-storagebox.de:23/home/backups/central
Archive name: backup-2025-01-10
Archive fingerprint: z724gf2789hgf972hf9uh...
Time (start): Fri, 2025-01-10 01:39:08
Time (end):   Fri, 2025-01-10 05:36:41
Duration: 3 hours 57 minutes 32.92 seconds
Number of files: 72
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               62.03 GB             61.98 GB             61.60 GB
All archives:               62.03 GB             61.98 GB             61.60 GB

                       Unique chunks         Total chunks
Chunk index:                   24030                40955
------------------------------------------------------------------------------
Pruning old backups...
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
Deleted data:                    0 B                  0 B                  0 B
All archives:               62.03 GB             61.98 GB             61.60 GB

                       Unique chunks         Total chunks
Chunk index:                   24030                40955
------------------------------------------------------------------------------
==== 2025-01-10 05:36:42 Proxmox Backup Completed ====
```
Security of BORG_PASSPHRASE
I decided to include the passphrase for encryption and decryption directly in the script because it fits within my threat model. My primary concern isn’t someone gaining access to my local Proxmox server and restoring or deleting my backups—my focus is on protecting against snooping by cloud providers or malicious admins.
Having the passphrase in the script works for me. Sure, there are other ways to handle this, but for the script to run automatically, you’ll always need to store the passphrase somewhere on your system. At the very least, it has to be accessible by root. This setup strikes the right balance for my needs.
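One of those other ways is to keep the passphrase in a separate root-only file and hand it to Borg through the `BORG_PASSPHRASE` environment variable at run time. The sketch below illustrates the idea in Python; the function names are hypothetical, and it does not change the threat model, since root can still read the file.

```python
import os
import stat


def read_passphrase(path: str) -> str:
    """Read a passphrase file, refusing files that group or others can access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} has mode {oct(mode)}; expected 0o600 or stricter")
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


def borg_environment(passphrase_file: str) -> dict[str, str]:
    """Return a copy of the environment with BORG_PASSPHRASE set for a child process."""
    env = dict(os.environ)
    env["BORG_PASSPHRASE"] = read_passphrase(passphrase_file)
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching `borg`, which keeps the secret out of the script itself and out of version control.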
Systemd timers
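I created a systemd service to handle this backup process. For long-running jobs, I prefer systemd timers over cron: the output lands in the journal, the service is not started a second time while the previous run is still active, and Persistent=true catches up on a run that was missed while the machine was off. I found this post particularly helpful when setting it up.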
Here’s the service that actually runs my bash script:
/etc/systemd/system/proxmox_borg_backup.service
```
[Unit]
Description=Proxmox BorgBackup Service
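# The backup target is remote, so wait until the network is actually up
Wants=network-online.target
After=network-online.target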

[Service]
Type=oneshot
ExecStart=/usr/local/bin/proxmox_borg_backup.sh
```
And here’s the systemd timer that handles scheduling the service:
/etc/systemd/system/proxmox_borg_backup.timer
```
[Unit]
Description=Run Proxmox BorgBackup Daily at 3 AM

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
```
Now, instead of enabling the service directly, you enable and start the timer, which then starts the service on the schedule you’ve defined. Thanks to `Persistent=true`, a run that was missed because the server was powered off is started at the next boot.
Bash
```
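# Enable the timer at boot and start it right away
systemctl enable --now proxmox_borg_backup.timer
# Confirm it is active and see when the next run is scheduled
systemctl status proxmox_borg_backup.timer
systemctl list-timers proxmox_borg_backup.timer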
```
That’s it! You’re all set. You can check the log file we created or use the journalctl command to review any errors or confirm successful runs. Happy backing up! 🎉
Bash
```
# The script's output is logged under the service unit, not the timer
journalctl -xeu proxmox_borg_backup.service

# or 

tail -n 50 /var/log/proxmox_borg_backup.log
```
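If you want a quick yes/no answer instead of reading the log by eye, here’s a small sketch that looks for the completion marker the script writes at the end of a run. It assumes the marker always contains the text `Proxmox Backup Completed ====`, as in the log output above. The function name `last_completed` is just a placeholder.

```bash
# last_completed [LOGFILE]: print the newest completion marker,
# or return non-zero if the log contains none (or cannot be read).
last_completed() {
    local logfile="${1:-/var/log/proxmox_borg_backup.log}"
    # The final "grep ." turns empty output into a failing exit status
    grep 'Proxmox Backup Completed ====' "$logfile" | tail -n 1 | grep .
}

# Example:
#   last_completed || echo "No completed backup found in the log" >&2
```

Keep in mind this only tells you the script reached its last line at some point. It doesn’t check how old that run is or whether Borg itself reported errors.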
Conclusion
You should now have an easy and efficient solution to back up your Proxmox backups to a Hetzner Storage Box using Borg Backup. Both Borg and Restic support a variety of storage targets, so you can adapt this approach to suit your needs. In my setup, Borg performs incremental backups, uploading only new data, which helps keep storage costs low while maintaining security.
A word of caution: don’t lose your secrets—your encryption key or passphrase—because without them, you won’t be able to restore your data. Trust me, I’ve been there before! Thankfully, I had local backups to fall back on.
On the Hetzner side, I schedule daily snapshots at noon, after all my backup jobs have completed. I retain only the last three days, which works perfectly for me, though your needs might differ. Just remember that snapshot storage counts toward your total storage capacity: if you have 1TB, the space used by snapshots reduces what’s available for new data.
Thank you for reading! May your backups always be safe, your disks last long, and your systems run smoothly. Wishing you all the best—love you, byeeeeee! ❤️🚀
2025-01-11
Before “PrettyLittleKitten”: A Brief Affair with the Mac Mini M4
Before I built my beloved server, affectionately named “PrettyLittleKitten”, I had a brief fling with the brand-new Mac Mini M4. Spoiler alert: it was a short-lived relationship.
Let me start with the good stuff: performance per watt. It’s absolutely unmatched. The Mac Mini M4 is a beast in terms of efficiency, an essential factor for me. I wanted hardware that could handle Jellyfin with smooth hardware acceleration while still hosting all my containers.
The Hardware
On paper (and in practice as a desktop), the Mac Mini M4 shines. It offers:
- 3 Thunderbolt ports on the back plus 2 USB-C ports on the front, making storage expansion a breeze. Pair it with an external NVMe enclosure, and you can achieve speeds close to those of internal storage.
- Hardware that punches way above its price point, making it a reasonable investment for many use cases.
The Disappointment
Here’s where the romance fell apart. While the Mac Mini M4 is brilliant as a desktop, using it as a server is a whole different ball game—and not a fun one.
The iCloud Conundrum
First up: the dreaded iCloud account requirement. This wasn’t a total shock (it’s Apple, after all), but it made me long for the simplicity of Debian and Proxmox, where everything is blissfully offline.
I went ahead and set it up with my personal iCloud account—big mistake. To run the Mac Mini as I wanted, it needed to stay logged in indefinitely. And here’s the kicker: to achieve that, I had to disable authentication entirely. Translation? If anyone got their hands on my Mini, they’d have full access to my iCloud account. Yikes.
Pro tip: Use a burner iCloud account if you’re planning to go down this route. (Is this what you want, Apple?!)
Dummy HDMI
Then there’s the issue of fooling the Mac into thinking it’s doing desktop work. Without a connected display, macOS doesn’t fully utilize the GPU or cores, which impacts performance. Enter the Dummy HDMI Plug—a little device to trick the system into thinking a monitor is attached. At ~€40, it’s not a dealbreaker, but definitely annoying.
Power Saving Woes
You’ll also need to disable power-saving features. While the Mac Mini M4 consumes very little power at idle, turning off power saving negates some of its efficiency benefits.
Recap of Mac Mini Server Challenges
If you’re still tempted to use the Mac Mini M4 as a server, here’s your checklist:
- Dummy HDMI Plug: €40 (because macOS needs to “see” a monitor).
- Burner iCloud Account: Necessary to avoid risking your real account.
- Disable Authentication: Say goodbye to security.
- Disable Power Saving: Because macOS doesn’t believe in idle servers.
Final Thoughts
If you’re determined, Evan Bartlett has written an excellent guide on setting up the Mac Mini as a server. However, as someone coming from the Linux world, where operating systems are designed for server use, it just didn’t feel right to me. Forcing macOS, an OS that clearly does not want to be a server, into that role felt morally and ethically wrong.
Here’s hoping Big Siri AI will be kind to me when it inevitably takes over. 🙇‍♂️🍏
Bonus: Check this website’s response headers to see that it runs on PrettyLittleKitten
2025-01-10