Forget ChatGPT, I Built My Own Local AI with AMD RX 7900 XTX, Ollama & DeepSeek-R1

The journey to bringing you this guide was paved with rage and hardship. Before we go any further, let me be clear: local AI is nowhere near as good as ChatGPT or similar online tools. Without solid prompt engineering, you’ll mostly get weird, useless responses.

That said, DeepSeek-R1 (32B) is hands down the best local model I’ve ever used—but even then, it’s nowhere near the level of ChatGPT-4o in the cloud. To match that, you’d need the DeepSeek-R1 671B model, which is a mind-blowing 404GB. Running that locally? Yeah, that would be absolute madness.

Disclaimer: This post has some strong opinions about Linux distributions and hardware that some people may find disturbing or hurtful. Please don’t take it too seriously.

Rant about AMD

Skip it, or read my raw unfiltered anger.

The image of this post perfectly reflects my mood.

A while ago, I decided to build an AI server at home to run models locally. My plan was to get an NVIDIA 4090, which at the time cost around 2000€. But then, my friend—who runs Arch as his daily driver (I should’ve seen the red flag)—was using an AMD RX 7900 XTX, which was only 900€ at the time. He hyped it up, saying, “Oh yeah, get this one! Same VRAM, super easy to set up, everything works flawlessly!”

I was intrigued.

As fate would have it, another friend echoed the same thing, insisting that for 24GB of VRAM, I wouldn’t find anything cheaper. And, well, that was actually true.

However, everything I read online told me that AMD GPUs lag far behind NVIDIA in every way, and worst of all, you’d always have to hack things together just to make them work. Still, on Black Friday, I caved and bought the AMD GPU.

I regret it every single day since putting it in. I hate it. It absolutely sucks.

So far, it has worked on Windows 11—but even there, it was a pain. And seriously, how do you even mess up Windows 11 support??

Then I switched to Ubuntu as my main OS (☹️). After two days of struggle (and reinstalling the entire OS three times), I somehow got it to work. I still don’t know what I did. Every guide on the internet gives different commands, different settings, and different advice. Most are for older AMD GPUs, almost none work for the newer models, and—just for fun—most of the essential tools don’t support the “new” AMD cards either.

I hate it. I hate it so much.

My mood

I will never buy an AMD GPU ever again. Even if they came with 100GB of VRAM and cost just 5€, I do not care.

Looking back, I would rather pay 2000€ for a GPU that just works than spend endless hours hacking together the most basic functionality. The sheer frustration of dealing with this mess infuriates me beyond words.

This post serves as both a rant and a personal reminder: Never. Ever. Ever. Buy. AMD. Hardware. Again.

To be honest, I’m just as disappointed in AMD CPUs. Their hardware transcoding is absolute trash.

From now on, it’s Intel and NVIDIA, forever and always.

Prerequisite

  • 32GB RAM (with ComfyUI, bump that up to 40GB)
  • 250GB SSD Storage
  • Debian 12 LXC

If you’re curious about my exact setup, you’ll find a detailed list in My Home Server: “PrettyLittleKitten” – A Personal Tech Haven, where you can check whether yours is similar. At the very least, your GPU should match (AMD RX 7900 XTX) to follow the tutorial step by step. If it doesn’t, chances are it’ll fail.

You need to install the kernel drivers on the host for passthrough to an LXC:

apt update
apt install pve-headers dkms
apt install amdgpu-dkms

You need to reboot Proxmox after that.

reboot
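After the reboot it's worth a quick sanity check that the driver actually loaded before touching the LXC. This is a minimal sketch, not part of my original setup: `module_loaded` is a helper name I made up, and the optional file argument only exists so you can try it on a copy of /proc/modules.

```shell
# module_loaded NAME [MODULES_FILE]
# Succeeds if NAME is listed as a loaded kernel module.
# MODULES_FILE defaults to /proc/modules (hypothetical helper, not a Proxmox tool).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Usage on the Proxmox host after the reboot:
#   module_loaded amdgpu && [ -e /dev/kfd ] && echo "amdgpu driver is up"
```

If this prints nothing, check `dmesg` for amdgpu errors before going further.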

Setting up LXC

It’s important to note that the LXC must be privileged. I know there are guides for setting up an unprivileged one, but literally none of them worked—which only fueled my rage to unbearable levels.

So yeah… I just went with privileged.

Easy mode: use the Proxmox VE Helper-Scripts:

bash -c "$(wget -qLO - https://github.com/community-scripts/ProxmoxVE/raw/main/ct/debian.sh)"

Or, just manually add a Debian 12 LXC in the Proxmox GUI. Make sure to enable nesting=1, as this is required for Docker to work later.

I did try to get this working on Ubuntu 24 for hours… and failed miserably. Don’t ask me why—I have no idea.

Now, let’s install some dependencies that we’ll need later—or at the very least, ones that will make our lives easier:

apt update && apt upgrade -y
apt install sudo curl jq -y
cd /tmp
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh ./get-docker.sh 

You are free to install Docker without using the convenience script.

GPU Passthrough

This heavily depends on your hardware and software.

If, like me, you have an AMD RX 7900 XTX and Proxmox 8.3.3, then you can just follow along. Otherwise—based on my own painful experience—you’ll likely need to find another guide.

Inside the LXC, run:

grep -w 'render\|video' /etc/group

This will display the GIDs you need for passthrough in a second—so make sure to note them down:

video:x:44:root # <- /dev/kfd
render:x:993:root # <- /dev/dri/render*

The 44 will be used for “video” and the 993 for “render”. Yours will, of course, be different.
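If you'd rather not eyeball the grep output, the GID is simply the third colon-separated field of the group entry. A small sketch (the function name `gid_of` is my own invention, and the second argument is only there so you can point it at a test file):

```shell
# gid_of GROUP [GROUP_FILE]
# Print the numeric GID (third field) of GROUP; GROUP_FILE defaults to /etc/group.
gid_of() {
    awk -F: -v g="$1" '$1 == g { print $3; exit }' "${2:-/etc/group}"
}

# Usage inside the LXC:
#   gid_of video
#   gid_of render
```

`getent group video | cut -d: -f3` should give you the same number.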

On the Proxmox host run the following commands:

lspci | grep VGA
ls -l /sys/class/drm/renderD*/device

The output should be something like this:

lspci | grep VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX] (rev c8)
11:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Raphael (rev cb)

ls -l /sys/class/drm/renderD*/device
lrwxrwxrwx 1 root root 0 Feb  2 13:33 /sys/class/drm/renderD128/device -> ../../../0000:03:00.0
lrwxrwxrwx 1 root root 0 Feb  2 13:33 /sys/class/drm/renderD129/device -> ../../../0000:11:00.0

In my case, “renderD128” is the part I need. To find yours, match the ID from the first command (e.g., 03:00.0) with the ID from the second command (e.g., 0000:03:00.0). Once they match, you’ll know which renderD* device corresponds to your GPU (the other one is the iGPU of the CPU, don’t use that).
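The matching described above can also be scripted. This is only a sketch under the assumption that your /sys/class/drm layout looks like mine (a `device` symlink per renderD* node whose target ends in the PCI address); `render_node_for` is a made-up helper name.

```shell
# render_node_for PCI_ADDR [DRM_DIR]
# Print the /dev/dri/renderD* node whose "device" symlink points at PCI_ADDR.
# Accepts the short form from lspci (03:00.0) or the full form (0000:03:00.0).
# DRM_DIR defaults to /sys/class/drm.
render_node_for() {
    pci=$1
    drm_dir=${2:-/sys/class/drm}
    for link in "$drm_dir"/renderD*/device; do
        # Skip the unexpanded glob when no render nodes exist
        [ -e "$link" ] || [ -L "$link" ] || continue
        target=$(readlink "$link")
        case "${target##*/}" in
            "$pci"|"0000:$pci")
                echo "/dev/dri/$(basename "$(dirname "$link")")"
                return 0
                ;;
        esac
    done
    return 1
}

# Usage on the Proxmox host, with the address from lspci:
#   render_node_for 03:00.0
```

With the output shown above, `03:00.0` maps to renderD128.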

In the Proxmox GUI, go to your LXC container’s “Resources” tab and click “Add” → “Device Passthrough”. Now, add the “video” and “render” devices using the GIDs you noted earlier:

  • Render device: Use the path for your graphics card and the GID from the LXC output.
  • Video device: Use /dev/kfd and the GID for “video” from the LXC output.

This is what your settings should look like (you may need to restart the LXC first).
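I did this through the GUI, but as far as I know the same thing can be done from the host shell with `pct set` and its `--dev0`/`--dev1` options. Treat the following as an untested sketch: `passthrough_cmds` is a hypothetical helper that only prints the commands so you can review them, and the container ID 200 in the usage line is made up.

```shell
# passthrough_cmds CTID RENDER_NODE RENDER_GID VIDEO_GID
# Print (do not run) the pct commands that should mirror the GUI settings.
# Assumption: your Proxmox version supports dev[n] entries with a gid= option.
passthrough_cmds() {
    printf 'pct set %s --dev0 %s,gid=%s\n' "$1" "$2" "$3"
    printf 'pct set %s --dev1 /dev/kfd,gid=%s\n' "$1" "$4"
}

# Example with the values from this guide and a hypothetical container ID:
#   passthrough_cmds 200 /dev/dri/renderD128 993 44
```

Check the printed lines against `man pct` on your host before running them.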

After restarting the container, check that the permissions are correct:

ls -lah /dev/kfd /dev/dri/renderD128 # <- edit the ID

The output should look like this:

crw-rw---- 1 root render 226, 128 Feb  2 18:01 /dev/dri/renderD128
crw-rw---- 1 root video  236,   0 Feb  2 18:01 /dev/kfd

Make sure that “root render” is the GPU and “root video” the Kernel Fusion Driver (kfd).

Kernel Fusion Driver

The Kernel Fusion Driver (KFD) is required if you want to run ROCm-based GPU compute workloads, such as machine learning, OpenCL, or scientific computing, on your AMD GPU within Proxmox. It acts as the interface between the AMD GPU driver and user-space applications, enabling GPU acceleration for parallel computing tasks.

– ChatGPT-4o

Install AMD Software

We need to install some tools inside of our Debian LXC:

apt update && apt upgrade -y
cd /tmp
wget https://repo.radeon.com/amdgpu-install/6.2.4/ubuntu/noble/amdgpu-install_6.2.60204-1_all.deb
sudo apt install ./amdgpu-install_6.2.60204-1_all.deb
amdgpu-install --usecase=rocm --no-dkms

You can also refer to the official guide: Quick Start Installation Guide – ROCm (at the very least, check if the links are still valid by the time you’re reading this).

The download might take a while. Since I have a new AMD RX 7900 XTX, I need to use:

export HSA_OVERRIDE_GFX_VERSION=11.0.0
echo 'export HSA_OVERRIDE_GFX_VERSION=11.0.0' >> ~/.bashrc
source ~/.bashrc
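One catch with the `echo ... >> ~/.bashrc` line: every time you re-run it (and you will, if you reinstall as often as I did), it appends another copy. A small sketch of an idempotent variant; `append_once` is just a name I picked:

```shell
# append_once LINE FILE
# Append LINE to FILE only if that exact line is not already in it.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x: match whole lines only, -F: treat LINE as a fixed string, -q: quiet
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   append_once 'export HSA_OVERRIDE_GFX_VERSION=11.0.0' ~/.bashrc
```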

The next step is to test if everything is working. We’ll use amdgpu_top for this (you can skip this, but I wouldn’t):

sudo apt install libdrm-dev
cd /tmp
wget https://github.com/Umio-Yasuno/amdgpu_top/releases/download/v0.10.1/amdgpu-top_without_gui_0.10.1-1_amd64.deb
dpkg -i amdgpu-top_without_gui_0.10.1-1_amd64.deb 
amdgpu_top

You should see the name of your GPU with some live values below it.

Run this command inside of the LXC to see if everything worked correctly:

/opt/rocm/bin/rocminfo

Somewhere in that “rocminfo” output you should see your GPU (and a bunch of other nerd stuff):

  Marketing Name:          Radeon RX 7900 XTX
  Vendor Name:             AMD

🤩 Sweet!
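If you don't feel like scrolling through the whole rocminfo dump, you can filter out just the names. A minimal sketch that assumes the "Marketing Name:" line format shown above; `gpu_names` is a made-up helper.

```shell
# gpu_names
# Read rocminfo output on stdin and print the value of every "Marketing Name:" line.
gpu_names() {
    sed -n 's/^[[:space:]]*Marketing Name:[[:space:]]*//p' | sed 's/[[:space:]]*$//'
}

# Usage inside the LXC:
#   /opt/rocm/bin/rocminfo | gpu_names
#   /opt/rocm/bin/rocminfo | gpu_names | grep -q '7900 XTX' && echo "GPU visible to ROCm"
```

Expect your CPU to show up in that list too, since rocminfo lists it as an agent as well.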

Installing Ollama

This will also take a while. Just follow the guide on the Ollama website. Here’s all you need to do:

apt install curl 
cd /tmp
curl -fsSL https://ollama.com/install.sh | sh

Since curl is not installed by default in the Debian LXC we’re using, we’ll need to install it first (if you followed this guide, you already have it). Then, we’ll run the install script from the Ollama website. Be patient: the download takes a while since it pulls about 30GB of data.

By the way, I love the Ollama website. The simple black-and-white design with rounded borders? 🤌 I’m a sucker for minimalistic aesthetics. (I hope you like my blog’s design too! 🕺)

Next Step: Testing with a Smaller Model

Before downloading the DeepSeek 32B model, we’ll first test with a smaller DeepSeek version. If your GPU matches mine, the larger model should work fine.

You can check the available models on the Ollama Models website.

Now, run this command—it’s going to download a 1.5GB file, so the wait time depends on your internet speed:

ollama pull deepseek-r1:1.5b

You can then test:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Tell me a funny story about my best friend Karl. 300 characters maximum.",
  "stream": false
}' | jq .

Once upon a time, in the quiet town of Pawsley, there was Karl, a beloved kind cat who adored his three feline friends: Sam, Max, and Lily. Karl always had a warm smile and a habit of aiding others, often finding humor in unexpected places.

One crisp autumn afternoon, Karl spotted a bag of marbles on the park’s bench. Curious, he decided to play with them. To everyone’s surprise, a man walked by, holding a jar full of coins instead of marbles. “Hey, it’s you!” Karl exclaimed. The man, initially unimpressed, asked for his change. Karl suggested taking one marble in exchange and gave him the coins.

“Thank you,” the man thanked. Karl offered him a glass of water, knowing the jar was empty. “That’ll keep me happy,” he said. But as the water filled the glass, a strange ripple appeared on the floor. The man ran over to check but found nothing—no coins, just a tiny, pink spot.

“Wait, is it my pet?!” Karl pointed. The man laughed, looking suspiciously at Karl. “You saw that marble in there.” He handed Karl a piece of paper and said,”I need your help.”

Karl rolled on his back and rolled into the street. From behind him, he saw Sam, Max, and Lily with their usual spotlights on. Karl placed a paw on Sam’s paw, suggesting they might be related.

“Hey!” Max excitedly exclaimed. “My little sister is getting sick! Can you help me find her?”

Karl hesitated but followed Max to the neighbor’s yard. His friend found Lily, who was hiding somewhere. Karl helped Lily bring her back home, while Sam and Max watched in laughter. Karl’s kind words were met with cheers.

In the end, Karl became a beloved companion among Pawsley’s residents, not just for his love of marbles but also for his warmth and kindness.

Never interrupt an AI artist in the middle of their masterpiece. 🎨🤖 But yeah, the 300-Character-Limit was ignored.
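If you want to check a limit like that automatically instead of counting by hand, something along these lines should do. It's a sketch: `char_count` and `within_limit` are names I made up, and the usage lines assume the non-streaming response JSON has the text in its `response` field, as in the test above.

```shell
# char_count TEXT
# Print the number of characters in TEXT (wc -m counts characters in the current locale).
char_count() {
    printf '%s' "$1" | wc -m | tr -d '[:space:]'
}

# within_limit TEXT MAX
# Succeeds if TEXT has at most MAX characters.
within_limit() {
    [ "$(char_count "$1")" -le "$2" ]
}

# Usage, reusing the request from above:
#   story=$(curl -s http://localhost:11434/api/generate -d '{ ... }' | jq -r .response)
#   within_limit "$story" 300 || echo "limit ignored: $(char_count "$story") characters"
```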

Cleanup

Let’s get rid of the old files. But first, double-check before running these commands—they’ll wipe out all .sh and .deb files in /tmp.

ls -lh /tmp/*.sh /tmp/*.deb

The command above shows you what the command below will delete. Proceed if the displayed files are trash. The deleted files are irreversibly gone after this command:

rm -f /tmp/*.sh /tmp/*.deb
apt autoremove

Testing OpenWebUI with DeepSeek-R1 32B

The prompt I gave DeepSeek was “Give me a 10 word super short story about a cat named Paul.” – which is the name of my actual cat.

Well, not quite 10 words, but close enough. This was a cold start, meaning the model wasn’t loaded into VRAM yet—so yeah, it took a bit longer.

DeepSeek really put in some serious effort here, like it was deep in thought, sweating over every word. 😂 Considering all that, the result is absolutely hilarious:

Okay, so I need to come up with a 10-word super short story about a cat named Paul. Hmm, let’s see… The user gave an example: “Paul the cat climbed the fence, chasing a bird, then napped in the sun.” That’s pretty straightforward but conveys a little action and a bit of character.

I should think about what elements make a good 10-word story. It needs to have a clear subject (Paul), some action or activity, maybe an obstacle or goal, and perhaps a resolution or consequence. Since it’s so short, each word counts, so I need to be concise but still vivid.

Let me brainstorm some ideas about Paul the cat. Maybe he’s curious, playful, adventurous, or lazy. The example has him chasing a bird and then napping. Perhaps another activity? Maybe exploring somewhere, finding something, or having an interaction with another animal or person.

I should think of a simple structure: subject (Paul), action verb, some context, maybe an obstacle or additional detail, and a resolution. For example, “Paul pounced on the mouse but it escaped.” That’s 8 words. Maybe add more details to make it 10.

Or perhaps something like “Paul sat by the window watching birds chirp.” That’s also 8 words. Maybe include movement: “Paul raced through the garden chasing a butterfly.”

Wait, the user provided an example that starts with Paul doing an action (climbed), then another action (chasing), and ends with resting (napped). So perhaps following a similar structure would work well.

Let me try: “Paul explored the attic, found a toy, and took a nap.” That’s 9 words. Maybe add more detail or action. Or “Paul climbed the tree, met another cat, then curled up to sleep.”

Wait, that’s 10 words. Hmm, but it feels a bit flat. Maybe make the interaction more engaging: “Paul chased a squirrel around the yard and finally rested.” That’s 9 words. Could add something like “in the shade” at the end for 10.

Alternatively, maybe include some emotion or consequence: “Paul pounced on the ball, rolled over, and fell asleep.”

Wait, let me check word count. Each idea needs to be exactly 10 words. Let’s see:

  • Paul the cat climbed the fence, chasing a bird, then napped in the sun. (10 words)

So that’s the example given. I can try something similar but different.

Maybe focus on a different activity: “Paul discovered a sunny spot, curled up, and purred softly.” That’s 9 words. Maybe add more action or detail to make it 10.

Alternatively: “Paul leaped onto the counter, knocked over a bowl, then ate the leftovers.” That’s 10 words but maybe a bit long in description.

Another angle: “Paul wandered through the park, met a dog, and they played together.” That’s 9 words. Add something like “all day” at the end for 10.

Wait, perhaps include some obstacle or challenge: “Paul tried to catch a fish in the pond but got wet.” That’s 10 words. It includes an attempt, action, consequence.

Or maybe something more peaceful: “Paul stretched lazily on the rug, yawned, and drifted off.” 9 words. Add “to sleep” for 10.

I think I need to ensure each word adds value without being redundant. Let me try another approach:

  • Paul chased a butterfly in the garden until it flew away. (That’s 8 words. Maybe add “then rested” at the end: 10 words.)

Wait, but that might be too similar to the example.

Alternatively, focus on something unique about Paul: “Paul, the curious cat, explored every nook and cranny.” That’s 9 words. Maybe add “all day long” for 12; that’s too many.

Perhaps include an interaction with another animal or a person: “Paul meowed at the door, demanding attention, then curled up.” 9 words. Add “by the fire” for 10.

Wait, perhaps combine actions and consequences more effectively. Let me think of a scenario where Paul does something that leads to a natural conclusion in just 10 words.

How about: “Paul sneaked into the kitchen, stole a treat, then hid under the bed.” That’s 9 words. Maybe add “to digest” for 10.

Or maybe something more adventurous: “Paul climbed a tall tree, got stuck, and miaowed for help.” That’s 9 words. Add “from his owner” for 11; that’s too many.

Alternatively: “Paul caught a moth in the dark, played with it, then slept soundly.” 9 words. Maybe add “through the night” for 10.

I think I’m overcomplicating it. Let me go back to the example and see if I can structure something similar but different. The example has two actions (climbed, chasing) followed by a rest. So maybe another set of actions leading to a consequence or resolution.

Perhaps: “Paul chased his tail in circles until he panted.” That’s 8 words. Maybe add “then stopped to rest” for 10.

Alternatively: “Paul jumped on the sofa, knocked over a lamp, then slept.” But that’s 9 words; maybe add “peacefully” at the end for 10.

Wait, perhaps focus on Paul’s interaction with an object or another animal. Let me think of something unique but still within the 10-word limit.

Another idea: “Paul found a cozy spot under the stairs and napped.” That’s 8 words. Maybe add “after exploring all day” for 10.

Wait, that would be “Paul found a cozy spot under the stairs after exploring all day.” That’s 9 words. Hmm.

Alternatively: “Paul purred contentedly while basking in sunlight.” 7 words. Add some activity before that to make it 10.

I think I need to stick to a structure where Paul does something, then another action or consequence, and maybe ends with a state or rest.

Let me try again: “Paul climbed the fence, chased a butterfly, then napped in shade.” That’s 9 words. Maybe change “then” to “and” for flow: “climbed, chased, and napped” but that might not add up correctly.

Wait, let’s count the example given: “Paul the cat climbed the fence, chasing a bird, then napped in the sun.” That’s 10 words. So it includes his name, his action (climbed), context (the fence), another action (chasing), and the consequence (napped in the sun). I need to mirror that structure but with different actions.

So maybe: “Paul the cat explored the garden, found a mouse, then relaxed by the pond.” That’s 10 words. It follows a similar flow—action, context, additional action, consequence.

Alternatively: “Paul the cat pounced on a toy, rolled around, and fell asleep.” 9 words. Add “on the rug” for 10.

Wait, maybe make it more dynamic: “Paul the cat raced through the park, met a friend, then rested under a tree.” That’s 10 words. It shows activity, interaction, and rest.

I think that’s a good structure—action, context, another action or interaction, consequence. So putting it all together, I can create a concise story within the word limit.

– DeepSeek-R1:32B thoughts
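When you call the model through the API instead of OpenWebUI, all that thinking can land in the response text too. A rough sketch for cutting it out, assuming the reasoning is wrapped in `<think>` and `</think>` tags that sit on their own lines (that is what I saw, but it may differ between versions); `strip_think` is a made-up name.

```shell
# strip_think
# Read model output on stdin, drop everything from a <think> line through the
# matching </think> line, then drop any blank lines before the actual answer.
strip_think() {
    sed '/<think>/,/<\/think>/d' | sed '/./,$!d'
}

# Usage:
#   ollama run deepseek-r1:32b "Give me a 10 word story about a cat named Paul." | strip_think
```

If the tags ever share a line with the answer, this simple filter will eat too much, so check the raw output once first.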

Here you can see the electricity usage: a solid 450W for two minutes until things calmed down. The server usually “idles” at around 50-53W. Since a local SIEM, this website, and a lot of other stuff run on it, it is never really idle.
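To put those numbers in perspective, energy is just power times time. A back-of-the-envelope sketch (`energy_wh` is a helper I made up; it treats the 450W as constant over the full two minutes, which is a simplification):

```shell
# energy_wh WATTS SECONDS
# Print the energy in watt-hours: W * s / 3600, rounded to one decimal.
energy_wh() {
    awk -v w="$1" -v s="$2" 'BEGIN { printf "%.1f\n", w * s / 3600 }'
}

# The two-minute burst at 450W:
#   energy_wh 450 120    # prints 15.0
# Only the extra draw over a ~50W baseline:
#   energy_wh 400 120    # prints 13.3
```

So the whole cat story cost roughly 15 Wh, or 0.015 kWh.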

Both hardware and software went all in, pushing their limits… all for my short, 10-word cat story:

This does not prove that DeepSeek-R1 is a waste of electricity that produces subpar results. It shows that you probably do not need an advanced thinking model for such a simple task.

Check out dolphin-mistral: even on a cold start it took only 4 seconds, and the second message took less than a second:

Bonus: ComfyUI

The Official Installation Guide: How to Install ComfyUI on Linux

Alright, OpenWebUI now supports ComfyUI for image generation (still in beta at the time of writing). Here is the super quick start for ComfyUI. Run these commands in your LXC:

apt update && apt upgrade -y && apt install git python3.11-venv -y
mkdir ~/comfy && cd ~/comfy
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ~/comfy/ComfyUI
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
pip install -r requirements.txt

Make sure to double-check the versions and links against your system and the official guide I linked at the top.

Once everything is set up, give it a test run to confirm it’s working as expected:

python main.py

If your output confirms that everything is running smoothly, go ahead and turn it into a system service:

nano /etc/systemd/system/comfyui.service

Paste the following into the service file:

[Unit]
Description=ComfyUI Service
After=network.target

[Service]
User=root
Group=root
WorkingDirectory=/root/comfy/ComfyUI
ExecStart=/root/comfy/ComfyUI/venv/bin/python /root/comfy/ComfyUI/main.py --listen 0.0.0.0
Restart=always

[Install]
WantedBy=multi-user.target

Now reload and start the comfyui.service:

sudo systemctl daemon-reload
sudo systemctl enable comfyui.service
sudo systemctl start comfyui.service
sudo systemctl status comfyui.service
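The status output only tells you that the process started, not that the web UI is answering yet. Here is a small sketch of a generic retry loop for that; `retry` is my own helper name, and the usage line assumes ComfyUI is on its default port 8188 (the service file above does not change it).

```shell
# retry TRIES DELAY CMD...
# Run CMD until it succeeds, at most TRIES times, sleeping DELAY seconds in between.
# Returns 0 as soon as CMD succeeds, 1 if every attempt failed.
retry() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$tries" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage: wait up to about a minute for the ComfyUI web UI
#   retry 30 2 curl -fsS -o /dev/null http://localhost:8188/ && echo "ComfyUI is up"
```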

ComfyUI Manager

ComfyUI Manager is the ultimate quality-of-life add-on for ComfyUI, making model and plugin installation a breeze. Just get it—you’ll thank me later:

cd ~/comfy/ComfyUI/custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager comfyui-manager
sudo systemctl restart comfyui.service

Restart takes a little while since it downloads some dependencies, but no worries—give it a minute or two, and everything should run smoothly.

Conclusion

You should now have a working Debian LXC for local AI tasks, equipped with ROCm, Ollama, OpenWebUI, and ComfyUI.

Setting this up, especially with newer AMD GPUs, used to be a bit of a headache. I’m honestly glad I waited a bit—getting this to work wasn’t exactly painless, and I had to do a lot of digging through forums. But hey, it’s running now!

Next up, get some models from the Ollama page. If you have an AMD RX 7900 XTX, you should be able to run the 32B version of DeepSeek-R1 effortlessly. Technically, you can load models larger than your VRAM, but be warned—it’ll slow things down.
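As a rough rule of thumb, you can compare a model's download size against your VRAM before pulling it. This is only a sketch with assumptions baked in: `fits_in_vram` is a made-up helper, the 2GB default headroom for context is my guess, and the ~20GB figure for the 32B model is an assumed download size that you should check on the Ollama page.

```shell
# fits_in_vram MODEL_GB VRAM_GB [HEADROOM_GB]
# Succeeds if the model size plus headroom fits into VRAM.
# HEADROOM_GB defaults to 2 (an assumption, real usage depends on context length).
fits_in_vram() {
    awk -v m="$1" -v v="$2" -v h="${3:-2}" 'BEGIN { exit !(m + h <= v) }'
}

# Usage with the 24GB of the RX 7900 XTX:
#   fits_in_vram 20 24  && echo "32B should fit"           # 20 is an assumed size
#   fits_in_vram 404 24 || echo "671B will not fit in VRAM"
```

Once a model is loaded, `ollama ps` should show how much of it actually ended up on the GPU.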

Also, don’t forget to secure your AI server and add valid SSL certificates. I wrote a separate post about that.

I set up firewall rules centrally on my Dream Machine, so my AI server can only communicate with the reverse proxy.

If your setup is different, you might want to handle this locally using UFW.
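For the UFW route, the idea is the same as on my Dream Machine: deny everything incoming, then allow only the reverse proxy on the ports you need. This is a sketch I have not run on this server. `ufw_rules` is a hypothetical helper that only prints the commands, the IP 192.168.1.10 is a placeholder for your proxy, 11434 is the Ollama port used earlier, and 8188 is ComfyUI's default port.

```shell
# ufw_rules PROXY_IP PORT...
# Print (do not run) UFW commands that only let PROXY_IP reach the given TCP ports.
ufw_rules() {
    proxy=$1
    shift
    echo "ufw default deny incoming"
    echo "ufw default allow outgoing"
    for port in "$@"; do
        echo "ufw allow from $proxy to any port $port proto tcp"
    done
    echo "ufw enable"
}

# Usage (review the output, then run the lines you agree with as root):
#   ufw_rules 192.168.1.10 11434 8188
```

If you manage the LXC over SSH, add an allow rule for that before enabling the firewall, or you'll lock yourself out.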

These Goodbye Messages Are Brought to You by AI

Aww, yaaaay! I totally loooove all your amazing readers <3 Wishing you guys the biiiggest luck with everything you dooove, okay? Dayyyyyybeeee~ 💕✨

– qwen2.5:32b

Love y’all, keep slaying in everything you do <3 Can’t wait to see what awesome things you have coming up. Bye for now! 👋🏼😉

– dolphin-mistral
Goodbye image I made with Flux Schnell and ComfyUI

EDIT 1 (04.02.2024)

I have downloaded and tested almost all popular models now and the only actually usable one for daily business like rewriting German emails or asking for expertise in German is qwen2.5 so far.

The uncensored Dolphin models are a lot of fun, but also kind of stink with German, which is of course because their underlying models aren’t good at German either.
