Realms · Sovereign Infrastructure · AI Cost Repatriation

I cut my AI bill 76% in one month
— without writing a line less of code.

That is what a Realm does. A Realm is sovereign infrastructure you own and run — on the hardware you already have — at carrier-grade reliability, with predictable cost.

I built mine over twelve months. It runs my entire digital life: 14 production applications, my websites, my AI, my data, my CI/CD. The receipt is below.

re2me is not a product catalog. It is an advisory and early-adopter path for people and small organizations ready to test whether a Realm makes practical sense. Engagements are by scope. Start by asking reBe first.

If any of this sounds familiar

Your cloud bill keeps surprising you.

Last quarter it went up again. Nobody on your team can fully explain why. Egress, storage tier transitions, an AI feature that ran hot for a week, an autoscaler that did not scale back down. You stopped opening the dashboard a long time ago.

Your subscriptions stack like sediment.

Forty-seven services. Each one indispensable, individually. Together they cost more than a junior engineer. Each annual renewal arrives with a price increase you cannot meaningfully refuse.

You have hardware you are not using.

Three desktops gathering dust. Two laptops that get replaced before they are tired. A mini-PC you bought for a project that never started. Meanwhile you rent compute by the second from companies that built theirs on hardware just like yours.

The math has flipped quietly. Hardware became cheap. Cloud kept getting more expensive. The infrastructure to bridge them (open-source, production-grade, carrier-tested) is now mature enough to run on your hardware without a Kubernetes engineer in the room.

That is what a Realm is. The receipt is below.

What is a Realm?

A sovereign cluster.
On your hardware. Running production.

A Realm is the same architecture pattern carriers use to run their networks: K3s clusters, NATS messaging, and GitOps deployment, all observable, secured, and automated. It is packaged so an SMB, a department, or a single founder can run it on hardware they already own.

It is sovereign because your data never leaves your boundary. It is carrier-grade because the components are battle-tested at telecom scale. And it is yours because every byte of configuration lives in your git repository — readable, reproducible, recoverable.

CORE

Your hardware

K3s cluster on your existing machines. 3 nodes minimum for HA, scales to dozens. NATS lattice for messaging. Longhorn + NFS + MinIO for storage. Cattle, not pets.

EDGE

Cloudflare layer

Tunnel for reachability without inbound ports. Zero Trust for access. Workers for edge compute. KV + R2 for replicated state and object storage. No vendor lock-in — your Realm still runs without it.

EXPANSION

Wherever you need

Realm nodes can extend to a CSP's edge, a cloud provider for burst, another office, a co-lo. Same GitOps pattern. Same observability. The Realm spans where you do.

INTELLIGENCE

AI routing

Routes queries to the cheapest competent model. Local Ollama for bulk. Free tiers (Qwen, Gemini) for what they do well. Frontier APIs (Claude, GPT, etc.) only for hard reasoning. Your bill becomes deliberate.
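As a rough illustration of "cheapest competent model" routing, here is a minimal Python sketch. It is not the routing layer described on this page. The model names, capability levels, prices, and the `route` function are all hypothetical, chosen only to show the selection rule: filter to models that can handle the task, then pick the cheapest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical label, not a real endpoint
    tier: str                  # "local", "free", or "frontier"
    cost_per_1k_tokens: float  # illustrative price, assumed for this sketch
    capability: int            # 1 = bulk work, 2 = general, 3 = hard reasoning


# Assumed catalogue: one entry per tier named in the text above.
MODELS = [
    Model("ollama-local", "local", 0.0, 1),
    Model("free-tier", "free", 0.0, 2),
    Model("frontier-api", "frontier", 0.015, 3),
]


def route(required_capability, models=MODELS, allow_external=True):
    """Return the cheapest model that is competent for the task.

    If allow_external is False, only local models are considered, which
    models the case where data must stay inside the boundary.
    """
    candidates = [
        m for m in models
        if m.capability >= required_capability
        and (allow_external or m.tier == "local")
    ]
    if not candidates:
        raise LookupError("no permitted model is competent for this task")
    # Cheapest first; ties go to the least capable (smallest) model.
    return min(candidates, key=lambda m: (m.cost_per_1k_tokens, m.capability))


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, "->", route(level).name)
```

In this sketch the frontier model is only selected when nothing cheaper meets the required capability, which is the behaviour the paragraph above describes.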

CARRIER-GRADE
High availability with clearer control of cost, locality, and trust.

K3s server nodes run in a quorum. Storage is replicated across nodes (Longhorn) and off-site (MinIO → R2/S3). Cloudflare Tunnel means no inbound ports, no static IP, no firewall complexity — your Realm is reachable from anywhere, but only by people you have explicitly granted Zero Trust access.
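The quorum point can be made concrete with a little arithmetic. The sketch below assumes the majority-quorum rule used by etcd-style datastores (more than half of the server nodes must agree). The function names are illustrative and are not part of K3s.

```python
def quorum_size(server_nodes: int) -> int:
    """Smallest majority of a cluster with this many server nodes."""
    if server_nodes < 1:
        raise ValueError("a cluster needs at least one server node")
    return server_nodes // 2 + 1


def tolerated_failures(server_nodes: int) -> int:
    """How many server nodes can be lost while a majority remains."""
    return server_nodes - quorum_size(server_nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} server node(s): quorum {quorum_size(n)}, "
              f"survives {tolerated_failures(n)} failure(s)")
```

Under this rule a two-node cluster tolerates no failures, which is why three nodes is the practical minimum for high availability.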

When the public internet wobbles, your internal applications keep running. When Cloudflare is healthy, your customers reach you through the Tunnel at edge-cached latencies. The Realm is the centre of gravity. Cloudflare is the bridge — not the boss.

Where are you right now?

Three doors.
Pick the one that sounds like you.

Each door maps to a different starting point.

WHERE YOU ARE

Your CFO sees the cloud bill and frowns. Your platform engineer is burning out on Kubernetes. Half your office hardware sits idle after 6pm. You suspect there is a better way and you cannot put your finger on it.

WHERE TO START

Early-adopter engagements start with one real boundary, one useful workload, and one practical path. Pricing and scope depend on what is actually being proved.

A few branch machines can become a practical first Realm.
Six months after we finish

Picture yourself,
after.

Five things that are true today, and what is true once you run a Realm.

BEFORE: Last month's cloud bill was higher than the one before. The CFO asked why. I could not answer in detail.
AFTER: Last month's infrastructure cost: line-item visible, predictable, and approved by the CFO in 90 seconds.

BEFORE: We cannot run AI on customer data; compliance will not allow cloud egress. So our most valuable workflow uses no AI at all.
AFTER: AI runs inside our Realm. Customer data never leaves the perimeter. Frontier-class reasoning is available only when policy permits it.

BEFORE: Our platform engineer spends 60% of her time fighting Kubernetes, vendor consoles, and surprise renewals.
AFTER: Our platform engineer ships features. Realm-as-code handles the infrastructure. Everything is one git commit and one approval away.

BEFORE: Every vendor renewal becomes a price-hike negotiation. We have no leverage: switching cost is enormous.
AFTER: Renewals are an exception, not a season. Hardware is hardware. The few subscriptions we keep, we chose deliberately.

BEFORE: When the cloud goes down, we wait. When a region fails, we explain it to customers.
AFTER: The Realm runs on hardware in our building. When the public internet fails, our internal apps keep working. When Cloudflare is healthy, customers reach us through the Tunnel.

The receipt

My own Realm.
Twelve months. Real numbers.

Everything below is from my own books. No client numbers, no projections, no rounded-up estimates. The receipt is mine because I had to build the thing on myself first before asking anyone else to trust me with theirs.

My Anthropic bill, month by month.

In late 2025 I started using Claude Code for everything. November hit $5,655. February did it again at $5,505. In March I deployed the routing layer and switched 70% of the workload to Qwen Code and Gemini CLI (both free at my volume). I did not slow down. I did not write less code. I just stopped paying frontier prices for tasks that did not need frontier reasoning.

MONTH      SPEND     NOTE
Oct 25     $1,900    started Claude Code
Nov 25     $5,655    peak, Sonnet for everything
Dec 25     $514      holidays + Haiku discovery
Jan 26     $2,418
Feb 26     $5,505    back at peak, heavy dev
Mar 26     $565      Qwen Code + Gemini CLI live
Apr 26     $1,405    frontier-only for hard reasoning
May 26*    $847
*month-to-date

PEAK MONTH: $5,655 (Nov 2025)
POST-ROUTING: ~$1,000 / mo (Mar-May 2026)
SAVINGS, ANNUALISED: ~$50,000 / yr
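
For readers who want to check the arithmetic, here is a small Python sketch over the monthly figures listed above. It assumes the peak month is the baseline, as the summary figures appear to do. The page does not state which baseline its headline percentage uses, so the result depends on that choice. May 2026 is month-to-date, so the post-routing average is provisional. The function and variable names are illustrative.

```python
from statistics import mean

# Monthly Anthropic spend in USD, copied from the figures above.
# The 2026-05 value is month-to-date and understates a full month.
monthly_spend = {
    "2025-10": 1900, "2025-11": 5655, "2025-12": 514, "2026-01": 2418,
    "2026-02": 5505, "2026-03": 565, "2026-04": 1405, "2026-05": 847,
}
ROUTING_START = "2026-03"  # month the routing layer went live, per the text


def split_by_routing(spend, start):
    """Split monthly values into (before routing, from routing onward)."""
    before = [v for month, v in spend.items() if month < start]
    after = [v for month, v in spend.items() if month >= start]
    return before, after


def percent_reduction(baseline, current):
    """Reduction from baseline to current, as a percentage of baseline."""
    return (baseline - current) / baseline * 100


def annualised_savings(baseline, current):
    """Monthly difference projected over twelve months."""
    return (baseline - current) * 12


before, after = split_by_routing(monthly_spend, ROUTING_START)
peak = max(before)       # assumed baseline: the peak month
post_avg = mean(after)   # average of the post-routing months

if __name__ == "__main__":
    print(f"peak month: ${peak:,}")
    print(f"post-routing average: ${post_avg:,.0f} / mo")
    print(f"reduction vs peak: {percent_reduction(peak, post_avg):.1f}%")
    print(f"annualised savings vs peak: ${annualised_savings(peak, post_avg):,.0f}")
```

Swapping the baseline for the pre-routing average instead of the peak gives a smaller figure, so it is worth stating the baseline whenever the savings number is quoted.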

The 14 applications running on my Realm.

Each of these has a managed SaaS equivalent. The right column is what I would be paying today if I had not built the Realm. Sticker prices, from each vendor's public pricing page, for the smallest tier that covers my actual usage.

WHAT I RUN                     SAAS EQUIVALENT          $ / MO
NextCloud                      Dropbox, Google Drive    $60
Penpot                         Figma (team)             $75
Woodpecker CI                  GitHub Actions           $80
Zot Registry                   GitHub Container Reg     $35
MinIO                          AWS S3                   $90
Ollama                         OpenAI API (basic)       $200
Qdrant                         Pinecone (vector DB)     $70
Postgres                       Neon, PlanetScale        $50
Grafana + Prometheus + Loki    Datadog                  $240
NATS JetStream                 Confluent Kafka          $200
CyOS Auth                      Auth0, Clerk             $240
cert-manager + Traefik         Cloudflare TLS+LB        $30
ArgoCD                         CircleCI Deploys         $90

SaaS equivalent total (14 applications, smallest paid tier): $1,460 / mo
My actual cost (electricity + ISP overage + domain renewals): ~$45 / mo
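
The total can be checked by summing the rows listed above. The following Python sketch does only that. The dictionary keys are the row labels from the table, the prices are the listed sticker prices, and the variable names are illustrative.

```python
# Monthly SaaS-equivalent sticker prices in USD, one entry per table row.
saas_equivalent_usd = {
    "NextCloud": 60,
    "Penpot": 75,
    "Woodpecker CI": 80,
    "Zot Registry": 35,
    "MinIO": 90,
    "Ollama": 200,
    "Qdrant": 70,
    "Postgres": 50,
    "Grafana + Prometheus + Loki": 240,
    "NATS JetStream": 200,
    "CyOS Auth": 240,
    "cert-manager + Traefik": 30,
    "ArgoCD": 90,
}

ACTUAL_COST_USD = 45  # approximate monthly running cost quoted above

saas_total = sum(saas_equivalent_usd.values())
monthly_difference = saas_total - ACTUAL_COST_USD
yearly_difference = monthly_difference * 12

if __name__ == "__main__":
    print(f"SaaS equivalent total: ${saas_total:,} / mo")
    print(f"Difference vs actual cost: ${monthly_difference:,} / mo")
    print(f"Difference over a year: ${yearly_difference:,}")
```

The comparison covers subscription prices only. Hardware purchase and the time spent operating the cluster are not in either column.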

What I still pay for, deliberately: Anthropic for hard reasoning (~$1,000/mo now), mobile ($412/mo, Bell), internet (Rogers business tier), Cloudflare DNS ($14/mo across 8 domains), and electricity. Roughly $1,500/mo for everything. Apart from the Anthropic spend, most of that is mobile and ISP, not infrastructure.

Why now matters

It is no longer survival of the fittest.
It is survival of those who adapt early.

The pace of change in digital infrastructure has crossed a threshold. The next eighteen months will not look like the last five years. AI is no longer a feature — it is the substrate. Sovereignty is no longer a regulatory checkbox — it is competitive advantage. Hardware is no longer the constraint — it is the leverage.

The leaders who recognise this early, who adapt while the path is still open, will keep their margins, their data, and their independence. The ones who wait will spend the next decade negotiating with vendors whose interests have diverged from theirs.

A Realm is not a product I sell you. It is the architecture of staying in business on your own terms — and a community of practitioners learning to operate it together as the terrain keeps moving.

Start here

Ask reBe first.
Then decide whether a human call is worth it.

reBe should help structure the first conversation: what you own, what is painful, what data must stay local, which machines are available, who needs access, and what one proof would matter. If there is a fit, the next step can become an advisory call or an early-adopter scope.

Advisory calls

For leaders who need a clear read on whether sovereign infrastructure applies to their situation.

Early adopters

For individuals, tech enthusiasts, creators, and small businesses ready to prove a first Realm.

By scope

No public package pricing yet. The useful work depends on boundary, hardware, risk, timeline, and support needs.
