Self-hosting an AI team sounds heavier than it is. You're not training models or running a GPU cluster — you're running a handful of coordinated agents, a task board, and some glue services in Docker, most of which spend their time waiting on a cloud model to answer. The result is that a modest VPS does the job, and the specs are easy to reason about once you know what actually consumes resources.
Here's the honest breakdown of what to provision, and why.
A VPS (virtual private server) is a rented slice of a physical server with its own CPU, RAM, and disk, running your own operating system. For a self-hosted AI team it's the always-on box that runs the agents, the task board, and the control panel — your data stays on it instead of on someone else's SaaS.
RAM is the number that matters most
Memory is the constraint that bites first. A self-hosted AI office runs several containers at once — the agents, a task board, a control panel, a database, background sync — and that footprint has a floor.
- 8 GB is the hard minimum. Below it, the stack starts and then falls over under load as memory runs out. A 4 GB box will run out of memory once everything is live.
- 16 GB is the comfortable recommendation. It leaves headroom for spikes, more agents working at once, and the occasional large task.
- Add ~8 GB on top if you plan to run local helper models on the same box (more on that below).
If a provider offers 8 GB and 16 GB tiers at close prices, take the 16 — headroom is cheap insurance against the one busy afternoon that would otherwise take the office down.
CPU: four cores to start, more for parallel work
The agents are mostly I/O-bound when they call a cloud model — they send a request and wait. That means CPU is rarely the bottleneck for a small team.
- 4 vCPU handles a small team on light, sequential workloads.
- 6–8 vCPU is better when several agents run in parallel or the workload is heavy — code tasks, research runs, and background sync all competing for time.
CPU becomes the dominant factor only if you run local models on the box, which shifts the heavy lifting from the provider back onto your own silicon.
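To see why I/O-bound agents need so little CPU, here is a minimal sketch in Python. It is not the office's actual agent code: `fake_model_call` is a hypothetical stand-in that simulates a cloud round trip with a 0.2-second sleep, and the agent count is arbitrary. It compares wall-clock time with CPU time for a round of concurrent "calls".

```python
import asyncio
import time


async def fake_model_call(agent: str, latency_s: float = 0.2) -> str:
    """Hypothetical stand-in for a cloud model request: the agent only waits."""
    await asyncio.sleep(latency_s)  # simulated network round trip, no real API
    return f"{agent}: done"


async def run_team(n_agents: int = 8) -> list[str]:
    """Run every agent's call concurrently on a single thread."""
    calls = (fake_model_call(f"agent-{i}") for i in range(n_agents))
    return await asyncio.gather(*calls)


def measure(n_agents: int = 8) -> tuple[float, float]:
    """Return (wall-clock seconds, CPU seconds) for one round of calls."""
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    asyncio.run(run_team(n_agents))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu
```

Calling `measure(8)` should show the eight waits overlapping, so the round takes roughly as long as one call, while the CPU seconds stay a small fraction of the wall-clock time. Real agents do more than sleep, so treat this as an illustration of the pattern rather than a benchmark.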
Disk: 60 GB minimum, SSD
Storage is the least demanding of the three. Budget:
- ~60 GB minimum for Docker images, the database, logs, and artifacts.
- 80–100 GB for comfortable headroom if the office generates a lot of files.
Prefer SSD over spinning disk — the difference in responsiveness is immediately obvious when containers start and the database is queried.
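If you already have a box, the thresholds above can be turned into a quick preflight check. The following is a sketch under stated assumptions: the function names are made up for illustration, the 10% slack is a guess to account for plans that report slightly less usable memory or disk than advertised, and `read_host` relies on `os.sysconf`, which is available on Linux but not on every platform.

```python
import os
import shutil

# Thresholds taken from the sizing guidance in this article.
MIN_RAM_GB, RECOMMENDED_RAM_GB = 8, 16
MIN_VCPU = 4
MIN_DISK_GB = 60
# Assumption: allow 10% slack, since an "8 GB" plan usually reports a bit less.
SLACK = 0.9


def check_host(ram_gb: float, vcpus: int, disk_gb: float) -> list[str]:
    """Return warnings for a host; an empty list means no sizing concerns."""
    warnings = []
    if ram_gb < MIN_RAM_GB * SLACK:
        warnings.append(f"RAM {ram_gb:.1f} GB is below the {MIN_RAM_GB} GB minimum")
    elif ram_gb < RECOMMENDED_RAM_GB * SLACK:
        warnings.append(
            f"RAM {ram_gb:.1f} GB meets the minimum, not the "
            f"{RECOMMENDED_RAM_GB} GB recommendation"
        )
    if vcpus < MIN_VCPU:
        warnings.append(f"{vcpus} vCPU is below the {MIN_VCPU} vCPU baseline")
    if disk_gb < MIN_DISK_GB * SLACK:
        warnings.append(f"Disk {disk_gb:.0f} GB is below the {MIN_DISK_GB} GB minimum")
    return warnings


def read_host(path: str = "/") -> tuple[float, int, float]:
    """Read total RAM (GiB), CPU count, and disk size (GB) on a Linux host."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).total
    return ram_bytes / 2**30, os.cpu_count() or 1, disk_bytes / 1e9
```

On the server itself, `check_host(*read_host())` returns the list of warnings for that machine. The check looks only at totals, not at whether the disk is an SSD or how much memory is currently free.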
Do you need a GPU? Usually not
This is the most common misconception. If your agents call a cloud model through your own API key, all the model computation happens on the provider's side — your VPS just orchestrates. No GPU required, and no GPU-priced server.
A GPU only enters the picture if you deliberately choose to run large local models yourself. That's an optional path for teams with privacy or cost reasons to keep inference in-house, and it changes the hardware conversation entirely. For most, bring-your-own-key to a cloud model is simpler and cheaper.

The local-model trade-off
There's a middle ground worth knowing. A small local helper model — for routine overhead like context compression, titles, and page extraction — runs comfortably on CPU with about 8 GB of extra RAM, no GPU needed. It trims paid-token spend without a big hardware jump. Running *large* local models for the agents' primary work is the step that demands serious RAM and ideally a GPU.
What it actually costs
Real numbers matter, so here they are. At mainstream providers, an 8 GB server runs roughly $35–52/month and a 16 GB server roughly $69–96/month. Hetzner's shared and ARM lines sit at the low end; managed clouds like DigitalOcean and Vultr sit higher. Add a modest bump if you provision extra RAM for local helpers.
Compared with per-seat AI SaaS — where a handful of subscriptions across a small team easily clears several hundred dollars a month — a single well-sized VPS plus your own model key is usually the cheaper column. (We break the full comparison down in the self-hosted vs SaaS cost calculator.)
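The arithmetic behind that comparison is simple enough to sketch. In the Python below, only the VPS price ranges come from this article; the seat count, per-seat price, and monthly model API spend are hypothetical inputs you would replace with your own figures, and `compare` is an illustrative name, not part of any tool mentioned here.

```python
# Monthly VPS price ranges in USD, as quoted in this article.
VPS_MONTHLY_USD = {8: (35, 52), 16: (69, 96)}


def compare(ram_gb: int, seats: int, seat_price_usd: float,
            model_api_usd: float) -> dict:
    """Compare a VPS plus your own model key against per-seat SaaS.

    seats, seat_price_usd, and model_api_usd are hypothetical inputs;
    plug in the numbers from your own subscriptions and API bills.
    """
    vps_low, vps_high = VPS_MONTHLY_USD[ram_gb]
    self_hosted = (vps_low + model_api_usd, vps_high + model_api_usd)
    saas = seats * seat_price_usd
    if self_hosted[1] < saas:        # even the high estimate beats SaaS
        verdict = "self-hosted"
    elif self_hosted[0] > saas:      # even the low estimate loses to SaaS
        verdict = "saas"
    else:
        verdict = "too close to call"
    return {"self_hosted_usd": self_hosted, "saas_usd": saas, "verdict": verdict}
```

As an illustration with made-up inputs, `compare(16, 5, 60, 100)` puts a 16 GB server plus $100 of API usage at $169 to $196 a month against $300 for five $60 seats, so the verdict is "self-hosted". A single cheap seat flips the result, which is why the inputs matter more than the formula.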
Size it in seconds
Rather than guess, the VPS requirements estimator takes your team size, workload, and whether you want local helpers, and returns a concrete RAM / vCPU / disk recommendation with an estimated monthly cost. It's rule-based and needs no signup — a thirty-second sanity check before you rent a box.
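For a sense of what a rule-based sizing check looks like, here is a sketch in Python built only from the thresholds in this article. It is not the estimator's actual logic, which isn't published here: the function name, the input flags, and the choice of 6 vCPU and 80 GB as the lower ends of the quoted ranges are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    ram_gb: int
    vcpu: int
    disk_gb: int


def estimate(heavy: bool, local_helpers: bool,
             many_files: bool = False, comfortable: bool = True) -> Spec:
    """Map workload flags to a spec using this article's rules of thumb.

    heavy: several agents in parallel, or code/research-heavy work.
    local_helpers: a small local helper model runs on the same box.
    many_files: the office generates a lot of files and artifacts.
    comfortable: take the recommended 16 GB rather than the 8 GB floor.
    """
    ram = 16 if comfortable else 8
    if local_helpers:
        ram += 8                      # roughly 8 GB extra for helper models
    vcpu = 6 if heavy else 4          # article says 6-8; lower bound assumed
    disk = 80 if many_files else 60   # article says 80-100; lower bound assumed
    return Spec(ram_gb=ram, vcpu=vcpu, disk_gb=disk)
```

With these rules, `estimate(heavy=True, local_helpers=True, many_files=True)` gives 24 GB of RAM, 6 vCPU, and 80 GB of disk. Cost is left out on purpose, since the article gives price ranges only for the 8 GB and 16 GB tiers.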
Once you know the specs, deploying the office itself is a single command on that server. The setup guide walks the whole thing end to end.
FAQ
How much RAM do I need to self-host an AI team?
Plan for at least 8 GB of RAM as a hard minimum, with 16 GB recommended for comfortable headroom. A 4 GB server will run out of memory once Docker, the task board, and the agents are all running. If you also run local helper models on the box, add roughly 8 GB more.
How many CPU cores does a self-hosted AI office need?
Four vCPU is the baseline for a small team on light workloads. Six to eight vCPU is better for medium-to-heavy use where several agents work in parallel. The agents themselves are mostly I/O-bound when calling a cloud model; CPU matters more if you run local models.
How much does a suitable VPS cost per month?
Realistically, an 8 GB server runs about $35–52/month and a 16 GB server about $69–96/month at mainstream providers like Hetzner, DigitalOcean, and Vultr. Hetzner's shared and ARM lines sit at the low end of those ranges; managed clouds like DigitalOcean and Vultr sit higher.
Do I need a GPU to self-host an AI team?
No, not if your agents call a cloud model through your own API key — all the heavy computation happens on the provider's side. A GPU only matters if you want to run large local models yourself, which is optional.
How much disk space is required?
Around 60 GB is a sensible minimum and 80–100 GB gives comfortable headroom for Docker images, the database, and generated artifacts. SSD storage is strongly preferred over spinning disks for responsiveness.
Does the server need to be a specific operating system?
A clean Linux server — Ubuntu or Debian — with root or sudo access is the standard target. The stack runs in Docker, so the host mainly needs a current kernel and enough resources.
