Local AI Guide

Local Models: Use What You Already Have

Most people do not need a brand new AI computer to get started with local models. Start with what you already own, learn how to use it, then upgrade when you actually know what is limiting you.

Local AI, Open Source Models, Consumer Hardware, Agents

If you have been following me for a while, you already know how to do a lot more with AI than most people do. If you are just starting, check my 3-step guide to get moving. No need to watch a 2-hour YouTube video or read a 10-page PDF.

Today I want to talk about local models.

There is a lot of pressure online to buy a Mac Mini, Mac Studio, or DGX Spark to run agents. We already discovered that most people do not need to do that. Start with the computer you already own.

But what if you want to run local models so you do not run out of tokens, do not depend on big companies changing prices and model behavior, or simply want to keep your data on your own machine?

There are a lot of open-source models you can download and run locally: Qwen, Gemma, Llama, Nemotron, and many others. But you do need hardware that matches what you are trying to run.

I do not want to get too technical because that is not the point of these posts. I want this to stay accessible, not turn into a deep dive on context windows, KV cache, tokens per second, quantization, or reasoning modes.

Leave that to the people building the models. Let us focus on making our lives and work easier, more automated, and more fun.

1. Run-everywhere models

This is the entry point. Think small models in the roughly 1B to 4B range (the B stands for billions of parameters, a rough measure of model size), things like smaller Qwen and Gemma variants.

Anything with a low B number will usually run on a lot of consumer computers, and in some cases even on phones with tools like Google AI Edge Gallery.

They are not incredibly powerful.
They are not replacing ChatGPT.
But they can answer questions, summarize content, help write, and in some setups even do light tool use or search.
They are a great way to start without spending money.

2. Dedicated GPU, gaming computer, or newer Mac

If you have access to an older gaming computer, a stronger workstation, or a newer Mac, this is the sweet spot for local AI.

Somewhere around 8GB to 24GB of VRAM, or enough unified memory on a Mac, opens up a lot more possibilities. This is where larger compressed (quantized) models start becoming practical, and where local agents become a lot more interesting.
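If you want a rough feel for why that memory range matters, here is a back-of-the-envelope sketch in Python. It only estimates the memory needed to hold the model weights (parameters times bits per weight, divided by 8 to get bytes). The function names and the 20% headroom factor are my own assumptions for illustration, not a rule from any tool, and real usage also grows with how much context you feed the model.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    params_billions * 1e9 parameters, each stored in bits_per_weight / 8 bytes,
    divided by 1e9 bytes per GB. The 1e9 factors cancel out.
    """
    return params_billions * bits_per_weight / 8


def fits_in_memory(params_billions: float, bits_per_weight: float,
                   memory_gb: float, headroom: float = 1.2) -> bool:
    """Rough check: do the weights, plus some headroom, fit in the given memory?

    The headroom factor is a hypothetical allowance for context and runtime
    overhead. Treat it as a guess, not a measured value.
    """
    needed_gb = estimate_weight_memory_gb(params_billions, bits_per_weight) * headroom
    return needed_gb <= memory_gb


if __name__ == "__main__":
    # Example model sizes (in billions of parameters) on a 24GB card,
    # compressed to 4 bits per weight versus uncompressed 16 bits per weight.
    for size in (4, 8, 32, 70):
        for bits in (4, 16):
            gb = estimate_weight_memory_gb(size, bits)
            verdict = "fits" if fits_in_memory(size, bits, 24) else "too big"
            print(f"{size}B at {bits}-bit: about {gb:.1f} GB of weights, {verdict} in 24 GB")
```

By this estimate, a 32B model compressed to 4 bits per weight needs about 16 GB for its weights, so it is a plausible fit for a 24GB card, while the same model at 16 bits needs about 64 GB. That gap is the whole reason compressed models matter on consumer hardware.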

These are the setups that can make tool calls, move your mouse, write and run code, do long web research, reason through problems, and build real things.

Are they cutting edge? No. They are usually a little behind the newest cloud models. But most people do not need the absolute newest thing to experiment, automate, and get real value.

They run locally. Usage is effectively unlimited. Nobody throttles you. Nobody tells you to come back in 5 hours because you hit your quota.

And you can always mix local and cloud. Let a stronger cloud model handle the hard thinking, then use local models for the high-volume, repetitive work when it makes sense.
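The local-plus-cloud mix can be sketched as a tiny routing rule. This is only an illustration of the idea, assuming you tag each task yourself. The Task fields, the choose_backend function, and the "local" and "cloud" labels are all hypothetical names I made up, not part of any real agent framework.

```python
from dataclasses import dataclass


@dataclass
class Task:
    prompt: str
    needs_top_reasoning: bool = False    # hard problem where a stronger model helps
    contains_private_data: bool = False  # data you want to keep on your own machine


def choose_backend(task: Task, cloud_quota_left: bool = True) -> str:
    """Return "local" or "cloud" for a task, using three simple rules."""
    # Rule 1: private data never leaves the machine.
    if task.contains_private_data:
        return "local"
    # Rule 2: hard thinking goes to the cloud while quota remains.
    if task.needs_top_reasoning and cloud_quota_left:
        return "cloud"
    # Rule 3: everything else runs locally, where usage is not metered.
    return "local"


if __name__ == "__main__":
    tasks = [
        Task("Summarize these 200 meeting notes"),
        Task("Plan the architecture for my new app", needs_top_reasoning=True),
        Task("Review my tax documents", needs_top_reasoning=True,
             contains_private_data=True),
    ]
    for task in tasks:
        print(f"{choose_backend(task)}: {task.prompt}")
```

Privacy is checked first on purpose, so a task that is both hard and private still stays on your machine. You could flip that order if top-end reasoning matters more to you than keeping the data local.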

3. DGX Spark, AMD AI Max, Mac Studio

This is where dedicated AI hardware starts making larger, barely compressed models realistic. Most people do not need one of these yet, although yes, they are awesome.

These systems give you access to larger models, more context, faster responses, better reasoning, and the ability to work with bigger files without the model losing the plot halfway through.

But unless you have already been using agents for a while and know you are being limited by hardware, I would not spend the money here yet.

My advice stays the same

Use what you have. Use free cloud models. Pay for the cheaper subscriptions if you want.

You do not need $200-a-month subscriptions. You do not need thousands of dollars in API credits. And you definitely do not need to get scared into buying a Mac Mini because some AI influencer told you that is the only way.

Just start. Tinker. Experiment. Build things. As you get comfortable, upgrade.

Learn to drive first. Then buy a sports car. Then worry about Formula 1.

Not the other way around.

Need help getting started?

If you want help getting set up, send me a message at www.felipepostigo.com.

FAQ

Do I need a Mac Mini to run local models?

No. Many smaller models run on regular consumer hardware, and a decent gaming PC or newer Mac is enough for a lot of useful local AI work.

What is the best local AI setup for most people?

For most people, the sweet spot is a computer with a dedicated GPU, or a newer Mac with enough unified memory. That gives you a lot more flexibility without jumping into expensive specialty hardware.

Should I go fully local and stop using cloud AI?

Not necessarily. A hybrid setup is often the best move. Use cloud models when you want top-end reasoning, and local models when you want privacy, unlimited usage, or lower cost.

When should I upgrade to something like a DGX Spark or Mac Studio?

Upgrade when you already know your current hardware is limiting you, not because someone online made you feel behind. Start with real use cases first, then spend money based on actual constraints.