AI Guidance

Running AI, whether a local LLM, image generation, or cloud compute, starts with the right hardware choices. I put this stack through its paces every day (LLMs on CPU, ComfyUI on an RTX card, GPUs rented by the hour). Here are honest guides so you neither overpay nor pick the wrong gear.

The guides

Building an AI PC

VRAM above all else. Three budget tiers, from an 8B LLM to a quantized 70B model, with the pitfalls to avoid.

Renting a GPU in the cloud

When the cloud beats buying, and how to rent an H100 by the hour without blowing your budget. Platform comparison.

AI tools & subscriptions

Voice, video, transcription: the tools worth their subscription, and the ones you can replace with a local setup.

Building your own private AI assistant

A ChatGPT of your own, offline and free: model, interface, your documents (RAG) and voice, step by step.

AI image & video

Generate locally (ComfyUI, Flux, Wan) or via a paid service: cost, quality, VRAM and copyright.

Why trust me

I only recommend what I've actually run and measured. A few concrete benchmarks from my own use:

A 30B LLM (Qwen3-Coder) running at ~16-25 tokens/s on pure CPU, with no GPU, on a dual-Xeon server.
AI video generation on an RTX 5080 (a 5-second 720p clip in ~10 min via ComfyUI).
A complete self-hosted stack: llama.cpp, ComfyUI, Whisper, speech synthesis — plus cloud GPUs rented by the hour for the heavy jobs.

My conviction, the common thread of these guides: the right approach is rarely "all local" or "all cloud", but hybrid — a well-sized machine for 80-90% of your needs, complemented by APIs (often free) for the rest. The goal is the best result-to-budget ratio, not chasing specs.

Independence. Some links on these pages are affiliate links: if you buy through them, I earn a commission at no extra cost to you. This never changes my recommendations. How I fund this →

Frequently asked questions

What budget do you need to run AI locally?

Expect around €900 to get started (8B LLM, image generation), ~€1,700 for comfortable use (14-32B models), and €3,200 and up for a workstation (70B, video). For very large models, a mini-PC with 128 GB of unified memory is an alternative at ~€1,900. See the build guide.

How much VRAM do you need, and which graphics card?

VRAM matters more than raw compute power: ~6 GB for an 8B model, ~10 GB for a 14B, ~20 GB for a 32B, ~40 GB for a 70B (all at Q4 quantization). 16 GB is enough for comfortable use; for the larger models, a used RTX 3090 (24 GB) still offers the best VRAM-to-price ratio.
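
The figures above are consistent with a simple rule of thumb, sketched here in Python. It rests on two assumptions that do not come from this page: Q4 weights average about 4.5 bits each, and roughly 1.5 GB is set aside for context and runtime buffers. The function names are illustrative only.

```python
def estimate_vram_gb(params_billion, bits_per_weight=4.5, overhead_gb=1.5):
    """Rough VRAM need in GB for a quantized model.

    bits_per_weight and overhead_gb are assumed values, not measurements.
    """
    # Billions of parameters times bytes per parameter gives gigabytes.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


def fits(params_billion, vram_gb):
    """True if the estimated footprint fits in the given amount of VRAM."""
    return estimate_vram_gb(params_billion) <= vram_gb


for size in (8, 14, 32, 70):
    print(f"{size}B -> ~{estimate_vram_gb(size):.1f} GB")
```

Under these assumptions a 32B model fits on a 24 GB card and a 70B model does not. Long contexts need more headroom than the fixed overhead used here, so the result is better read as a floor than as a guarantee.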

Can you do AI without a graphics card?

Yes. With a mixture-of-experts (MoE) model such as Qwen3-Coder 30B and plenty of RAM, a CPU alone reaches ~16-25 tokens/s. That is slower than a GPU but perfectly usable, and unbeatable on memory capacity for very large models.
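
One way to see why a MoE model stays usable on a CPU: token generation is usually limited by how many bytes of weights must be read from memory for each token, and a MoE model only reads its active experts. The sketch below computes that theoretical ceiling. The 60 GB/s bandwidth, the figure of about 3 billion active parameters, and the 4.5 bits per weight are assumptions chosen for illustration, not measurements from this page.

```python
def max_tokens_per_s(bandwidth_gb_s, active_params_billion, bits_per_weight=4.5):
    """Upper bound on decode speed when memory bandwidth is the bottleneck.

    All three inputs are assumptions to be replaced with your own figures.
    """
    # Gigabytes of weights that must be read to produce one token.
    gb_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_per_token


# Hypothetical machine with 60 GB/s of usable memory bandwidth.
moe = max_tokens_per_s(60, active_params_billion=3)     # MoE, ~3B active
dense = max_tokens_per_s(60, active_params_billion=30)  # dense 30B
print(f"MoE ceiling:   {moe:.1f} tokens/s")
print(f"Dense ceiling: {dense:.1f} tokens/s")
```

With the same bandwidth, the dense model's ceiling is ten times lower because ten times more weights are read per token. Real throughput lands below either ceiling.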

Is it better to buy hardware or rent a GPU in the cloud?

Below ~10 hours of heavy compute per month, or for a one-off need, the cloud (or even a free API) is cheaper. Beyond that, buying pays for itself. See the cloud guide.
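
The buy-versus-rent comparison comes down to a break-even calculation, sketched below. Every number in the example is hypothetical (hardware price, hourly cloud rate, electricity cost); substitute your own quotes before drawing conclusions.

```python
def breakeven_months(hardware_eur, cloud_eur_per_hour, hours_per_month,
                     power_eur_per_hour=0.0):
    """Months of use after which buying becomes cheaper than renting.

    Returns infinity when running your own hardware saves nothing per hour.
    """
    # What each month of local use saves compared with renting.
    saving_per_month = hours_per_month * (cloud_eur_per_hour - power_eur_per_hour)
    if saving_per_month <= 0:
        return float("inf")
    return hardware_eur / saving_per_month


# Hypothetical: a EUR 1,700 machine versus a GPU rented at EUR 2.00 per hour.
for hours in (10, 50, 100):
    months = breakeven_months(1700, 2.0, hours)
    print(f"{hours} h/month -> break-even after {months:.1f} months")
```

At low monthly usage the break-even point sits years away, which is why occasional workloads favor renting. The sketch ignores resale value and hardware ageing.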

Can you have a "private ChatGPT" at home?

Yes, for free and offline: with Ollama and Open WebUI, you can set up a ChatGPT-style interface in a few minutes that runs on your own machine, without sending your data to a third party. See the step-by-step tutorial.
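
Once Ollama is running, the same local model can also be queried from a script through its HTTP API. Below is a minimal sketch using only the Python standard library. It assumes Ollama listens on its default address (localhost, port 11434) and that a model has already been pulled; the model tag `llama3.1:8b` is just an example, and the helper names are illustrative.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1:8b", url=OLLAMA_URL):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3.1:8b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, `print(ask("Why does VRAM matter for local LLMs?"))` prints the model's reply. The request goes to localhost only, so nothing leaves the machine.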

Is local AI on par with ChatGPT or Claude?

For most use cases, open-source models running locally are excellent in 2026. Reaching the absolute top tier (Opus, GPT-5) locally requires €10,000 to €20,000 of hardware. The right approach is hybrid: local for 80-90% of your needs, complemented by an API for the hardest tasks, which gets you ~95% of frontier quality at a controlled cost.

A concrete AI project?

Picking a configuration, setting up a local LLM, a generation pipeline, weighing cloud vs buying: I can guide you from start to finish.
