Local AI Models vs Cloud Credits: Gemma 4 Under 1GB

Today's art direction

Pricing & Packaging Page / Tiered Plan Comparison

A SaaS pricing page: a dark plan band, three tier columns bridging into ivory, one plan recommended.

Today's page is built like the pricing page every SaaS site eventually ships: a dark plan band up top, three tier columns that overlap the seam into the lighter section below, and one tier marked as the popular pick. The news items are the plans, each with a price chip standing in for its hook. The reusable move is the layout itself, recommended tier set apart by weight and surface rather than a colored side stripe, and a comparison matrix that tells the truth about the trade between columns.

plan bandtier columnprice chiprecommended ribbon comparison matrixfeature rowamber on navyFraunces display

Tooling

Small enough to run where the page is

Quantization-aware training simulates the rounding errors of compression while the model is still learning, so the smaller version loses far less quality than squeezing a finished model would. The practical result Google reports is a Gemma 4 E2B that fits in roughly a gigabyte, with checkpoints packaged for llama.cpp, Ollama, and LM Studio.

For people building interfaces, the line that matters is the one a model crosses when it can live next to the work instead of behind an API. A design or build assistant that runs on the machine changes the latency, the privacy story, and the offline behavior of whatever you ship around it. Coverage of the release puts the edge models well inside laptop and phone budgets.

Technique

Read the plan before you pick it

Three announcements this week describe the same shift from different desks: where the cost of an AI build lands, and on whom. Laid side by side they read like a comparison matrix, which is the honest way to choose a plan.

Where the price of building moves, by plan
	Local	Metered	Accountable
What changed	Model runs on-device in ~1 GB	Agent usage billed separately	Terms name AI-taken actions
Who pays	Your hardware	A monthly credit	You, by signing
In effect	Jun 5	Jun 15	Jun 4
The trade	Smaller model, your RAM	Predictable, but capped	You own the agent's moves

The middle column is the one builders are arguing about, because a flat subscription quietly subsidized a lot of background agents. Once that usage gets its own meter, the cost of letting an agent run unattended becomes a number you can see.

Workflow

Scope the agent like a paid seat

Vercel's new language about actions taken on your account by AI is a quiet signal worth acting on before the bill or the audit does. Treat an agent that touches your project as a real account with real reach, and give it the smallest plan that still works.

Give each agent its own scoped credential, not your personal token, so its actions are attributable and revocable.
Prefer short-lived, single-purpose access over standing keys, the way a signed upload URL expires instead of granting the whole store.
Keep a reviewable trail of what the agent did, and decide which moves it may take unattended versus which wait for a human.

“The cheapest plan is the one where you can still see what you are paying for.”
Field desk, Art Direction Daily

Prompt Lab

Recreate this page

Paste this into your AI design or build tool to reproduce today's visual system.

Design a single self-contained HTML page styled as a SaaS pricing and
packaging page, the kind a product site ships to compare plans.

PAGE ARCHETYPE: a tiered pricing page. Above the fold, a dark slate plan
band holds the top nav, a small masthead line, a large headline, a short
deck, and an abstract plan-stack image on the right. Three pricing-tier
columns sit centered and overlap the seam where the dark band meets the
lighter page below.

COMPONENTS: three tier cards, the middle one "recommended" and set apart
by a darker surface, heavier type, and a small ribbon, never a colored
left stripe. Each card has a small-caps plan label, a tier name, a
monospace price chip, one short sentence, and a text link styled as the
plan's call to action. Below, build the article sections with a
comparison matrix table (real text cells, not check marks), numbered
workflow steps, one pullquote, and a numbered sources list.

PALETTE: ivory paper #FBF8F1, deep slate band #1B2435, amber accent
#C07E1E for prices, ribbon, and marks, slate #5A6377 for secondary text,
hairline #E4DECF. No purple-blue gradient, no neon, no glow.

TYPE: Fraunces for the headline, tier names, and section headings;
Instrument Sans for body prose; Spline Sans Mono for price chips, labels,
and metadata.

GUARDRAILS: body text at least 18px with 1.6 line height and WCAG AA
contrast. Never set long prose on the dark band. Cap card radius around
14px and chips around 7px; vary radius with purpose. Give cards and links
honest hover and focus states with ~160ms easing, and disable transforms
under prefers-reduced-motion. No fake toggles or dead search fields, and
no readable text inside the generated image.

Field Note

The bottom line

Note On the same weekend a model got small enough to run for free on your laptop, the agents in the cloud started charging by the credit. An essay arguing the web's problems are about incentives, not AI is climbing Hacker News at the same time. Pick the plan that keeps the cost, and the control, where you can see them.

Sources

Linked this issue

Gemma 4 QAT models: optimizing compression for mobile and laptop efficiencyblog.google
Google DeepMind releases Gemma 4 QAT checkpoints that cut on-device memorymarktechpost.com
Anthropic ends subscription subsidy for agents June 15; credit pool replaces flat-rate accesstechtimes.com
Updates to legal termsvercel.com
AI didn't break the web. The dotcons didhamishcampbell.com
Hacker News discussion of the Gemma 4 QAT releasenews.ycombinator.com

No. 030 · Saturday, 6 June 2026 · Pricing & Packaging Page / Tiered Plan Comparison · Type: Fraunces, Instrument Sans, Spline Sans Mono · Palette: slate band, ivory paper, amber, slate, hairline · Follow @artdirdaily on X

A field experiment from the team behind Beaver Builder AI.

The free tier is now the model on your laptop

Gemma 4 fits in a gigabyte

Agent usage moves to its own meter

Vercel papers who the agent is