Local-First AI

Deep dives and field notes on local-first AI, agentic architecture, and what is actually working in 2026, with primary sources and reproducible benchmarks.

Showing 11 of 38 posts in Local models

Deep dives

Long-form research articles with primary sources, benchmarks, and reference tables.

Cost-Quality Pareto for Coding Agents (May 2026)

Paying $5 per task no longer gets you the frontier. Qwen3.6-27B running locally hits 77.2% on SWE-bench Verified at ~$0.04/task, within 10 percentage points of Opus 4.7 at roughly 1/130th of the cost.

Deep dive · coding-agents · benchmarks · cost · swe-bench · claude-code · qwen · deepseek

Mixing and Matching Open-Weight Models: A Recipe Book

An 84% cost reduction on a real SaaS workload, a 97% reduction on agentic dev loops, and the three-tier mix that actually ships in May 2026.

Deep dive · local-models · cost · open-weights · agents · qwen · kimi · deepseek

Field notes

Three Ways to Run Open Weights for Pennies

Local hardware vs rented GPUs vs serverless OSS APIs. Real prices, real benchmarks, and the workload-shape question that decides which path is right.

local-models · open-weights · cloud · benchmarks · cost

MCP, Honestly: What It Is and What It Is Not

What the Model Context Protocol actually is, what it gets right, where it leaks, and why the local-first version is the cleaner story.

mcp · agents · protocols · local-first · ollama

How Agents Burn Through Runway, and How to Stop Them

The engineering math behind preventing an agentic loop from burning through your monthly runway in one night.

agents · economics · local-inference · ops

I Run a Whole AI Stack on a Laptop for Three Cents an Hour

28.4 tokens per second on a laptop running GLM-4 9B, three cents of electricity per session, and the moment local inference stopped being a hobby.

benchmarks · amd · rocm · local-inference

Building a Personal AI Agent in a Weekend

The build, the OpenClaw config, and the first agent worth running. End to end on a Framework 16 with 96GB unified memory.

amd · rocm · openclaw · agents · local-first

Find Your Agent-Ready Tasks in 90 Minutes

A framework for finding which 20% of your tasks are agent-ready before you write a line of code.

smb · agents · audit · automation · framework

The Zero-Inbox Agent

Triage that does not just summarize. It prepares the drafts and fetches the data, and you approve. The 60-line config that actually works.

agents · n8n · ollama · inbox · automation

When to Run Locally and When to Pay Anthropic

Real numbers, real workloads, real break-even points. When local is the obvious answer, when cloud is, and the hybrid that wins for most teams.

economics · local-inference · cloud · cost-benefit

Why I Bet on a Framework Laptop in 2026

The repairable, AMD-powered laptop that runs my entire AI stack at three cents per session. The hardware case for Framework 16 in 2026.

hardware · framework · amd · strix-point · strix-halo