Hardware Insights
-
Feb. 26, 2026 / Hardware Insights
Qwen3.5 27B and Qwen3.5 35B: What Hardware Do You Actually Need? (GPU Benchmarks Inside)
Qwen3.5 27B fits comfortably on a 24 GB GPU up to 131k context in 4-bit, but becomes memory-heavy at 262k. Qwen3.5 35B MoE in 4-bit is the more practical long-context model for 24 GB cards, and it is significantly faster in token generation despite having more total parameters. VRAM is still the main constraint,...
-
Feb. 4, 2026 / Hardware Insights
Qwen3 Coder Next 80B A3B: what it takes to run it locally
Direct answer first: Qwen3 Coder Next 80B A3B is one of the most hardware-friendly 80B-class coding models released so far. Thanks to its MoE design with roughly 3B active parameters, a single high-VRAM GPU can run it at full 256k context, and even dual consumer GPUs can handle the 3-bit version comfortably. VRAM, not raw...
-
Jan. 26, 2026 / Hardware Insights
Best Computers for Running ClawdBot (OpenClaw) AI Assistant Locally
If you are running OpenClaw with a cloud model like Claude Opus, you do not need powerful hardware. Any modern low-power system with 8 GB of RAM and a 6th-gen or newer Intel CPU is enough. If you want to run ClawdBot fully local with reliable tool usage and large context windows, hardware requirements scale...
-
Jan. 22, 2026 / Hardware Insights
We Tested GLM-4.7 Flash 30B MoE — Here’s the GPU You Actually Need
Z.ai released GLM-4.7 Flash only a few days ago, but meaningful local testing had to wait. The initial llama.cpp support was incomplete, and without proper fixes it was not possible to measure real performance. Those fixes have now landed, and with the latest llama.cpp build we were finally able to test the model properly...
-
Jan. 20, 2026 / Hardware Insights
How I Test GPUs for Local LLMs Before I Buy One
Learn how I test GPUs for local LLM inference before buying, using real workflows, llama.cpp, and rented RTX 3090 instances to measure VRAM, context length, and performance.
-
Jan. 19, 2026 / Hardware Insights
Ryzen AI Halo Is Not New Hardware – It’s AMD’s Strix Halo AI Developer Platform
AMD Ryzen AI Halo is being marketed as a new local AI development solution, but it is important to be precise about what it actually is. Ryzen AI Halo does not introduce new silicon, new performance characteristics, or a faster variant of Strix Halo. It is a reference mini PC platform built around the already...
-
Dec. 11, 2025 / Hardware Insights
We Tested Devstral 2 (24B & 123B) — Here’s the Hardware You Actually Need
Mistral AI has just released its new coding model, Devstral 2. We’ve been using its predecessor, Devstral Small, locally for code completion and have been very impressed with its performance. Early reports on Devstral 2 put it on par with other top models like Kimi K2 and DeepSeek v3.2, so we were eager to get...
-
Dec. 9, 2025 / Hardware Insights
Best Unified Memory Computers for Local LLMs (2025): Bandwidth, Memory Size, Speed & Price Comparison
Unified memory has become one of the most important features for anyone running local LLMs in 2025. Instead of splitting memory between CPU RAM and GPU VRAM, unified architectures pool it into one high-bandwidth space that both the CPU and GPU can access. This matters because LLM inference is memory-bound long before it becomes compute-bound....
-
Nov. 17, 2025 / Hardware Insights
Best Black Friday 2025 GPU Deals for Local LLM Users
We’re tracking GPUs that make sense for LLM workloads and monitoring their prices from now through Black Friday 2025. We group them by VRAM because memory capacity determines which models and context lengths they can run, while bandwidth plays a major role in real-world throughput.