RTX 4090

  • Apr. 3, 2026 / Featured

    What Hardware for Gemma 4 26B and 31B LLM Local Use

    The new Gemma 4 models from Google DeepMind have landed, and for local LLM users this is one of the more practical releases in a while. The lineup gives us two interesting mid-size targets: a 26B MoE model (A4B) and a 31B dense model. Both support up to 256K context, tool calling, and personal agent-style...

  • Nov. 3, 2025 / Hardware Insights

    Inside PewDiePie’s $41,000 AI PC: 424GB of VRAM for Local LLMs

    When one of YouTube’s biggest creators decides to build a personal AI supercomputer, the local LLM scene takes notice. PewDiePie’s journey into AI hardware has produced a multi-GPU, 424GB VRAM workstation that many enthusiasts dream of. While his budget is far beyond the average builder, his component choices and setup offer a valuable blueprint for...

    PewDiePie’s custom open-frame AI PC build showing 10 GPUs installed on the left and NVIDIA System Management Interface on the right listing eight RTX 4090 48GB cards and two RTX 4000 Ada 20GB cards, totaling 424GB of VRAM.
  • Oct. 17, 2025 / LLM Benchmarks

    RTX 4090 LLM Benchmarks: Performance Across 4K – 131K Context Sizes

    I tested the RTX 4090 with five quantized models to measure real-world inference performance for local LLM workloads. This is the second article in my GPU benchmark series, following my recent RTX 5090 tests. I ran these benchmarks to provide concrete performance data across different model sizes and context lengths using llama.cpp...
