Hardware

What your rig can actually do.

Not every gaming GPU can generate AI video locally, and results vary widely on those that can. Here's an honest breakdown of typical performance across common setups. Times are approximate; run your own benchmarks before committing to a workflow.

Seeded from the Hybrig reference fleet (James's actual rigs). Crowdsourced benchmarks will extend this.
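One simple way to run your own benchmark is to repeat the same job a few times and take the median wall-clock time. The sketch below is a minimal illustration and not part of Hybrig: `benchmark` and the `fake_render` stand-in are hypothetical names, and you would replace `fake_render` with whatever call launches a real render on your machine.

```python
import statistics
import time
from typing import Callable


def benchmark(job: Callable[[], object], repeats: int = 3, warmup: int = 1) -> dict:
    """Time `job` several times and summarize wall-clock seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    # Warm-up runs absorb one-time costs such as model loading.
    for _ in range(warmup):
        job()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        job()
        samples.append(time.perf_counter() - start)
    return {
        "runs": repeats,
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


def fake_render() -> None:
    # Hypothetical stand-in for a real render call; it only sleeps briefly.
    time.sleep(0.01)


if __name__ == "__main__":
    print(benchmark(fake_render, repeats=5))
```

The median is less sensitive than the mean to a single slow run, which matters when a background process briefly competes for the GPU.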

flagship

Bigglesworth-Studios

The flagship rig — runs everything locally, trains your LoRAs overnight.

≈ $3,200 new

GPU: NVIDIA RTX 4090 (24GB)
CPU: AMD Ryzen 7 7800X3D (8C / 4.2 GHz)
RAM: 32 GB DDR5 @ 4800 MT/s
OS: Windows 11

Best for

+ Wan 2.1 / 2.2 I2V video generation at 720p–1080p
+ Personal LoRA training (4–8 hrs, ~1GB file)
+ Flux.dev + Flux Pro image gen at full quality
+ Parallel: video + image gen concurrently with spare VRAM

Not good for

− Nothing in Hybrig's current catalog exceeds its capacity; it's the ceiling.

Typical render times

Wan 2.1 I2V, 8s, 720p: ~3–5 min
Wan 2.1 I2V, 8s, 1080p: ~6–9 min (edge of VRAM; use fp8 quant if OOM)
Flux.dev image, 1024×1024: ~8–12 sec
Personal LoRA training (Wan): ~4–8 hrs (overnight run)
Whisper-small, 60s audio: ~2–3 sec
HyperFrames render, 30s clip: ~1–2 min

Role in a Hybrig fleet

Primary renderer. Handles heavy Wan + LoRA. If you only have one machine, this is the one.

Worker role: local

Local models: wan-2.1-local, flux-pro-local (planned), whisper-local (planned)

workstation

Bigglesworth

The workhorse — handles image gen and lighter video, offloads from the flagship.

≈ $1,400 new

GPU: NVIDIA RTX 3080 (12GB)
CPU: AMD Ryzen 7 3800X (8C / 3.9 GHz)
RAM: 32 GB DDR4 @ 2666 MT/s
OS: Windows 11 Pro

Best for

+ Flux / Stable Diffusion image gen
+ Wardrobe-lock + inpainting batch runs
+ Wan 2.1 video with fp8 quantization (720p, ~5–8 min per 8s clip versus ~3–5 min on the flagship)
+ HyperFrames headless Chrome rendering

Not good for

− Wan 2.1 at full fp16 precision: fp8 is required even at 720p, and 1080p runs out of memory
− LoRA training runs: technically possible at low rank but impractical (20+ hrs)

Typical render times

Wan 2.1 I2V, 8s, 720p (fp8): ~5–8 min (fp8 quant required)
Wan 2.1 I2V, 8s, 1080p: not supported (OOM; use flagship or cloud)
Flux.dev image, 1024×1024: ~15–25 sec
Personal LoRA training: ~20+ hrs (impractical at this tier)
Whisper-small, 60s audio: ~3–5 sec
HyperFrames render, 30s clip: ~1–2 min

Role in a Hybrig fleet

Image gen + wardrobe batch. Runs in parallel with the flagship for up to roughly 2x throughput on mixed workloads.
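How much a second machine helps depends on the job mix. The sketch below works through one case using midpoints of the render times listed on this page; the batch size (10 clips, 120 images) and the policy of sending all video to the flagship and all images to the workstation are assumptions made for illustration, not how Hybrig necessarily schedules work.

```python
# Midpoints of the render-time ranges listed on this page, in seconds.
WAN_720P_FLAGSHIP_S = 4 * 60   # ~3–5 min per 8s clip on the flagship
FLUX_FLAGSHIP_S = 10           # ~8–12 sec per image on the flagship
FLUX_WORKSTATION_S = 20        # ~15–25 sec per image on the workstation


def single_machine_seconds(clips: int, images: int) -> int:
    """Flagship renders everything back to back."""
    return clips * WAN_720P_FLAGSHIP_S + images * FLUX_FLAGSHIP_S


def split_fleet_seconds(clips: int, images: int) -> int:
    """Flagship takes video, workstation takes images, both run at once."""
    video = clips * WAN_720P_FLAGSHIP_S
    stills = images * FLUX_WORKSTATION_S
    return max(video, stills)  # the slower queue sets the wall-clock time


if __name__ == "__main__":
    clips, images = 10, 120  # hypothetical batch size
    solo = single_machine_seconds(clips, images)
    split = split_fleet_seconds(clips, images)
    print(f"flagship only: {solo / 60:.0f} min")
    print(f"split fleet:   {split / 60:.0f} min")
    print(f"speedup:       {solo / split:.2f}x")
```

With this mix the split fleet finishes in about 40 minutes instead of 60, a 1.5x gain. The actual figure depends on the job mix and on how evenly the two queues are balanced.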

Worker role: local

Local models: flux-pro-local (planned), wan-2.1-local (fp8 mode)

utility

The Utility Box (rename me)

The utility box: transcription, face embedding, and CPU-bound post-production work.

≈ $600 new

GPU: NVIDIA RTX 2070 (8GB)
CPU: older desktop CPU
RAM: 16 GB typical
OS: Windows / Linux

Best for

+ Whisper transcription (all sizes up to medium)
+ Face embedding / ArcFace lookups
+ HyperFrames headless-Chrome rendering
+ Image upscaling (Clarity / Real-ESRGAN)

Not good for

− Any video generation: VRAM too low
− Flux.dev at full quality: use Flux.schnell or a quantized variant

Typical render times

Wan 2.1 video generation: not supported (VRAM below threshold)
Flux.schnell image, 1024×1024: ~20–30 sec (schnell variant only)
Face embedding (ArcFace): <1 sec
Whisper-medium, 60s audio: ~5–8 sec
HyperFrames render, 30s clip: ~2–3 min
Image upscale 2x (Real-ESRGAN): ~10–15 sec

Role in a Hybrig fleet

Support box. Runs the jobs the flagship shouldn't waste cycles on: transcription, embedding, post.

Worker role: local

Local models: flux-schnell-local (planned), whisper-local (planned), face-embed-local (planned)
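Taken together, the three local roles above amount to a routing table from job type to machine. The sketch below shows that idea in its simplest form. The job-type labels, the `assign` function, and the choice to send unknown job types to the cloud are all assumptions for illustration, not Hybrig's actual scheduler.

```python
from collections import defaultdict

# Job type -> fleet role, following the "Role in a Hybrig fleet" notes above.
# The job-type strings are hypothetical labels chosen for this sketch.
ROUTES = {
    "wan_video": "flagship",
    "lora_training": "flagship",
    "image_gen": "workstation",
    "wardrobe_batch": "workstation",
    "transcription": "utility",
    "face_embedding": "utility",
    "upscale": "utility",
}


def assign(jobs: list[str], default: str = "cloud") -> dict[str, list[str]]:
    """Group jobs by the role that should run them.

    Job types missing from ROUTES fall back to `default` (an assumption).
    """
    queues: dict[str, list[str]] = defaultdict(list)
    for job in jobs:
        queues[ROUTES.get(job, default)].append(job)
    return dict(queues)


if __name__ == "__main__":
    batch = ["wan_video", "transcription", "image_gen", "wan_video"]
    for role, queue in assign(batch).items():
        print(role, queue)
```

Keeping the mapping in a plain dictionary makes it easy to adjust when a machine is added to or removed from the fleet.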

mobile

Mobile Station (MacBook Pro M4 Pro)

The mobile station — author, script, preview, send heavy renders back to the fleet.

≈ $2,800 new

GPU: Apple M4 Pro 20-core GPU (unified memory)
CPU: Apple M4 Pro 14-core CPU
RAM: 24–48 GB unified memory
OS: macOS Sequoia

Best for

+ Script generation + variant management (Claude API)
+ Flux + Stable Diffusion via MLX (Apple's ML framework)
+ Whisper transcription on Apple Neural Engine
+ UI authoring: the primary client-side creative console when traveling

Not good for

− Wan 2.1 / 2.2 video generation: no CUDA, and Metal video pipelines are immature
− LoRA training for video models: requires the CUDA ecosystem

Typical render times

Wan 2.1 I2V via Metal: 2–4x slower than 4090 (possible via ComfyUI-Metal but not production-ready)
Flux.dev image via MLX, 1024×1024: ~10–15 sec (MLX is well-optimized on Apple Silicon)
Whisper-large, 60s audio: ~2–4 sec (Neural Engine accelerated)
Personal LoRA training: not supported (CUDA-only workflow)
HyperFrames render, 30s clip: ~1–2 min
Hybrig UI + Claude script-gen: native speed

Role in a Hybrig fleet

Mobile authoring + preview. Opens Hybrig at hybrig.com, sends heavy work to the fleet back home. Nav badge shows CLOUD ONLY here — by design.

Worker role: cloud

Local models: flux-mlx-local (planned), whisper-ane-local (planned)

cloud-only

No GPU / cloud-only

No GPU? Hybrig still works — cloud-only path, pay per render.

no GPU required

GPU: None (or integrated graphics)
CPU: any modern CPU
RAM: 8+ GB
OS: Windows / macOS / Linux

Best for

+ Anyone without a dedicated GPU
+ Travelers / phone users opening Hybrig on the move
+ Users who want zero local setup

Not good for

− Saving money. Every draft hits the cloud, at ~$1.94 per take for Seedance Fast and $3–5 per take for finals.
− Offline work

Typical render times

Seedance Fast, 8s, 720p: ~3–5 min ($1.94 per take)
Seedance Pro, 8s, 1080p: ~4–8 min ($3–5 per take)
Flux via fal, 1024×1024: ~10–20 sec (~$0.04 per image)
ElevenLabs voice cloning: cloud-native ($22/mo subscription)
Personal LoRA training: not supported (requires local GPU)
HyperFrames render, 30s clip: cloud TBD (requires server-side headless Chrome)
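Because every take is billed, it helps to estimate a project's cost before starting. The sketch below multiplies out the per-take prices listed above. The function name and the example counts are hypothetical, and the ElevenLabs subscription is left out because it is a flat monthly fee rather than a per-render charge.

```python
SEEDANCE_FAST_PER_TAKE = 1.94          # per draft take, from the list above
SEEDANCE_PRO_PER_TAKE = (3.00, 5.00)   # low/high per final take, from the list above
FLUX_FAL_PER_IMAGE = 0.04              # approximate, from the list above


def estimate_cost(draft_takes: int, final_takes: int, images: int = 0) -> tuple[float, float]:
    """Return a (low, high) dollar estimate for a cloud-only project."""
    fixed = draft_takes * SEEDANCE_FAST_PER_TAKE + images * FLUX_FAL_PER_IMAGE
    low = fixed + final_takes * SEEDANCE_PRO_PER_TAKE[0]
    high = fixed + final_takes * SEEDANCE_PRO_PER_TAKE[1]
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    # Hypothetical project: 10 drafts, 2 finals, 20 reference images.
    low, high = estimate_cost(draft_takes=10, final_takes=2, images=20)
    print(f"${low:.2f} to ${high:.2f}")
```

For that hypothetical project the estimate is $26.20 to $30.20, most of it from draft takes. Drafts are the part a local GPU would absorb.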

Role in a Hybrig fleet

The zero-setup path. Everything runs on fal / ElevenLabs / Gemini. No local worker needed.

Worker role: cloud

Don't see your hardware? Benchmarks are approximate — your setup probably falls between two of these profiles. Once you're running, Hybrig's Dashboard shows actual wall-clock times on your own jobs for real calibration.
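As a rough first guess at which two profiles your card falls between, you can compare VRAM alone. The sketch below does that using the VRAM figures of the NVIDIA profiles on this page. Treating VRAM as the only factor is a simplifying assumption, the `bracket` function is hypothetical, and the Mac profile is omitted because unified memory is not directly comparable.

```python
# (VRAM in GB, profile name) for the NVIDIA profiles on this page, plus no-GPU.
PROFILES = [(0, "cloud-only"), (8, "utility"), (12, "workstation"), (24, "flagship")]


def bracket(vram_gb: float) -> tuple[str, str]:
    """Return the (lower, upper) profiles whose VRAM brackets `vram_gb`."""
    if vram_gb < 0:
        raise ValueError("vram_gb must be non-negative")
    # Largest profile at or below the given VRAM.
    lower = PROFILES[0]
    for profile in PROFILES:
        if profile[0] <= vram_gb:
            lower = profile
    # Smallest profile at or above the given VRAM; cap at the top profile.
    upper = next((p for p in PROFILES if p[0] >= vram_gb), PROFILES[-1])
    return lower[1], upper[1]


if __name__ == "__main__":
    for vram in (6, 10, 16, 24):
        print(vram, "GB ->", bracket(vram))
```

A 16 GB card, for example, lands between the workstation and flagship profiles, so expect fp8 video at 720p to work and 1080p to be uncertain until you measure it.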
