Hardware

What your rig can actually do.

Most gaming GPUs can contribute to AI video work, but what each one can do varies widely: the 8 GB utility profile below cannot generate video locally at all. Here's an honest breakdown of typical performance across common setups. Times are approximate; run your own benchmarks before committing to a workflow.

Seeded from the Hybrig reference fleet (James's actual rigs). Crowdsourced benchmarks will extend this.
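Since the times on this page are approximate, a small timing harness is a reasonable way to get your own numbers. The sketch below is a minimal example, not Hybrig's benchmarking code: `benchmark` and `fake_render` are hypothetical names, and `fake_render` is a stand-in you would replace with a call into your real render pipeline.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(job: Callable[[], object], runs: int = 3) -> Dict[str, float]:
    """Run `job` several times and summarize wall-clock seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        job()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "min_s": min(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


def fake_render() -> None:
    # Hypothetical stand-in workload; swap in your real render call.
    time.sleep(0.01)


if __name__ == "__main__":
    print(benchmark(fake_render, runs=3))
```

The median is usually the more useful figure, because the first run often includes one-time costs such as loading model weights.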

flagship

Bigglesworth-Studios

The flagship rig — runs everything locally, trains your LoRAs overnight.

≈ $3,200 new

GPU
NVIDIA RTX 4090 (24GB)
CPU
AMD Ryzen 7 7800X3D (8C / 4.2 GHz)
RAM
32 GB DDR5 @ 4800 MT/s
OS
Windows 11

Best for

  • Wan 2.1 / 2.2 I2V video generation at 720p–1080p
  • Personal LoRA training (4–8 hrs, ~1GB file)
  • Flux.dev + Flux Pro image gen at full quality
  • Parallel: video + image gen concurrently with spare VRAM

Not good for

  • Nothing in Hybrig's current catalog exceeds its capacity — it's the ceiling.

Typical render times

Wan 2.1 I2V, 8s, 720p: ~3–5 min
Wan 2.1 I2V, 8s, 1080p: ~6–9 min (edge of VRAM; use fp8 quant if OOM)
Flux.dev image, 1024×1024: ~8–12 sec
Personal LoRA training (Wan): ~4–8 hrs (overnight run)
Whisper-small, 60s audio: ~2–3 sec
HyperFrames render, 30s clip: ~1–2 min

Role in a Hybrig fleet

Primary renderer. Handles heavy Wan video generation and LoRA training. If you only have one machine, this is the one.

Worker role: local

Local models: wan-2.1-local, flux-pro-local (planned), whisper-local (planned)

workstation

Bigglesworth

The workhorse — handles image gen and lighter video, offloads from the flagship.

≈ $1,400 new

GPU
NVIDIA RTX 3080 (12GB)
CPU
AMD Ryzen 7 3800X (8C / 3.9 GHz)
RAM
32 GB DDR4 @ 2666 MT/s
OS
Windows 11 Pro

Best for

  • Flux / Stable Diffusion image gen
  • Wardrobe-lock + inpainting batch runs
  • Wan 2.1 video with fp8 quantization (720p, a bit slower)
  • HyperFrames headless Chrome rendering

Not good for

  • Wan 2.1 at full fp16 precision — OOM likely at 1080p, fp8 required
  • LoRA training runs: technically possible at low rank but impractical (20+ hrs)
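The fp8 requirement comes down to bytes per weight: fp16 stores each parameter in two bytes and fp8 in one, so quantizing halves the memory taken by the weights. The sketch below illustrates that arithmetic only. The 10-billion-parameter model size and the 2 GiB headroom are illustrative assumptions, not Wan 2.1's real figures, and the estimate ignores activations, the text encoder, and the VAE, which all need VRAM too.

```python
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_gib(params_billions: float, precision: str) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 2**30


def fits(params_billions: float, precision: str, vram_gib: float,
         headroom_gib: float = 2.0) -> bool:
    """True if the weights plus an assumed working headroom fit in VRAM."""
    return weight_gib(params_billions, precision) + headroom_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical 10B-parameter model on a 12 GB card.
    for precision in ("fp16", "fp8"):
        size = weight_gib(10, precision)
        print(f"{precision}: {size:.1f} GiB weights, fits={fits(10, precision, 12)}")
```

Under these assumptions the fp16 weights alone exceed a 12 GB card, while the fp8 weights leave a little room to work with.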

Typical render times

Wan 2.1 I2V, 8s, 720p (fp8): ~5–8 min (fp8 quant required)
Wan 2.1 I2V, 8s, 1080p: not supported (OOM; use flagship or cloud)
Flux.dev image, 1024×1024: ~15–25 sec
Personal LoRA training: ~20+ hrs (impractical at this tier)
Whisper-small, 60s audio: ~3–5 sec
HyperFrames render, 30s clip: ~1–2 min

Role in a Hybrig fleet

Image gen + wardrobe batch. Runs in parallel with the flagship for up to 2x throughput on mixed workloads.

Worker role: local

Local models: flux-pro-local (planned), wan-2.1-local (fp8 mode)

utility

The Utility Box (rename me)

The utility box: transcription, face embedding, and CPU-bound post-production work.

≈ $600 new

GPU
NVIDIA RTX 2070 (8GB)
CPU
older desktop CPU
RAM
16 GB typical
OS
Windows / Linux

Best for

  • Whisper transcription (all sizes up to medium)
  • Face embedding / ArcFace lookups
  • HyperFrames headless-Chrome rendering
  • Image upscaling (Clarity / Real-ESRGAN)

Not good for

  • Any video generation — VRAM too low
  • Flux.dev at full quality: use Flux.schnell or a quantized variant instead

Typical render times

Wan 2.1 video generation: not supported (VRAM below threshold)
Flux.schnell image, 1024×1024: ~20–30 sec (schnell variant only)
Face embedding (ArcFace): <1 sec
Whisper-medium, 60s audio: ~5–8 sec
HyperFrames render, 30s clip: ~2–3 min
Image upscale 2x (Real-ESRGAN): ~10–15 sec

Role in a Hybrig fleet

Support box. Runs the jobs the flagship shouldn't waste cycles on: transcription, embedding, and post-production.

Worker role: local

Local models: flux-schnell-local (planned), whisper-local (planned), face-embed-local (planned)
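The fleet roles for the three local rigs amount to a routing table: each job type has a preferred machine and, for some jobs, a fallback. The sketch below shows one way that could look. It is an illustration only, not Hybrig's scheduler: the job-type names, the `route` function, and the fallback order are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Set

# Preference order per job type, loosely following the "Best for" lists above.
# Both the job names and the fallback order are hypothetical.
PREFERENCES: Dict[str, List[str]] = {
    "wan_video": ["flagship", "workstation", "cloud"],  # workstation needs fp8 at 720p
    "lora_training": ["flagship"],                      # no cloud path listed on this page
    "image_gen": ["workstation", "flagship", "cloud"],
    "transcription": ["utility", "flagship"],
    "face_embedding": ["utility"],
    "upscale": ["utility"],
}


def route(job_type: str, available: Set[str]) -> Optional[str]:
    """Return the first preferred target that is online, or None if none is."""
    for target in PREFERENCES[job_type]:
        # The cloud is treated as always reachable in this sketch.
        if target == "cloud" or target in available:
            return target
    return None


if __name__ == "__main__":
    online = {"workstation", "utility"}
    for job in ("wan_video", "lora_training", "transcription"):
        print(job, "->", route(job, online))
```

Returning `None` rather than silently picking a weaker machine keeps unsupported combinations, such as LoRA training without the flagship, visible to the caller.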

mobile

Mobile Station (MacBook Pro M4 Pro)

The mobile station — author, script, preview, send heavy renders back to the fleet.

≈ $2,800 new

GPU
Apple M4 Pro 20-core GPU (unified memory)
CPU
Apple M4 Pro 14-core CPU
RAM
24–48 GB unified memory
OS
macOS Sequoia

Best for

  • Script generation + variant management (Claude API)
  • Flux + Stable Diffusion via MLX (Apple's ML framework)
  • Whisper transcription on Apple Neural Engine
  • UI authoring: the primary client-side creative console when traveling

Not good for

  • Wan 2.1 / 2.2 video generation — no CUDA, Metal video pipelines are immature
  • LoRA training for video models — requires CUDA ecosystem

Typical render times

Wan 2.1 I2V via Metal: 2–4x slower than 4090 (possible via ComfyUI-Metal but not production-ready)
Flux.dev image via MLX, 1024×1024: ~10–15 sec (MLX is well-optimized on Apple Silicon)
Whisper-large, 60s audio: ~2–4 sec (Neural Engine accelerated)
Personal LoRA training: not supported (CUDA-only workflow)
HyperFrames render, 30s clip: ~1–2 min
Hybrig UI + Claude script-gen: native speed

Role in a Hybrig fleet

Mobile authoring + preview. Opens Hybrig at hybrig.vercel.app, sends heavy work to the fleet back home. Nav badge shows CLOUD ONLY here — by design.

Worker role: cloud

Local models: flux-mlx-local (planned), whisper-ane-local (planned)

cloud-only

No GPU / cloud-only

No GPU? Hybrig still works — cloud-only path, pay per render.

no GPU required

GPU
None (or integrated graphics)
CPU
any modern CPU
RAM
8+ GB
OS
Windows / macOS / Linux

Best for

  • Anyone without a dedicated GPU
  • Travelers / phone users opening Hybrig on the move
  • Users who want zero local setup

Not good for

  • Saving money. Every draft hits the cloud: you pay ~$1.94 per take for Seedance Fast and $3–5 per take for finals.
  • Offline work

Typical render times

Seedance Fast, 8s, 720p: ~3–5 min ($1.94 per take)
Seedance Pro, 8s, 1080p: ~4–8 min ($3–5 per take)
Flux via fal, 1024×1024: ~10–20 sec (~$0.04 per image)
ElevenLabs voice cloning: cloud-native ($22/mo subscription)
Personal LoRA training: not supported (requires local GPU)
HyperFrames render, 30s clip: cloud TBD (requires server-side headless Chrome)
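The per-take prices above make the cloud-only trade-off easy to estimate. The sketch below uses only figures quoted on this page ($1.94 per Seedance Fast take, $3–5 per final, about $3,200 for the flagship rig). The function names are hypothetical, and the break-even figure is deliberately rough: it ignores electricity, your time, and the cost of finals.

```python
import math
from typing import Tuple

SEEDANCE_FAST_PER_TAKE = 1.94        # USD, from the table above
FINAL_PER_TAKE_RANGE = (3.0, 5.0)    # USD, low and high end for finals
FLAGSHIP_PRICE = 3200.0              # USD, approximate price of the flagship rig


def project_cost(drafts: int, finals: int) -> Tuple[float, float]:
    """Return the (low, high) cloud cost in USD for one project."""
    draft_cost = drafts * SEEDANCE_FAST_PER_TAKE
    low = draft_cost + finals * FINAL_PER_TAKE_RANGE[0]
    high = draft_cost + finals * FINAL_PER_TAKE_RANGE[1]
    return round(low, 2), round(high, 2)


def break_even_drafts(hardware_price: float = FLAGSHIP_PRICE) -> int:
    """Draft takes at which cloud spend matches the hardware price."""
    return math.ceil(hardware_price / SEEDANCE_FAST_PER_TAKE)


if __name__ == "__main__":
    print(project_cost(drafts=20, finals=2))  # (44.8, 48.8)
    print(break_even_drafts())                # 1650
```

A project with 20 drafts and 2 finals comes to roughly $45–49 in the cloud, and it takes about 1,650 draft takes before cloud spend matches the price of the flagship.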

Role in a Hybrig fleet

The zero-setup path. Everything runs on fal / ElevenLabs / Gemini. No local worker needed.

Worker role: cloud

Don't see your hardware? Benchmarks are approximate — your setup probably falls between two of these profiles. Once you're running, Hybrig's Dashboard shows actual wall-clock times on your own jobs for real calibration.
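As a rough first guess at where an unlisted NVIDIA card lands, you can compare its VRAM with the profiles on this page. The sketch below is a simplification: the thresholds are just the VRAM sizes of the three reference rigs, the `nearest_profile` name is hypothetical, and treating anything under 8 GB as cloud-only is an assumption. It does not cover Apple Silicon, where memory is unified.

```python
from typing import Optional

# (minimum VRAM in GB, profile name), highest tier first.
# Thresholds mirror the reference rigs: RTX 4090, RTX 3080 12GB, RTX 2070.
PROFILES = [
    (24, "flagship"),
    (12, "workstation"),
    (8, "utility"),
]


def nearest_profile(vram_gb: Optional[float]) -> str:
    """Map a card's VRAM to the highest profile whose VRAM it meets."""
    if vram_gb is None or vram_gb <= 0:
        return "cloud-only"
    for min_vram, name in PROFILES:
        if vram_gb >= min_vram:
            return name
    return "cloud-only"  # assumption: below 8 GB, treat as cloud-only


if __name__ == "__main__":
    for vram in (24, 16, 8, 6, None):
        print(vram, "->", nearest_profile(vram))
```

A 16 GB card, for example, maps to the workstation profile here, though in practice it would sit somewhere between workstation and flagship.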