# Hardware guide
Which GPU + how much VRAM + which execution provider you actually need. No marketing fluff.
## The four providers, ranked
| Tier | Provider | What it looks like |
|---|---|---|
| 1 | NVIDIA CUDA + TensorRT | RTX 2060 and up, ≥6 GB VRAM. Best latency, widest model support, TensorRT FP16 paths on Ampere+. |
| 2 | Intel OpenVINO | Arc A-series discrete, Iris Xe iGPU, N100/N305 mini-PCs. Great power/perf for home-lab, needs /dev/dri in container. |
| 3 | AMD ROCm | RDNA2+ (6600/6700/7000 series) on Linux hosts with /dev/kfd. Works, but fewer models have official ROCm kernels. |
| 4 | CPU (ONNX) | AVX2-capable modern x86. Works for FSRCNN/ESPCN real-time, anything ESRGAN+ will be minutes-per-frame. |
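The device requirements in the table translate directly into container flags. As a hedged sketch (the image name `example/ai-upscaler` is a placeholder, and the `:latest-rocm` tag is assumed by analogy with the tags used elsewhere in this guide):

```
# NVIDIA CUDA / TensorRT: requires the NVIDIA Container Toolkit on the host
docker run --gpus all example/ai-upscaler:latest-cuda

# Intel OpenVINO: pass the render node through
docker run --device /dev/dri:/dev/dri example/ai-upscaler:latest-openvino

# AMD ROCm: needs /dev/kfd and /dev/dri, plus video group membership
docker run --device /dev/kfd --device /dev/dri --group-add video \
  example/ai-upscaler:latest-rocm

# CPU: no device passthrough needed
docker run example/ai-upscaler:latest-cpu
```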
## VRAM budget by model
These numbers are measured on CUDA at FP16 with a single 1080p→4K pass. Double them for 4K→8K.
| Model family | Min VRAM (FP16) | Comfort VRAM |
|---|---|---|
| FSRCNN / ESPCN | 500 MB | 1 GB |
| Waifu2x | 1.5 GB | 3 GB |
| Real-ESRGAN / RealESRGAN-Anime | 3 GB | 6 GB |
| SwinIR-M / HAT-S | 4 GB | 8 GB |
| SwinIR-L / HAT-L / AnimeSR-v2 | 7 GB | 12 GB |
| EDVR-M / RealBasicVSR (multi-frame) | 8 GB | 16 GB |
| GFPGAN + Upscaler (face restore pipeline) | +1 GB on top | +2 GB on top |
Headroom matters: if your GPU also serves Jellyfin transcoding (NVENC/QSV), keep ~2 GB free for the transcode session or you will see OOM kills on simultaneous playback.
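The budgeting rules above (table minimums, double for 4K→8K, +1 GB for face restore, ~2 GB transcode reserve) can be sketched as a small helper. This is purely illustrative arithmetic, not part of the plugin; the model keys are hypothetical shorthand for the table rows:

```python
# Rough VRAM budgeting helper mirroring the table above.
# The 2x rule for 4K->8K, the +1 GB GFPGAN add-on, and the ~2 GB
# transcode reserve come from the surrounding text.

MIN_VRAM_GB = {          # FP16, single 1080p->4K pass
    "fsrcnn": 0.5,
    "waifu2x": 1.5,
    "realesrgan": 3.0,
    "swinir-l": 7.0,
    "edvr-m": 8.0,
}

def vram_needed_gb(model: str, *, target_8k: bool = False,
                   face_restore: bool = False, transcode: bool = False) -> float:
    """Estimate minimum VRAM (GB) for one upscale session."""
    need = MIN_VRAM_GB[model]
    if target_8k:        # 4K->8K: double the 1080p->4K figure
        need *= 2
    if face_restore:     # GFPGAN pipeline: +1 GB on top
        need += 1.0
    if transcode:        # keep ~2 GB free for a simultaneous NVENC/QSV session
        need += 2.0
    return need
```

For example, Real-ESRGAN alongside a live transcode needs roughly `vram_needed_gb("realesrgan", transcode=True)` = 5 GB minimum, which is why it is comfortable on an 8 GB card but tight on 6 GB.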
## Recommended setups
### Budget home lab (Intel N100 / Arc A380)
- Provider: OpenVINO (`:latest-openvino` image)
- Good for: FSRCNN real-time, Waifu2x 2×, Real-ESRGAN-anime-x4 batch
- Config: Max Concurrent Streams = 1, cache enabled
- Expect: ~15 fps at 720p → 1440p with anime-compact-x4
### Mid-range (RTX 3060 12GB / RTX 4060)
- Provider: CUDA (`:latest-cuda` image)
- Good for: everything in the catalogue runs comfortably except EDVR-M and HAT-L
- Config: Max Concurrent Streams = 1, Auto-Mode on
- Expect: 4×-realesrgan at ~20 fps on 720p source, 3× on 1080p source
### High-end (RTX 4090 / A100)
- Provider: CUDA + TensorRT conversion
- Good for: SwinIR-L, HAT-L, EDVR-M temporal, GFPGAN pipeline
- Config: Max Concurrent Streams = 2, pre-processing cache on NVMe
- Expect: Near real-time 4× on any model
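The TensorRT conversion is a one-off step per model. As a hedged sketch using `trtexec` (which ships with TensorRT), with placeholder paths since the plugin's actual model directory may differ:

```
# Build an FP16 TensorRT engine from an ONNX model once, then reuse it
trtexec --onnx=/models/swinir-l.onnx \
        --fp16 \
        --saveEngine=/models/swinir-l.engine
```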
### CPU-only NAS (e.g. Synology DS920+)
- Provider: CPU (`:latest-cpu` image)
- Good for: overnight batch upscales of an anime library with `fsrcnn-x2` / `espcn-x4`
- Config: Scan & Upscale Library scheduled at 3 AM, pre-processing cache on the biggest pool you have
- Expect: hours per feature-length file. This is batch, not live.
## Jellyfin-side hardware acceleration
Orthogonal to the AI upscaler — Jellyfin still handles decoding the source file and encoding the final MP4 itself.
- Decoders: NVDEC, QSV, VAAPI all work unchanged.
- Encoders: the plugin exposes 12 codec options with tuned defaults (see Config → Output Codec).
- Tonemap: HDR→SDR via `tonemap_cuda`/`tonemap_opencl` is handled by Jellyfin's stream path and runs before the upscale filter.
## Remote transcoding (optional)
Set Enable Remote Transcoding and the plugin installs a wrapper script that redirects Jellyfin's FFmpeg invocations to the AI service host over SSH. Use this when your Jellyfin server has no GPU but a separate desktop on the LAN does.
```
POST /Upscaler/wrapper/install
# generates on-disk next to the plugin DLL:
# Linux : upscale-wrapper.sh
# Win   : upscale-wrapper.bat + upscale-logic.ps1
```
Prereq: password-less SSH from the Jellyfin host to the GPU host, with `ffmpeg` on the `$PATH` at the remote end.
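A minimal setup sketch for that prerequisite, assuming Jellyfin runs as the `jellyfin` user and `gpu-host` is a placeholder hostname:

```
# Generate a passphrase-less key for the Jellyfin service user
sudo -u jellyfin ssh-keygen -t ed25519 -N ""

# Install the public key on the GPU host
sudo -u jellyfin ssh-copy-id user@gpu-host

# Verify non-interactive login and that ffmpeg is on the remote $PATH
sudo -u jellyfin ssh user@gpu-host 'command -v ffmpeg'
```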