- Set HIP_VISIBLE_DEVICES=0 to use only the discrete GPU (gfx1201).
llama.cpp was splitting layers across the iGPU (gfx1036), which
caused segfaults when loading the multimodal projector.
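A minimal sketch of the device pinning described above; HIP_VISIBLE_DEVICES is the standard ROCm/HIP env var, but the echo is only here for illustration:

```shell
# Restrict HIP-visible devices to index 0 (the discrete gfx1201),
# so llama.cpp cannot split layers onto the iGPU (gfx1036).
export HIP_VISIBLE_DEVICES=0
echo "HIP_VISIBLE_DEVICES=$HIP_VISIBLE_DEVICES"
```

This must be set in the environment before launching the llama.cpp binary; setting it after the process starts has no effect.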
- Restore --mmproj for both HF models (multimodal works correctly with
single GPU).
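A hedged sketch of the restored launch command; the binary and GGUF paths below are placeholders, not taken from this change:

```shell
# Example multimodal launch on the single discrete GPU.
# Paths are hypothetical; -m and --mmproj are real llama.cpp flags.
HIP_VISIBLE_DEVICES=0 ./llama-server \
  -m ./models/model.gguf \
  --mmproj ./models/mmproj.gguf
```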
- Keep qwen3.5:9b disabled (Ollama-extracted GGUF uses old mrope_sections
key format incompatible with this llama.cpp build).