Replace the Ollama service with a custom ROCm image combining ghcr.io/ggml-org/llama.cpp:server-rocm and llama-swap v199.

Main motivations:
- Unblock qwen35 HF GGUFs (qwen35 architecture not supported in Ollama 0.20.4 for HF-imported models)
- Stay current with llama.cpp upstream without waiting for Ollama releases

Changes:
- ollama/Dockerfile: build llama-swap on top of llama.cpp:server-rocm
- ollama/llama-swap.yaml: define 4 models with full sampler config, GPU offload, and mmproj for the two multimodal HF fine-tunes
- ollama/docker-compose.yml: replace Ollama image with local build; fix broken volume mount (was /ubuntu/.ollama, now explicit /models)
- ollama/Caddyfile: update upstream port 11434→8080 (llama-swap default)
- ai/docker-compose.yml: switch Open WebUI from OLLAMA_BASE_URL to OPENAI_API_BASE_URL pointing at llama-swap /v1 endpoint

Hedged sketches of the Dockerfile, the llama-swap config, and the Open WebUI change follow; the actual Caddyfile is listed after them.
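A minimal sketch of the Dockerfile layering llama-swap on the upstream ROCm server image. The release asset name, extraction layout, and CLI flag are assumptions, not verified against the v199 release:

    # Sketch only: llama-swap release artifact name and flags are assumed.
    FROM ghcr.io/ggml-org/llama.cpp:server-rocm

    # Fetch and unpack a llama-swap release binary (hypothetical asset name).
    ADD https://github.com/mostlygeek/llama-swap/releases/download/v199/llama-swap_linux_amd64.tar.gz /tmp/llama-swap.tar.gz
    RUN tar -xzf /tmp/llama-swap.tar.gz -C /usr/local/bin llama-swap \
     && rm /tmp/llama-swap.tar.gz

    COPY llama-swap.yaml /app/llama-swap.yaml

    # llama-swap listens on 8080 by default and starts/stops a llama-server
    # process per model on demand.
    EXPOSE 8080
    ENTRYPOINT ["/usr/local/bin/llama-swap", "-config", "/app/llama-swap.yaml"]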
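The llama-swap config maps model names to llama-server commands, with ${PORT} substituted by llama-swap at spawn time. A one-entry sketch; the model name, file paths, and sampler values are placeholders, not one of the four models actually defined in this commit:

    models:
      "qwen3-vl":
        # ${PORT} is filled in by llama-swap with the port it assigns.
        cmd: |
          /app/llama-server
          --port ${PORT}
          --model /models/qwen3-vl-Q4_K_M.gguf
          --mmproj /models/qwen3-vl-mmproj.gguf
          --n-gpu-layers 99
          --temp 0.7 --top-k 20 --top-p 0.8 --min-p 0.05
        ttl: 300   # unload after 5 minutes idle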
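On the Open WebUI side the change amounts to swapping one environment variable. The service hostname below is illustrative (whatever name resolves to llama-swap inside the compose network), and llama-swap does not check the API key by default, so any non-empty value should do:

    services:
      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        environment:
          # Was: OLLAMA_BASE_URL=http://ollama:11434
          - OPENAI_API_BASE_URL=http://llamaswap:8080/v1   # hostname assumed
          - OPENAI_API_KEY=sk-placeholder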
Caddyfile
{
    # Global options: ACME account email for Let's Encrypt.
    email {env.LETSENCRYPT_EMAIL}
}

*.lan.poldebra.me {
    # Wildcard certificate via DNS-01 challenge through the Namecheap plugin.
    tls {
        dns namecheap {
            api_key {env.NAMECHEAP_API_KEY}
            user {env.NAMECHEAP_API_USER}
            api_endpoint https://api.namecheap.com/xml.response
        }
        resolvers 1.1.1.1 8.8.8.8
    }

    # Route the ollama subdomain to llama-swap (8080, its default port).
    @ollama host ollama.lan.poldebra.me
    handle @ollama {
        header {
            X-Real-IP {remote_host}
            X-Forwarded-For {remote_host}
            X-Forwarded-Proto {scheme}
            X-Forwarded-Host {host}
            X-Forwarded-Port {server_port}
        }
        reverse_proxy 172.23.0.5:8080 {
            header_up X-Forwarded-Proto {scheme}
        }
    }
}
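A quick end-to-end check once everything is up, assuming DNS and the wildcard certificate resolve for ollama.lan.poldebra.me: llama-swap exposes the OpenAI-compatible /v1/models endpoint, so listing the configured models through the proxy confirms both hops are wired correctly.

    curl -s https://ollama.lan.poldebra.me/v1/models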