Description
ai/gemma4:latest fails to load using Docker Model Runner on macOS Apple Silicon.
The failure occurs before a response is generated. It affects text prompts submitted through docker model run gemma4.
Reproduction
$ docker model run gemma4
> test
Actual behavior
background model preload failed: preload failed: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: 7 libmtmd.0.0.1.dylib 0x00000001003961f8 _ZN17clip_model_loader6warmupER8clip_ctx + 348
8 libmtmd.0.0.1.dylib 0x00000001003856d8 _Z9clip_initPKc19clip_context_params + 320
9 libmtmd.0.0.1.dylib 0x0000000100321c28 _ZN12mtmd_contextC2EPKcPK11llama_modelRK19mtmd_context_paramsb + 532
10 libmtmd.0.0.1.dylib 0x0000000100321700 _Z21mtmd_get_memory_usagePKc19mtmd_context_params + 100
11 libllama-server-impl.dylib 0x00000001012a5b4c _ZN19server_context_impl10load_modelER13common_params + 428
12 libllama-server-impl.dylib 0x00000001011eb0d8 _Z12llama_serveriPPc + 15120
13 dyld 0x000000018c503e00 start + 6992
ge_f32_batch + 748
5 libmtmd.0.0.1.dylib 0x00000001003a2ffc _ZN17clip_model_loader20reserve_compute_metaER8clip_ctxRK20clip_image_f32_batch + 256
6 libmtmd.0.0.1.dylib 0x000000010039da38 _ZN17clip_model_loader6warmupER8clip_ctxRK20clip_image_f32_batch + 148
Expected behavior
The model loads and generates a response.
Environment
- macOS 26.5.1, build 25F80
- Architecture: arm64
- Hardware: Apple M1 Pro
- Docker Desktop: 4.77.0
- Docker Engine client: 29.5.3
- Docker Model Runner client/server: v1.2.1
- Backend: llama.cpp latest-metal
- Backend digest: sha256:ad3d77500e8b5917f66a575d364a1264fd4eb999740bb1f52731f76b507c1cd9
- Backend version: 65ef50a
- Model: docker.io/ai/gemma4:latest
- Model digest: sha256:44aa4db60f30abffcbd669214e3949d685d5e7f384e0637fc4b7c56702e24a43
Additional information
The problem remains after uninstalling and reinstalling Docker Desktop.
Other locally installed models can generate responses. For example, llama3.1 works in the same environment.