| 1 | +# Snapdragon Accelerated llama.cpp Android GUI Wrapper |
| 2 | + |
| 3 | +This project is a Jetpack Compose Android GUI for running a prebuilt `llama-server` executable from llama.cpp on Android devices. It does not compile llama.cpp source code inside this Gradle project and it does not call llama.cpp through JNI. Instead, the app packages prebuilt native artifacts, restores them into app-private storage at runtime, and launches `llama-server` as a child process. |
| 4 | + |
| 5 | +The wrapper is designed around Snapdragon-oriented builds of llama.cpp, including CPU, OpenCL, and Hexagon/HTP acceleration libraries. |
| 6 | + |
| 7 | +## What this app does |
| 8 | + |
| 9 | +- Provides a Compose UI for selecting a model and configuring server options. |
| 10 | +- Persists server configuration to app-private JSON storage. |
| 11 | +- Packages prebuilt llama.cpp native libraries and the `llama-server` executable into the APK. |
| 12 | +- Extracts or symlinks those artifacts into app-private runtime directories. |
| 13 | +- Starts and stops `llama-server` with `ProcessBuilder`. |
| 14 | +- Monitors server output to report whether the HTTP server is running. |
| 15 | + |
| 16 | +## Project layout |
| 17 | + |
| 18 | +```text |
| 19 | +android-gui-wrapper/ |
| 20 | +├── app/ |
| 21 | +│ ├── build.gradle.kts |
| 22 | +│ └── src/main/ |
| 23 | +│ ├── AndroidManifest.xml |
| 24 | +│ ├── java/com/example/llamaserver/ |
| 25 | +│ │ ├── MainActivity.kt |
| 26 | +│ │ ├── LlamaServerApplication.kt |
| 27 | +│ │ ├── data/ |
| 28 | +│ │ ├── di/ |
| 29 | +│ │ ├── model/ |
| 30 | +│ │ ├── repository/ |
| 31 | +│ │ ├── service/ |
| 32 | +│ │ ├── ui/ |
| 33 | +│ │ ├── util/ |
| 34 | +│ │ └── viewmodel/ |
| 35 | +│ ├── jniLibs/arm64-v8a/ |
| 36 | +│ └── res/ |
| 37 | +├── build.gradle.kts |
| 38 | +├── copy-native-libs.sh |
| 39 | +├── Dockerfile |
| 40 | +├── gradle.properties |
| 41 | +├── gradle/ |
| 42 | +└── settings.gradle.kts |
| 43 | +``` |
| 44 | + |
| 45 | +Important files: |
| 46 | + |
| 47 | +- `copy-native-libs.sh` copies prebuilt llama.cpp artifacts into `app/src/main/jniLibs/arm64-v8a`. |
| 48 | +- `app/src/main/java/com/example/llamaserver/util/BinaryExtractor.kt` prepares the packaged executable and libraries for runtime use. |
| 49 | +- `app/src/main/java/com/example/llamaserver/service/ServerProcessManager.kt` builds the `llama-server` command and starts the process. |
| 50 | +- `app/src/main/java/com/example/llamaserver/ui/launcher/AllConfigScreen.kt` contains the main configuration UI. |
| 51 | +- `app/src/main/java/com/example/llamaserver/ui/runtime/RuntimeScreen.kt` contains the runtime status, start/stop controls, and logs. |
| 52 | +- `app/src/main/java/com/example/llamaserver/model/ServerConfig.kt` defines the app's configuration contract with `llama-server`. |
| 53 | + |
| 54 | +## Build prerequisites |
| 55 | + |
| 56 | +Before building this Android app, the llama.cpp native artifacts must already exist under: |
| 57 | + |
| 58 | +```text |
| 59 | +../pkg-snapdragon/llama.cpp/ |
| 60 | +├── bin/llama-server |
| 61 | +└── lib/*.so |
| 62 | +``` |
| 63 | + |
| 64 | +The wrapper expects these artifacts to have been built beforehand, outside this Gradle project. This Android project only packages and launches them. |
| 65 | + |
| 66 | +After building or installing the native artifacts, run the copy script. The command below is written for the llama.cpp root; from this folder, run `./copy-native-libs.sh` instead: |
| 67 | + |
| 68 | +```bash |
| 69 | +./android-gui-wrapper/copy-native-libs.sh |
| 70 | +``` |
| 71 | + |
| 72 | +The script copies libraries into: |
| 73 | + |
| 74 | +```text |
| 75 | +app/src/main/jniLibs/arm64-v8a/ |
| 76 | +``` |
| 77 | + |
| 78 | +It also copies `bin/llama-server` as: |
| 79 | + |
| 80 | +```text |
| 81 | +app/src/main/jniLibs/arm64-v8a/libllama-server.so |
| 82 | +``` |
| 83 | + |
| 84 | +This rename is intentional: Android automatically packages and extracts `.so` files from `jniLibs`, so the executable is stored with a library-style filename during packaging and restored to `llama-server` at runtime. |
| 85 | + |
| 86 | +## Building the APK |
| 87 | + |
| 88 | +The APK can be built inside the Android builder container. Replace `/d/llama.cpp/android-gui-wrapper` with the path to this folder on your machine: |
| 89 | + |
| 90 | +```bash |
| 91 | +docker run --rm \ |
| 92 | + -v /d/llama.cpp/android-gui-wrapper:/source \ |
| 93 | + -w /source \ |
| 94 | + llama-android-builder:latest \ |
| 95 | + bash -c "rm -rf .gradle app/.gradle app/build && export ANDROID_HOME=/opt/android-sdk && gradle assembleDebug --no-daemon --parallel" |
| 96 | +``` |
| 97 | + |
| 98 | +The debug APK is produced at: |
| 99 | + |
| 100 | +```text |
| 101 | +app/build/outputs/apk/debug/app-debug.apk |
| 102 | +``` |
| 103 | + |
| 104 | +The Gradle project is a single Android application module: |
| 105 | + |
| 106 | +- root project: `Snapdragon Accelerated lcpp` |
| 107 | +- module: `:app` |
| 108 | +- namespace/application ID: `tridefender.llama.snapdragon` |
| 109 | +- ABI: `arm64-v8a` |
| 110 | +- compile SDK: 34 |
| 111 | +- min SDK: 24 |
| 112 | + |
| 113 | +## Release signing |
| 114 | + |
| 115 | +Release builds require a signing keystore. The keystore and credentials are **not** included in the repository — you must generate your own. |
| 116 | + |
| 117 | +### Setup |
| 118 | + |
| 119 | +1. **Generate a keystore** (one-time): |
| 120 | + |
| 121 | + Linux/macOS: |
| 122 | + ```bash |
| 123 | + ./generate-keystore.sh |
| 124 | + ``` |
| 125 | + |
| 126 | + Windows: |
| 127 | + ```cmd |
| 128 | + generate-keystore.bat |
| 129 | + ``` |
| 130 | + |
| 131 | +2. **Create your signing config**: |
| 132 | + |
| 133 | + ```bash |
| 134 | + cp keystore.properties.example keystore.properties |
| 135 | + ``` |
| 136 | + |
| 137 | + Edit `keystore.properties` and fill in the password you chose during keystore generation. |
| 138 | + |
| 139 | +3. **Build a signed release APK**: |
| 140 | + |
| 141 | + ```bash |
| 142 | + docker run --rm \ |
| 143 | + -v /d/llama.cpp/android-gui-wrapper:/source \ |
| 144 | + -w /source \ |
| 145 | + llama-android-builder:latest \ |
| 146 | + bash -c "rm -rf app/build && export ANDROID_HOME=/opt/android-sdk && gradle assembleRelease --no-daemon" |
| 147 | + ``` |
| 148 | + |
| 149 | + The signed APK is produced at `app/build/outputs/apk/release/app-release.apk`. |
| 150 | + |
| 150 | +> **Note:** `keystore.properties` and `*.keystore` files are gitignored. Debug builds work without a signing config (they fall back to the default debug key). |
| 152 | +
|
| 153 | +## Runtime flow |
| 154 | + |
| 155 | +At app startup, `MainActivity` opens the Compose UI: |
| 156 | + |
| 157 | +```text |
| 158 | +MainActivity |
| 159 | +└── MainScreen |
| 160 | + ├── Config tab -> AllConfigScreen |
| 161 | + └── Runtime tab -> RuntimeScreen |
| 162 | +``` |
| 163 | + |
| 164 | +When the user starts the server: |
| 165 | + |
| 166 | +```text |
| 167 | +RuntimeScreen |
| 168 | +└── RuntimeViewModel.startServer() |
| 169 | + └── ServerProcessManager.startServer(config) |
| 170 | + ├── BinaryExtractor.ensureAllAvailable(context) |
| 171 | + ├── build llama-server command line |
| 172 | + ├── configure process environment |
| 173 | + ├── ProcessBuilder(...).start() |
| 174 | + └── read stdout/stderr for server status |
| 175 | +``` |
| 176 | + |
| 177 | +`BinaryExtractor` restores runtime files under app-private storage: |
| 178 | + |
| 179 | +```text |
| 180 | +filesDir/bin/llama-server |
| 181 | +filesDir/lib/*.so |
| 182 | +``` |
| 183 | + |
| 184 | +It first attempts to create symlinks from Android's extracted native library directory. If symlinking fails, it copies the files. The executable is marked with `chmod 755`. |
| 185 | + |
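The symlink-then-copy fallback described above can be sketched as follows. This is a minimal illustration, not the project's `BinaryExtractor`: the helper names `linkOrCopy` and `restoreExecutable` are hypothetical, and the sketch uses `java.nio.file`, which on Android is only available from API 26, so the real implementation may rely on a different mechanism.

```kotlin
import java.io.File
import java.nio.file.Files
import java.nio.file.StandardCopyOption

// Hypothetical helper (not the project's API): tries to create a symlink at
// `target` pointing to `source`, and falls back to copying if that fails.
// Returns true when a symlink was created, false when the file was copied.
fun linkOrCopy(source: File, target: File): Boolean {
    target.parentFile?.mkdirs()
    Files.deleteIfExists(target.toPath()) // drop a stale link or an old copy
    return try {
        Files.createSymbolicLink(target.toPath(), source.toPath().toAbsolutePath())
        true
    } catch (e: Exception) {
        // Symlinks unsupported or refused: copy the bytes instead.
        Files.copy(source.toPath(), target.toPath(), StandardCopyOption.REPLACE_EXISTING)
        false
    }
}

// Restores the packaged libllama-server.so as filesDir/bin/llama-server and
// requests rwxr-xr-x (755). Return values of the permission calls are ignored;
// for a symlink they apply to the link target and may be refused.
fun restoreExecutable(nativeLibDir: File, filesDir: File): File {
    val packaged = File(nativeLibDir, "libllama-server.so")
    val restored = File(File(filesDir, "bin"), "llama-server")
    linkOrCopy(packaged, restored)
    restored.setReadable(true, false)   // read for everyone
    restored.setWritable(false, false)  // clear write for everyone...
    restored.setWritable(true, true)    // ...then grant write to the owner only
    restored.setExecutable(true, false) // execute for everyone
    return restored
}
```

In an app, `nativeLibDir` would typically come from `applicationInfo.nativeLibraryDir` and `filesDir` from `context.filesDir`; both are passed in here so the sketch can run on a plain JVM.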
| 186 | +`ServerProcessManager` configures library search paths before launching the process: |
| 187 | + |
| 188 | +```text |
| 189 | +LD_LIBRARY_PATH=<filesDir>:<filesDir>/lib:/system/lib64:/system/vendor/lib64:/vendor/lib64:/vendor/dsp/cdsp |
| 190 | +ADSP_LIBRARY_PATH=<filesDir>/lib |
| 191 | +GGML_HEXAGON_EXPERIMENTAL=1 # for HTP devices |
| 192 | +``` |
| 193 | + |
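As a rough sketch of how these variables could be assembled, the function below builds the same three entries from a `filesDir`. The function name `serverEnvironment` and the `useHtp` flag are assumptions for illustration; the README only states that `GGML_HEXAGON_EXPERIMENTAL` is set for HTP devices.

```kotlin
import java.io.File

// Hypothetical helper: returns the environment entries listed above.
// `useHtp` is an assumed switch for "an HTP device is selected".
fun serverEnvironment(filesDir: File, useHtp: Boolean): Map<String, String> {
    val libDir = File(filesDir, "lib").path
    val searchPath = listOf(
        filesDir.path,
        libDir,
        "/system/lib64",
        "/system/vendor/lib64",
        "/vendor/lib64",
        "/vendor/dsp/cdsp"
    ).joinToString(":")

    val env = linkedMapOf(
        "LD_LIBRARY_PATH" to searchPath,
        "ADSP_LIBRARY_PATH" to libDir
    )
    if (useHtp) env["GGML_HEXAGON_EXPERIMENTAL"] = "1"
    return env
}

// Copies the entries into a ProcessBuilder's environment before start().
fun applyEnvironment(builder: ProcessBuilder, env: Map<String, String>) {
    builder.environment().putAll(env)
}
```

Keeping the map construction separate from `ProcessBuilder` makes the search-path order easy to check in a unit test without launching a process.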
| 194 | +## Configuration model |
| 195 | + |
| 196 | +`ServerConfig` controls the generated `llama-server` command. It includes options for: |
| 197 | + |
| 198 | +- model path |
| 199 | +- context size |
| 200 | +- batch size |
| 201 | +- prediction tokens |
| 202 | +- embedding mode and pooling type |
| 203 | +- device type: CPU, OpenCL, or HTP0-HTP4 |
| 204 | +- KV cache types |
| 205 | +- KV offload |
| 206 | +- flash attention |
| 207 | +- auto-fit settings |
| 208 | +- server port and bind behavior |
| 209 | +- API key |
| 210 | +- timeout |
| 211 | +- thread settings |
| 212 | +- continuous batching |
| 213 | + |
| 214 | +The active configuration is persisted as JSON at runtime by `ConfigRepositoryImpl`. |
| 215 | + |
| 216 | +## Native process command |
| 217 | + |
| 218 | +The generated command has this general shape: |
| 219 | + |
| 220 | +```text |
| 221 | +filesDir/bin/llama-server |
| 222 | + --model <model-path> |
| 223 | + --poll 1000 |
| 224 | + --ctx-size <context-size> |
| 225 | + --batch-size <batch-size> |
| 226 | + --cache-type-k <type> |
| 227 | + --cache-type-v <type> |
| 228 | + [--device opencl|htp0|htp1|htp2|htp3|htp4] |
| 229 | + [--flash-attn on|off] |
| 230 | + [-ngl 99] |
| 231 | + [--kv-offload] |
| 232 | + [--cont-batching] |
| 233 | + [--fit on --fit-target <MiB> --fit-ctx <tokens>] |
| 234 | + [--port <port>] |
| 235 | + [--host 0.0.0.0] |
| 236 | + [--api-key <key>] |
| 237 | + [--timeout <seconds>] |
| 238 | + [--predict <tokens>] |
| 239 | +``` |
| 240 | + |
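To make the command shape concrete, here is a minimal sketch of an argument-list builder covering a subset of the flags above. `LaunchOptions` and `buildCommand` are hypothetical names, not the project's `ServerConfig` or `ServerProcessManager` API, and the conditions under which each optional flag is emitted (for example, adding `-ngl 99` whenever a non-CPU device is chosen) are assumptions.

```kotlin
// Hypothetical options holder for illustration; the real ServerConfig has more fields.
data class LaunchOptions(
    val modelPath: String,
    val ctxSize: Int,
    val batchSize: Int,
    val cacheTypeK: String,
    val cacheTypeV: String,
    val device: String? = null,            // null means CPU; otherwise e.g. "opencl" or "htp0"
    val port: Int? = null,
    val bindAllInterfaces: Boolean = false,
    val apiKey: String? = null
)

// Builds the argument list in the order shown above. Each flag and its value
// are separate list elements, which is the form ProcessBuilder expects.
fun buildCommand(binaryPath: String, o: LaunchOptions): List<String> {
    val cmd = mutableListOf(
        binaryPath,
        "--model", o.modelPath,
        "--poll", "1000",
        "--ctx-size", o.ctxSize.toString(),
        "--batch-size", o.batchSize.toString(),
        "--cache-type-k", o.cacheTypeK,
        "--cache-type-v", o.cacheTypeV
    )
    // Assumption: layer offload is requested only together with an accelerator.
    o.device?.let { cmd += listOf("--device", it, "-ngl", "99") }
    o.port?.let { cmd += listOf("--port", it.toString()) }
    if (o.bindAllInterfaces) cmd += listOf("--host", "0.0.0.0")
    o.apiKey?.takeIf { it.isNotBlank() }?.let { cmd += listOf("--api-key", it) }
    return cmd
}
```

Passing arguments as a list rather than a single string avoids shell quoting problems when a model path contains spaces.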
| 241 | +The app treats stdout/stderr as the process status channel. It marks the server as running when output contains messages such as `HTTP server is listening` or `llama server listening`. |
| 242 | + |
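The output-based status check might look like the sketch below. The two marker strings are quoted from the paragraph above; the function names, the case-insensitive match, and the callback shape are assumptions rather than the app's actual code.

```kotlin
import java.io.BufferedReader

// Marker strings taken from the README text; matching is case-insensitive by assumption.
val readyMarkers = listOf("HTTP server is listening", "llama server listening")

fun isReadyLine(line: String): Boolean =
    readyMarkers.any { line.contains(it, ignoreCase = true) }

// Reads process output until end of stream. Every line goes to `onLine`
// (for example, a log view); `onReady` fires once, on the first marker line.
fun watchOutput(reader: BufferedReader, onLine: (String) -> Unit, onReady: () -> Unit) {
    var ready = false
    reader.forEachLine { line ->
        onLine(line)
        if (!ready && isReadyLine(line)) {
            ready = true
            onReady()
        }
    }
}
```

With `ProcessBuilder.redirectErrorStream(true)`, stdout and stderr arrive on a single stream, so one call such as `watchOutput(process.inputStream.bufferedReader(), ...)` on a background thread would cover both channels.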
| 243 | +## Notes and known sharp edges |
| 244 | + |
| 245 | +- `app/src/androidTest/java/tridefender/llama/snapdragon/util/BinaryExtractorTest.kt` appears to describe an older `BinaryExtractor` API. The current implementation is an `object` with `ensureAllAvailable`, `getBinaryPath`, and `getLibraryDir`. |
| 246 | +- This wrapper only packages `arm64-v8a` artifacts. Other ABIs would need matching native builds and Gradle configuration changes. |
| 247 | +- Model files are selected through Android's document picker. `UriUtils` attempts to resolve `content://` URIs to file paths and falls back to copying where possible. |
| 248 | + |
| 249 | +## Development guide |
| 250 | + |
| 251 | +Common places to edit: |
| 252 | + |
| 253 | +- Add or change UI settings: `AllConfigScreen.kt`, `ModelConfigViewModel.kt`, and `ServerConfig.kt` |
| 254 | +- Change generated llama-server flags: `ServerProcessManager.kt` |
| 255 | +- Change binary/library preparation: `BinaryExtractor.kt` and `copy-native-libs.sh` |
| 256 | +- Change runtime controls or logs: `RuntimeScreen.kt` and `RuntimeViewModel.kt` |
| 257 | +- Change Gradle packaging or ABI behavior: `app/build.gradle.kts` |
| 258 | + |
| 259 | +Keep in mind that the Android app is only a wrapper. Any llama.cpp feature used here must be supported by the prebuilt `llama-server` binary that is packaged into the APK. |