If you’re looking for a web UI and a simple way to host one yourself, nothing beats the “llama.cpp” project. It includes a “llama-server” program that runs a simple web server with a chat web app and an OpenAI-compatible API endpoint. It now also supports multimodality (for models that support it), meaning you can, for example, upload an image and ask the assistant to describe it. An example command to start such a web server would be:
$ llama-server --threads 6 -m /path/to/model.gguf
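By default the server listens on http://localhost:8080, which is where the chat web app lives. Because the API is OpenAI-compatible, you can also query it from scripts or existing OpenAI clients. A minimal sketch with curl (assuming the default listen address):
$ curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages": [{"role": "user", "content": "Hello! What can you do?"}]}'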
To launch with multimodality support instead (like asking the AI to describe an image), also pass the model’s matching multimodal projector file with --mmproj:
$ llama-server --threads 6 --mmproj /path/to/model/mmproj-F16.gguf -m /path/to/model/model.gguf
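With the mmproj loaded, images can be sent through the same OpenAI-compatible endpoint as base64-encoded data URLs. A hedged sketch (again assuming the default http://localhost:8080; <BASE64_DATA> stands in for your actual base64-encoded image):
$ curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages": [{"role": "user", "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,<BASE64_DATA>"}}
          ]}]}'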