Try it out
Once LocalAI is installed, you can start it (either by using docker, or the cli, or the systemd service).
By default the LocalAI WebUI should be accessible from http://localhost:8080. You can also use 3rd party projects to interact with LocalAI as you would use OpenAI (see also Integrations ).
After installation, install new models by navigating the model gallery, or by using the local-ai CLI.
Tip
To install models with the WebUI, see the Models section.
With the CLI you can list the models with local-ai models list and install them with local-ai models install <model-name>.
You can also run models manually by copying files into the models directory.
You can test chat models from the CLI without keeping a separate curl command around:
local-ai chat connects to a running LocalAI server, opens an interactive chat prompt, and exits when you type /exit, /quit, or /bye. Use /models to list installed models, /model <name> to switch models, and /clear to reset the current conversation. If the server exposes exactly one model, LocalAI uses that model automatically:
When more than one model is configured, pass --model with the installed model name to avoid ambiguity. Use --endpoint to connect to a non-default server, for example local-ai chat --endpoint http://127.0.0.1:8081 --model gpt-4.
You can also test out the API endpoints using curl, few examples are listed below. The models we are referring here (gpt-4, gpt-4-vision-preview, tts-1, whisper-1) are examples - replace them with the model names you have installed.
Text Generation
Creates a model response for the given chat conversation. OpenAI documentation.
GPT Vision
Understand images.
Function calling
Call functions
Anthropic Messages API
LocalAI supports the Anthropic Messages API for Claude-compatible models. Anthropic documentation.
Open Responses API
LocalAI supports the Open Responses API specification with support for background processing, streaming, and advanced features. Open Responses documentation.
For background processing:
Then retrieve the response:
Image Generation
Creates an image given a prompt. OpenAI documentation.
Text to speech
Generates audio from the input text. OpenAI documentation.
Audio Transcription
Transcribes audio into the input language. OpenAI Documentation.
Download first a sample to transcribe:
Send the example audio file to the transcriptions endpoint :
Embeddings Generation
Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. OpenAI Embeddings.
Tip
Don’t use the model file as model in the request unless you want to handle the prompt template for yourself.
Use the model names like you would do with OpenAI like in the examples below. For instance gpt-4-vision-preview, or gpt-4.