LocalAI supports generating embeddings for text or for a list of tokens.

For the API documentation, refer to the OpenAI embeddings reference: https://platform.openai.com/docs/api-reference/embeddings
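
As a rough illustration of the request and response shapes described in the OpenAI reference, the sketch below builds a request body and reads vectors out of a response. The helper names `build_embedding_request` and `extract_vectors` are hypothetical, and the sample response is hand-written with made-up values rather than real model output. It assumes the response follows the OpenAI layout (a `data` list whose items carry `index` and `embedding`); whether a given backend accepts every input form is not confirmed here.

```python
import json


def build_embedding_request(model, text_or_tokens):
    """Return the JSON body for an OpenAI-style embeddings request.

    `text_or_tokens` may be a single string, a list of strings,
    or a list of token ids, as described in the OpenAI reference.
    """
    return json.dumps({"model": model, "input": text_or_tokens})


def extract_vectors(response_body):
    """Return the embedding vectors from a response, ordered by `index`."""
    payload = json.loads(response_body)
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hand-written sample response (illustrative values, not real model output).
sample_response = json.dumps({
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "text-embedding-ada-002",
})

body = build_embedding_request(
    "text-embedding-ada-002", ["first text", "second text"]
)
vectors = extract_vectors(sample_response)
print(body)
print(vectors)
```

Sorting by `index` keeps each vector aligned with the input it was computed from, even if the items arrive out of order.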

Model compatibility

The embedding endpoint is compatible with llama.cpp models, bert.cpp models, and sentence-transformers models available on Hugging Face.

Manual Setup

Create a YAML config file in the models directory that specifies the backend and the model file.

name: text-embedding-ada-002 # The model name used in the API
parameters:
  model: <model_file>
backend: "<backend>"
embeddings: true
# .. other parameters
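
If several such config files need to be created, the template above could be filled in by a small script. This is only a sketch: `render_config` and `write_config` are hypothetical helpers, and naming the file `<name>.yaml` is an assumption, since the text above only requires a YAML file in the models directory.

```python
from pathlib import Path

# Mirrors the fields of the config shown above; nothing else is added.
CONFIG_TEMPLATE = """\
name: {name}
parameters:
  model: {model_file}
backend: "{backend}"
embeddings: true
"""


def render_config(name, backend, model_file):
    """Fill the template with the model name, backend, and model file."""
    return CONFIG_TEMPLATE.format(
        name=name, backend=backend, model_file=model_file
    )


def write_config(models_dir, name, backend, model_file):
    """Write the config to <models_dir>/<name>.yaml (assumed file name)."""
    path = Path(models_dir) / f"{name}.yaml"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_config(name, backend, model_file))
    return path


config_text = render_config("text-embedding-ada-002", "bert-embeddings", "bert")
print(config_text)
```

Calling `write_config("models", "text-embedding-ada-002", "bert-embeddings", "bert")` would then place the rendered file in the `models` directory.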
  

Bert embeddings

To use bert.cpp models, use the bert-embeddings backend.

An example model config file:

name: text-embedding-ada-002
parameters:
  model: bert
backend: bert-embeddings
embeddings: true
# .. other parameters
  

The bert-embeddings backend uses bert.cpp and loads ggml models.

For instance, you can download the ggml-quantized version of all-MiniLM-L6-v2 from https://huggingface.co/skeskinen/ggml:

wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
  

To test locally (with the LocalAI server running on localhost), you can use curl and pipe the output to jq to pretty-print it:

curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{
  "input": "Your text string goes here",
  "model": "text-embedding-ada-002"
}' | jq "."
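
The same request can be sketched in Python using only the standard library. The helper names `make_embeddings_request` and `fetch_embedding` are hypothetical; the URL, header, and body mirror the curl command above. Reading the vector from `data[0].embedding` assumes the response follows the OpenAI embeddings layout.

```python
import json
import urllib.request


def make_embeddings_request(text, model, base_url="http://localhost:8080"):
    """Build the POST request that the curl command above sends."""
    body = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(text, model, base_url="http://localhost:8080"):
    """Send the request and return the first embedding vector.

    Needs a running LocalAI server; assumes an OpenAI-style response.
    """
    request = make_embeddings_request(text, model, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["data"][0]["embedding"]


# Building the request does not touch the network.
request = make_embeddings_request(
    "Your text string goes here", "text-embedding-ada-002"
)
print(request.get_method(), request.full_url)
```

With the server running, `fetch_embedding("Your text string goes here", "text-embedding-ada-002")` would return the vector as a list of floats.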
  

Huggingface embeddings

To use sentence-transformers models from Hugging Face, use the sentencetransformers backend.

name: text-embedding-ada-002
backend: sentencetransformers
embeddings: true
parameters:
  model: all-MiniLM-L6-v2
  

The sentencetransformers backend uses the Python sentence-transformers library. For a list of all available pre-trained models, see: https://github.com/UKPLab/sentence-transformers#pre-trained-models

Llama.cpp embeddings

Embeddings with llama.cpp are supported through the llama backend.

name: my-awesome-model
backend: llama
embeddings: true
parameters:
  model: ggml-file.bin
# ...
  

💡 Examples

  • Example that uses LlamaIndex with LocalAI as the embedding provider: here.

Last updated 19 Jan 2024, 19:23 +0100