Backend Monitor

LocalAI provides endpoints to monitor and manage running backends. The /backend/monitor endpoint reports the status and resource usage of loaded models, and /backend/shutdown allows stopping a model’s backend process.

Monitor API

  • Method: GET
  • Endpoints: /backend/monitor, /v1/backend/monitor

Request

The request body is JSON:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | Name of the model to monitor |

Response

Returns a JSON object with the backend status:

| Field | Type | Description |
|-------|------|-------------|
| `state` | int | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory` | object | Memory usage information |
| `memory.total` | uint64 | Total memory usage in bytes |
| `memory.breakdown` | object | Per-component memory breakdown (key-value pairs) |

If the gRPC status call fails, the endpoint falls back to local process metrics:

| Field | Type | Description |
|-------|------|-------------|
| `memory_info` | object | Process memory info (RSS, VMS) |
| `memory_percent` | float | Memory usage percentage |
| `cpu_percent` | float | CPU usage percentage |
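For illustration, a fallback response might look like the following (the values and the exact key casing are assumptions, not taken from LocalAI's output):

```json
{
  "memory_info": {
    "RSS": 524288000,
    "VMS": 1073741824
  },
  "memory_percent": 3.2,
  "cpu_percent": 1.5
}
```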

Usage

curl -X GET http://localhost:8080/backend/monitor \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'

Note the explicit `-X GET`: without it, curl switches to POST whenever `-d` is used.

Example response

{
  "state": 2,
  "memory": {
    "total": 1073741824,
    "breakdown": {
      "weights": 536870912,
      "kv_cache": 268435456
    }
  }
}
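Because `state` is numeric, a small helper can make monitoring scripts easier to read. A minimal sketch — the `state_name` function is hypothetical, not part of LocalAI, and the commented usage assumes `jq` is installed:

```shell
# Hypothetical helper: translate the numeric backend state into a label,
# following the state codes documented above.
state_name() {
  case "$1" in
    0)  echo "uninitialized" ;;
    1)  echo "busy" ;;
    2)  echo "ready" ;;
    -1) echo "error" ;;
    *)  echo "unknown" ;;
  esac
}

# Usage against a running server (assumes jq is installed):
# state=$(curl -s -X GET http://localhost:8080/backend/monitor \
#   -H "Content-Type: application/json" \
#   -d '{"model": "my-model"}' | jq '.state')
# echo "backend is $(state_name "$state")"
```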

Shutdown API

  • Method: POST
  • Endpoints: /backend/shutdown, /v1/backend/shutdown

Request

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | Name of the model to shut down |

Usage

curl -X POST http://localhost:8080/backend/shutdown \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'

Response

Returns `200 OK` with a shutdown confirmation message on success.

Error Responses

| Status Code | Description |
|-------------|-------------|
| 400 | Invalid or missing model name |
| 500 | Backend error or model not loaded |
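In scripts, the HTTP status code can be captured with curl's `-w '%{http_code}'` and branched on. A minimal sketch — the `handle_status` function and its labels are hypothetical, not part of LocalAI:

```shell
# Hypothetical helper: map the HTTP status codes documented above to a
# suggested next step for the caller.
handle_status() {
  case "$1" in
    200) echo "ok" ;;            # shutdown confirmed
    400) echo "fix-request" ;;   # invalid or missing model name
    500) echo "check-backend" ;; # backend error or model not loaded
    *)   echo "unexpected" ;;
  esac
}

# Usage against a running server:
# code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#   http://localhost:8080/backend/shutdown \
#   -H "Content-Type: application/json" -d '{"model": "my-model"}')
# handle_status "$code"
```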