API Reference

oBeaver exposes an OpenAI-compatible HTTP API. Point any OpenAI SDK or compatible client at your local server.
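Because the API is OpenAI-compatible, requests can be built with nothing but the standard library. A minimal sketch (the helper names `build_chat_request` and `chat` are illustrative, not part of oBeaver; the address and port match the defaults used in the examples below):

```python
import json
import urllib.request

def build_chat_request(messages, model="Phi-4-mini", **options):
    """Build an OpenAI-style chat completion payload."""
    payload = {"model": model, "messages": messages}
    payload.update(options)  # e.g. max_tokens, temperature, stream
    return payload

def chat(messages, base_url="http://127.0.0.1:18000", **options):
    """POST the payload to the local server and return the parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(messages, **options)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any client that lets you override the base URL (the official OpenAI SDKs do) works the same way.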
Chat Server (`obeaver serve`)

| Method | Path | Description |
|---|---|---|
| GET | `/` | API server info and documentation link |
| GET | `/health` | Health check (includes model name and engine label) |
| GET | `/v1/models` | List the loaded model |
| POST | `/v1/chat/completions` | Chat completions (streaming and non-streaming) |
| GET | `/api/system/memory` | CPU, GPU, NPU, and process memory statistics |
| GET | `/api/models/available` | List all cached models available for switching |
| POST | `/api/models/load` | Hot-swap the active model at runtime |
Embedding Server (`obeaver serve-embed`)

| Method | Path | Description |
|---|---|---|
| GET | `/` | API server info and documentation link |
| GET | `/health` | Health check |
| GET | `/v1/models` | List the loaded model |
| POST | `/v1/embeddings` | Compute text embeddings |
| GET | `/api/system/memory` | CPU, GPU, NPU, and process memory statistics |
Dashboard Server (`obeaver dashboard`)

The web dashboard (chat UI + real-time memory gauges) is only available via `obeaver dashboard`; it is not served by `obeaver serve` or `obeaver serve-embed`.

| Method | Path | Description |
|---|---|---|
| GET | `/` | Redirect to the web dashboard |
| GET | `/health` | Health check (includes model name and engine label) |
| GET | `/v1/models` | List the loaded model |
| POST | `/v1/chat/completions` | Chat completions (streaming and non-streaming) |
| GET | `/api/system/memory` | CPU, GPU, NPU, and process memory statistics |
| GET | `/api/models/available` | List all cached models available for switching |
| POST | `/api/models/load` | Hot-swap the active model at runtime |
| GET | `/static/index.html` | Web dashboard UI with real-time memory gauges and chat |
POST /v1/chat/completions

Create a chat completion. Supports both streaming (SSE) and non-streaming modes.

Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | `""` | Model identifier (informational) |
| `messages` | array | (required) | OpenAI message array (role + content) |
| `stream` | bool | `false` | Enable SSE streaming |
| `max_tokens` | int | `1024` | Maximum number of tokens to generate |
| `temperature` | float | `1.0` | Sampling temperature |
| `top_p` | float | `1.0` | Nucleus (top-p) sampling threshold |
| `top_k` | int | `50` | Top-k sampling |
| `repetition_penalty` | float | `1.0` | Repetition penalty (`1.0` = off) |
| `tools` | array | `null` | OpenAI function definitions |
| `tool_choice` | string \| object | `null` | Tool selection strategy |
Non-streaming Example

```bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is ONNX Runtime?"}
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
```
Streaming Example

```bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'
```
Tool Calling Example

```bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```
POST /v1/embeddings

Compute text embeddings. Available on the embedding server (`obeaver serve-embed`).

Request Body

| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model identifier (informational) |
| `input` | string \| array | Text string or array of strings to embed |
```bash
curl http://127.0.0.1:18001/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-Embedding-0.6B",
    "input": ["Hello, world!", "Embeddings are useful."]
  }'
```
GET /health

Returns the health status of the server, including the loaded model name and engine type.

```bash
curl http://127.0.0.1:18000/health
```
GET /api/system/memory

Returns CPU, GPU, NPU, and process memory statistics as JSON. This powers the web dashboard's real-time gauges.

```bash
curl http://127.0.0.1:18000/api/system/memory
```
Model Management

GET /api/models/available

Lists all cached models available for hot-swapping.

POST /api/models/load

Hot-swaps the active model at runtime without restarting the server.
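The request schema for `/api/models/load` is not documented in this section. As an illustration only, assuming the body names the target model with a `model` field (an assumption to verify against the server), a request could be built like this:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18000"

def build_load_request(model_name):
    """Build a hot-swap request.

    NOTE: the `model` field name is an assumption for illustration;
    check the server's actual request schema before relying on it.
    """
    return urllib.request.Request(
        f"{BASE_URL}/api/models/load",
        data=json.dumps({"model": model_name}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_load_request("Phi-4-mini")
print(req.get_method(), req.full_url)
```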