API Reference

oBeaver exposes an OpenAI-compatible HTTP API. Point any OpenAI SDK or compatible client at your local server.

Chat Server (obeaver serve)

| Method | Path | Description |
|--------|------|-------------|
| GET | / | API server info and documentation link |
| GET | /health | Health check (includes model name and engine label) |
| GET | /v1/models | List loaded model |
| POST | /v1/chat/completions | Chat completions (streaming + non-streaming) |
| GET | /api/system/memory | CPU, GPU, NPU, and process memory statistics |
| GET | /api/models/available | List all cached models available for switching |
| POST | /api/models/load | Hot-swap the active model at runtime |

Embedding Server (obeaver serve-embed)

| Method | Path | Description |
|--------|------|-------------|
| GET | / | API server info and documentation link |
| GET | /health | Health check |
| GET | /v1/models | List loaded model |
| POST | /v1/embeddings | Compute text embeddings |
| GET | /api/system/memory | CPU, GPU, NPU, and process memory statistics |

Dashboard Server (obeaver dashboard)

The web dashboard (chat UI + real-time memory gauges) is only available via obeaver dashboard. It is not served by obeaver serve or obeaver serve-embed.

| Method | Path | Description |
|--------|------|-------------|
| GET | / | Redirect to the web dashboard |
| GET | /health | Health check (includes model name and engine label) |
| GET | /v1/models | List loaded model |
| POST | /v1/chat/completions | Chat completions (streaming + non-streaming) |
| GET | /api/system/memory | CPU, GPU, NPU, and process memory statistics |
| GET | /api/models/available | List all cached models available for switching |
| POST | /api/models/load | Hot-swap the active model at runtime |
| GET | /static/index.html | Web dashboard UI with real-time memory gauges and chat |

POST /v1/chat/completions

Create a chat completion. Supports both streaming (SSE) and non-streaming modes.

Request Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model | string | "" | Model identifier (informational) |
| messages | array | (required) | OpenAI message array (role + content) |
| stream | bool | false | Enable SSE streaming |
| max_tokens | int | 1024 | Max tokens to generate |
| temperature | float | 1.0 | Sampling temperature |
| top_p | float | 1.0 | Nucleus sampling |
| top_k | int | 50 | Top-k sampling |
| repetition_penalty | float | 1.0 | Repetition penalty (1.0 = off) |
| tools | array | null | OpenAI function definitions |
| tool_choice | string or object | null | Tool selection strategy |
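
The defaults in the table can be mirrored on the client side so that a request body only carries the fields a caller actually changes. The following is a minimal sketch in Python, not part of oBeaver: `DEFAULTS` and `build_chat_request` are hypothetical names, and the rejection of unknown parameters is a client-side choice, not documented server behavior.

```python
# Defaults copied from the parameter table above.
DEFAULTS = {
    "model": "",
    "stream": False,
    "max_tokens": 1024,
    "temperature": 1.0,
    "top_p": 1.0,
    "top_k": 50,
    "repetition_penalty": 1.0,
    "tools": None,
    "tool_choice": None,
}


def build_chat_request(messages, **params):
    """Return a request body containing `messages` plus any non-default params."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    unknown = set(params) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    body = {"messages": list(messages)}
    for name, value in params.items():
        # Omit values that equal the documented default to keep the body small.
        if value != DEFAULTS[name]:
            body[name] = value
    return body
```

For example, `build_chat_request(msgs, max_tokens=512, temperature=1.0)` produces a body with `messages` and `max_tokens` only, since the temperature matches the default.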

Non-streaming Example

bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is ONNX Runtime?"}
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
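
For readers who prefer a script to curl, the same non-streaming request can be sketched in Python using only the standard library. This is an illustration, not oBeaver code: the helper names `make_request`, `extract_reply`, and `chat` are hypothetical, and the `choices[0].message.content` response layout is assumed from the OpenAI chat completions format that the server is described as compatible with.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18000"  # same host and port as the curl example


def make_request(body, base_url=BASE_URL):
    """Build (but do not send) the POST request for /v1/chat/completions."""
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response):
    """Return the assistant text from an OpenAI-style response dict.

    Assumption: the reply lives at choices[0].message.content.
    """
    return response["choices"][0]["message"]["content"]


def chat(body, base_url=BASE_URL, timeout=120):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(make_request(body, base_url), timeout=timeout) as resp:
        return extract_reply(json.load(resp))
```

With the chat server running, `chat({"model": "Phi-4-mini", "messages": [{"role": "user", "content": "What is ONNX Runtime?"}], "max_tokens": 512})` would send the equivalent of the curl call above.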

Streaming Example

bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'
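
With `"stream": true` the response arrives as server-sent events rather than a single JSON document. The sketch below shows one way a client might reassemble the text. It assumes the OpenAI streaming conventions (each event is a `data: {...}` line whose `choices[].delta.content` holds a text fragment, and the stream ends with `data: [DONE]`); this reference does not spell out the chunk format, so treat those details as assumptions. `iter_stream_text` is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The response object returned by `urllib.request.urlopen` can be iterated line by line, so it can be passed to `iter_stream_text` directly and each fragment printed as it arrives.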

Tool Calling Example

bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
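
When the model decides to call a tool, the client is responsible for running it and sending the result back in a follow-up request. The sketch below shows that client-side step under the assumption that the response follows the OpenAI tool-calling layout (`message.tool_calls[]` with an `id` and a `function` object whose `arguments` field is a JSON string). The `get_weather` body and the `run_tool_calls` helper are hypothetical placeholders, not oBeaver APIs.

```python
import json


def get_weather(city):
    """Hypothetical local implementation of the get_weather tool."""
    return {"city": city, "forecast": "unknown"}


# Maps tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(message, tools=TOOLS):
    """Run every tool call in an assistant message.

    Returns the role="tool" messages to append to the conversation
    before asking the model for its final answer.
    """
    results = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"] or "{}")
        output = tools[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results
```

A message without `tool_calls` yields an empty list, so the same code path handles plain text replies.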

POST /v1/embeddings

Compute text embeddings. Available on the embedding server (obeaver serve-embed).

Request Body

| Parameter | Type | Description |
|-----------|------|-------------|
| model | string | Model identifier (informational) |
| input | string or array | Text string or array of strings to embed |

bash
curl http://127.0.0.1:18001/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-Embedding-0.6B",
    "input": ["Hello, world!", "Embeddings are useful."]
  }'
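
A common next step is to compare the returned vectors. The sketch below assumes the OpenAI embeddings response layout, where each entry of `data` carries an `index` and an `embedding` list; this reference does not show the response body, so that layout is an assumption. `extract_vectors` and `cosine_similarity` are hypothetical helper names.

```python
import math


def extract_vectors(response):
    """Return the embedding vectors ordered by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Sorting by `index` keeps the vectors aligned with the order of the `input` array even if the server returns them out of order.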

GET /health

Returns the health status of the server, including the loaded model name and engine type.

bash
curl http://127.0.0.1:18000/health
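
Because loading a model can take a while, scripts often poll this endpoint before sending the first request. The following is a minimal sketch that relies only on the HTTP status code, since the response fields are not listed here. `is_healthy` and `wait_until_healthy` are hypothetical helper names, and treating HTTP 200 as "ready" is an assumption.

```python
import time
import urllib.error
import urllib.request


def is_healthy(base_url="http://127.0.0.1:18000", timeout=2.0):
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


def wait_until_healthy(probe=is_healthy, attempts=30, delay=1.0, sleep=time.sleep):
    """Call `probe` until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return False
```

Passing `probe` and `sleep` as parameters keeps the retry loop easy to exercise without a live server.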

GET /api/system/memory

Returns CPU, GPU, NPU, and process memory statistics as JSON. This powers the web dashboard's real-time gauges.

bash
curl http://127.0.0.1:18000/api/system/memory
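
The field names in the JSON payload are not listed in this reference, so the sketch below avoids assuming any schema: it fetches the document and flattens whatever nested objects it finds into dotted keys suitable for logging or a simple text gauge. `fetch_memory` and `flatten` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_memory(base_url="http://127.0.0.1:18000", timeout=5.0):
    """GET /api/system/memory from a running server and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/system/memory", timeout=timeout) as resp:
        return json.load(resp)


def flatten(obj, prefix=""):
    """Flatten nested dicts into {"outer.inner": value} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    else:
        flat[prefix.rstrip(".")] = obj
    return flat
```

Calling `flatten(fetch_memory())` in a loop with a short sleep gives a rough command-line counterpart to the dashboard gauges.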

Model Management

GET /api/models/available

Lists all cached models available for hot-swapping.

POST /api/models/load

Hot-swap the active model at runtime without restarting the server.
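
The two model-management calls can be scripted as follows. This reference does not document the request body for /api/models/load or the shape of either response, so the sketch leaves the payload entirely to the caller and only assumes that responses are JSON. All helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18000"


def available_models_request(base_url=BASE_URL):
    """Build the GET request that lists cached models."""
    return urllib.request.Request(f"{base_url}/api/models/available", method="GET")


def load_model_request(payload, base_url=BASE_URL):
    """Build the POST request that hot-swaps the active model.

    `payload` is the JSON body. Its fields are not documented here,
    so the caller must supply whatever the server expects.
    """
    return urllib.request.Request(
        f"{base_url}/api/models/load",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=300):
    """Send a prepared request and decode the (assumed) JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

The generous default timeout in `send` reflects the assumption that swapping models may take noticeably longer than an ordinary request.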