API Reference

oBeaver exposes an OpenAI-compatible HTTP API. Point any OpenAI SDK or compatible client at your local server.
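Because the API is OpenAI-compatible, requests can be built with nothing but the standard library. A minimal sketch (the helper names `build_chat_request` and `chat` are illustrative, not part of oBeaver; the address and port match the defaults used in the examples below):

```python
import json
import urllib.request

def build_chat_request(messages, model="Phi-4-mini", **options):
    """Build an OpenAI-style chat completion payload."""
    payload = {"model": model, "messages": messages}
    payload.update(options)  # e.g. max_tokens, temperature, stream
    return payload

def chat(messages, base_url="http://127.0.0.1:18000", **options):
    """POST the payload to the local server and return the parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(messages, **options)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any client that lets you override the base URL (the official OpenAI SDKs do) works the same way.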
Chat Server (`obeaver serve`)

| Method | Path | Description |
|---|---|---|
| GET | `/` | API server info and documentation link |
| GET | `/health` | Health check (includes model name and engine label) |
| GET | `/v1/models` | List the loaded model |
| POST | `/v1/chat/completions` | Chat completions (streaming and non-streaming) |
| GET | `/api/system/memory` | CPU, GPU, NPU, and process memory statistics |
| GET | `/api/models/available` | List all cached models available for switching |
| POST | `/api/models/load` | Hot-swap the active model at runtime |
Embedding Server (`obeaver serve-embed`)

| Method | Path | Description |
|---|---|---|
| GET | `/` | API server info and documentation link |
| GET | `/health` | Health check |
| GET | `/v1/models` | List the loaded model |
| POST | `/v1/embeddings` | Compute text embeddings |
| GET | `/api/system/memory` | CPU, GPU, NPU, and process memory statistics |
Dashboard Server (`obeaver dashboard`)

The web dashboard (chat UI + real-time memory gauges) is only available via `obeaver dashboard`; it is not served by `obeaver serve` or `obeaver serve-embed`.

| Method | Path | Description |
|---|---|---|
| GET | `/` | Redirect to the web dashboard |
| GET | `/health` | Health check (includes model name and engine label) |
| GET | `/v1/models` | List the loaded model |
| POST | `/v1/chat/completions` | Chat completions (streaming and non-streaming) |
| GET | `/api/system/memory` | CPU, GPU, NPU, and process memory statistics |
| GET | `/api/models/available` | List all cached models available for switching |
| POST | `/api/models/load` | Hot-swap the active model at runtime |
| GET | `/static/index.html` | Web dashboard UI with real-time memory gauges and chat |
POST /v1/chat/completions

Create a chat completion. Supports both streaming (SSE) and non-streaming modes.

Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | `""` | Model identifier (informational) |
| `messages` | array | (required) | OpenAI message array (role + content) |
| `stream` | bool | `false` | Enable SSE streaming |
| `max_tokens` | int | `1024` | Maximum number of tokens to generate |
| `temperature` | float | `1.0` | Sampling temperature |
| `top_p` | float | `1.0` | Nucleus (top-p) sampling threshold |
| `top_k` | int | `50` | Top-k sampling |
| `repetition_penalty` | float | `1.0` | Repetition penalty (`1.0` = off) |
| `tools` | array | `null` | OpenAI function definitions |
| `tool_choice` | string \| object | `null` | Tool selection strategy |
Non-streaming Example

```bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is ONNX Runtime?"}
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
```
Streaming Example

```bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'
```
Tool Calling Example

```bash
curl http://127.0.0.1:18000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```
POST /v1/embeddings

Compute text embeddings. Available on the embedding server (`obeaver serve-embed`).

Request Body

| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model identifier (informational) |
| `input` | string \| array | Text string or array of strings to embed |
```bash
curl http://127.0.0.1:18001/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-Embedding-0.6B",
    "input": ["Hello, world!", "Embeddings are useful."]
  }'
```
GET /health

Returns the health status of the server, including the loaded model name and engine type.

```bash
curl http://127.0.0.1:18000/health
```
GET /api/system/memory

Returns CPU, GPU, NPU, and process memory statistics as JSON. This powers the web dashboard's real-time gauges.

```bash
curl http://127.0.0.1:18000/api/system/memory
```
Model Management

GET /api/models/available

Lists all cached models available for hot-swapping.

POST /api/models/load

Hot-swaps the active model at runtime without restarting the server.
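The request schema for `/api/models/load` is not documented in this section. As an illustration only, assuming the body names the target model with a `model` field (an assumption to verify against the server), a request could be built like this:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18000"

def build_load_request(model_name):
    """Build a hot-swap request.

    NOTE: the `model` field name is an assumption for illustration;
    check the server's actual request schema before relying on it.
    """
    return urllib.request.Request(
        f"{BASE_URL}/api/models/load",
        data=json.dumps({"model": model_name}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_load_request("Phi-4-mini")
print(req.get_method(), req.full_url)
```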