Docker

A CPU-only Docker image is provided for linux/amd64 and linux/arm64 (Apple Silicon, AWS Graviton).

Build

bash
# x86_64
docker buildx build --platform=linux/amd64 \
  -f docker/Dockerfile.cpu -t obeaver-cpu .

# arm64 (Apple Silicon / Graviton — compiles ORT from source)
docker buildx build --platform=linux/arm64 \
  -f docker/Dockerfile.cpu -t obeaver-cpu .

Build Arguments

| Build Arg | Default | Description |
| --- | --- | --- |
| PYTHON_VERSION | 3.12 | Python version |
| ORT_GENAI_REF | main | onnxruntime-genai git tag/branch (arm64 source build) |
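
The build arguments above are passed with Docker's `--build-arg` flag. The following is a minimal sketch rather than part of the project: `build_image` and the `DOCKER` dry-run variable are hypothetical helpers, and whether `docker/Dockerfile.cpu` builds cleanly with a non-default `PYTHON_VERSION` such as 3.11 is an assumption.

```shell
# build_image PLATFORM [PYTHON_VERSION] [ORT_GENAI_REF]
# Hypothetical wrapper around the documented build command.
# Unset arguments fall back to the defaults listed in the table.
build_image() {
  local platform="$1" python_version="${2:-3.12}" ort_ref="${3:-main}"
  # Set DOCKER="echo docker" for a dry run that only prints the command.
  ${DOCKER:-docker} buildx build --platform="$platform" \
    --build-arg PYTHON_VERSION="$python_version" \
    --build-arg ORT_GENAI_REF="$ort_ref" \
    -f docker/Dockerfile.cpu -t obeaver-cpu .
}
```

For example, `DOCKER="echo docker" build_image linux/arm64 3.11` prints the full command for review, and `build_image linux/arm64 3.11` runs it.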

Run

Interactive Chat

bash
docker run -it --rm \
  -v /path/to/models/phi3-mini-int4:/models \
  obeaver-cpu run /models -E ort

API Server

bash
docker run -d --rm -p 18000:18000 \
  -v /path/to/models/phi3-mini-int4:/models \
  obeaver-cpu serve /models -E ort --host 0.0.0.0 --port 18000

Verify the server is running:

bash
curl http://localhost:18000/health
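
Because the container starts in detached mode, the first health check may run before the server is ready. The sketch below polls the endpoint until it responds. `wait_for_health` is a hypothetical helper, and it assumes that `/health` returns a successful HTTP status once the server is up, which the source does not state explicitly.

```shell
# wait_for_health URL [RETRIES] [DELAY_SECONDS]
# Polls URL until curl succeeds or the retries run out.
wait_for_health() {
  local url="$1" retries="${2:-30}" delay="${3:-1}" i=0
  while [ "$i" -lt "$retries" ]; do
    # -f: treat HTTP errors (status >= 400) as failures
    # -s: no progress output, -S: still report errors
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "server is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "server did not become healthy after $retries attempts" >&2
  return 1
}
```

A typical call would be `wait_for_health http://localhost:18000/health 30 1`, which gives up after about 30 seconds and returns a non-zero status.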

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| OMP_NUM_THREADS | 4 | OpenMP thread count |
| MKL_NUM_THREADS | 4 | MKL thread count |
| TOKENIZERS_PARALLELISM | false | Disable HF tokenizer parallelism warning |
| OBEAVER_DEFAULT_ENGINE | ort | Default engine when --engine omitted |

Tip: To tune performance, set OMP_NUM_THREADS to the number of CPU cores available to your container.
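
One way to apply this tip is to cap the container's CPUs and pass the same number as the thread count. This is a sketch under stated assumptions: `run_server` and the `DOCKER` dry-run variable are hypothetical, setting MKL_NUM_THREADS to the same value is an assumption the tip does not make, and `--cpus` and `-e` are standard `docker run` flags.

```shell
# run_server CORES MODEL_DIR
# Hypothetical wrapper around the documented "serve" command that keeps
# the CPU limit and the thread-count variables in sync.
run_server() {
  local cores="$1" model_dir="$2"
  # Set DOCKER="echo docker" for a dry run that only prints the command.
  ${DOCKER:-docker} run -d --rm -p 18000:18000 \
    --cpus="$cores" \
    -e OMP_NUM_THREADS="$cores" \
    -e MKL_NUM_THREADS="$cores" \
    -v "$model_dir":/models \
    obeaver-cpu serve /models -E ort --host 0.0.0.0 --port 18000
}
```

For example, `run_server 8 /path/to/models/phi3-mini-int4` starts the API server with eight cores and eight threads.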