Docker
A CPU-only Docker image is provided for linux/amd64 and linux/arm64 (Apple Silicon, AWS Graviton).
Build
```bash
# x86_64
docker buildx build --platform=linux/amd64 \
  -f docker/Dockerfile.cpu -t obeaver-cpu .

# arm64 (Apple Silicon / Graviton; compiles ORT from source)
docker buildx build --platform=linux/arm64 \
  -f docker/Dockerfile.cpu -t obeaver-cpu .
```
Build Arguments
| Build Arg | Default | Description |
|---|---|---|
| `PYTHON_VERSION` | `3.12` | Python version |
| `ORT_GENAI_REF` | `main` | onnxruntime-genai git tag or branch (used by the arm64 source build) |
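
The build arguments in the table can be overridden with Docker's standard `--build-arg` flag. The following is a minimal sketch, not a tested recipe: the value `3.11` is an assumed example (whether the Dockerfile supports it is not confirmed here), and `build_arm64` is a hypothetical helper name used only so the command can be reviewed before it runs.

```bash
# Example values (assumptions for illustration); substitute your own.
PYTHON_VERSION="3.11"   # assumed example; the documented default is 3.12
ORT_GENAI_REF="main"    # documented default; a tag or branch name per the table

# Hypothetical helper: wraps the arm64 build so nothing runs until it is called.
build_arm64() {
  docker buildx build --platform=linux/arm64 \
    --build-arg PYTHON_VERSION="${PYTHON_VERSION}" \
    --build-arg ORT_GENAI_REF="${ORT_GENAI_REF}" \
    -f docker/Dockerfile.cpu -t obeaver-cpu .
}
```

Calling `build_arm64` from the repository root starts the build. According to the table, `ORT_GENAI_REF` only matters for the arm64 source build, so an amd64 build would likely need just `PYTHON_VERSION`.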
Run
Interactive Chat
```bash
docker run -it --rm \
  -v /path/to/models/phi3-mini-int4:/models \
  obeaver-cpu run /models -E ort
```
API Server
```bash
docker run -d --rm -p 18000:18000 \
  -v /path/to/models/phi3-mini-int4:/models \
  obeaver-cpu serve /models -E ort --host 0.0.0.0 --port 18000
```
Verify the server is running:
```bash
curl http://localhost:18000/health
```
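
Because the container starts in the background and the model may take a while to load, a single `curl` can fail simply because the server is not ready yet. The sketch below polls the health endpoint until it answers. It assumes only that `/health` returns an HTTP success status once the server is up (the response body is not inspected); `wait_for_health` is a hypothetical helper, not part of the project.

```bash
# wait_for_health URL [TRIES] [DELAY_SECONDS]
# Polls URL until curl reports an HTTP success status or the tries run out.
wait_for_health() {
  url="$1"
  tries="${2:-30}"
  delay="${3:-1}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f: treat HTTP errors (>= 400) as failure; -s: silent; -o /dev/null: discard the body
    if curl -fs -o /dev/null --max-time 2 "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $tries attempt(s)" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:18000/health 60 2` would retry for roughly two minutes before giving up with a non-zero exit status, which makes it usable in scripts and CI steps.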
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `OMP_NUM_THREADS` | `4` | OpenMP thread count |
| `MKL_NUM_THREADS` | `4` | MKL thread count |
| `TOKENIZERS_PARALLELISM` | `false` | Suppresses the Hugging Face tokenizers parallelism warning |
| `OBEAVER_DEFAULT_ENGINE` | `ort` | Default engine when `--engine` is omitted |
Tip: For performance tuning, set OMP_NUM_THREADS to match the number of CPU cores available to your container.
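
One way to apply this tip is to pass the thread counts with `docker run -e` and cap the container's CPUs to the same number with `--cpus`. The sketch below rests on assumptions: it uses the host's core count from `nproc` (falling back to `getconf`), which may exceed what the Docker daemon can actually grant, for instance inside a Docker Desktop VM; `start_server` is a hypothetical helper name.

```bash
# Host CPU count: nproc (GNU coreutils), with getconf as a fallback (e.g. on macOS).
# Assumption: the Docker daemon can use this many CPUs; lower it if docker rejects --cpus.
CPUS="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"

# Hypothetical helper: starts the API server with thread counts matched to the CPU limit.
start_server() {
  docker run -d --rm -p 18000:18000 \
    --cpus="${CPUS}" \
    -e OMP_NUM_THREADS="${CPUS}" \
    -e MKL_NUM_THREADS="${CPUS}" \
    -v /path/to/models/phi3-mini-int4:/models \
    obeaver-cpu serve /models -E ort --host 0.0.0.0 --port 18000
}
```

Running `start_server` launches the container. Keeping the thread count equal to the CPU limit is a common starting point; a count higher than the cores actually available usually causes contention rather than a speedup.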