# Volcano scheduler config tuned for AKS cluster autoscaler scale-from-zero.
#
# The AzureML Kubernetes extension installs Volcano with the `overcommit` and
# `proportion` plugins in the third tier. Both implement `JobEnqueueable`, which
# the `enqueue` action calls to decide whether a PodGroup may transition from
# Pending to Inqueue. Their decision is based on currently-Ready node capacity
# only (proportion: queue.Allocated + queue.Free, overcommit: total * factor),
# so on a cluster whose GPU node pools sit at count=0 they always return false:
# no PodGroup is enqueued, Volcano never creates the underlying Pod, no Pending
# Pod is visible to the AKS cluster autoscaler, and the GPU pool is never
# scaled up (a self-induced deadlock).
#
# Removing both plugins makes `enqueue` permissive (default Permit when no
# plugin objects), so Volcano creates the Pod immediately. The Pod is Pending
# on a missing nvidia.com/gpu node, the autoscaler scales the pool from 0 to 1,
# and once the node is Ready the `allocate` action binds the Pod. Gang
# scheduling (the `gang` plugin in the second tier) still gates `allocate`, so
# multi-pod jobs continue to wait for minAvailable before any task starts.
#
# Trade-off: queue-level capacity fairness across multiple PodGroups is no
# longer enforced at enqueue time. Acceptable on single-tenant dev/training
# clusters; re-enable proportion/overcommit on multi-tenant production clusters
# (pass --enforce-volcano-capacity-check to 02-deploy-azureml-extension.sh).
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: priority
    enableJobStarving: false
  - name: conformance
- plugins:
  - name: gang
  - name: drf
    enablePreemptable: false
- plugins:
  - name: predicates
  - name: nodeorder
  - name: binpack
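
# Sketch only (an assumption, not the extension's verbatim config): to restore
# the enqueue-time capacity check by hand instead of through the deploy script
# flag, the last tier of this file would list both plugins again. The plugin
# order shown here is illustrative and may differ from what the extension
# actually installs.
#
# - plugins:
#   - name: overcommit
#   - name: predicates
#   - name: proportion
#   - name: nodeorder
#   - name: binpack
#
# With these two plugins present, scale-from-zero GPU pools can deadlock again
# as described at the top of this file.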