Supported commands: run fab job -h for the full list.
Job instances: run, start, run-list, run-status, run-cancel.
Job scheduling: run-sch, run-update, plus run-list and run-status with the --schedule flag.
Run jobs synchronously, waiting for completion. You can specify the CLI wait timeout in seconds using the --timeout flag.
fab:/$ job run ws1.Workspace/nb1.Notebook
fab:/$ job run ws1.Workspace/pip1.DataPipeline
# Run a job with a 120 second (2 minute) timeout
# (The job will be cancelled by default when timeout is reached)
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition --timeout 120
# Run a job with a 120 second (2 minute) timeout
# (The job will continue running even if timeout is reached)
fab:/$ config set job_cancel_ontimeout false
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition --timeout 120
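Because job_cancel_ontimeout is set with config set, it persists across runs rather than applying to a single command. A minimal sketch of switching it back, assuming true restores the default cancel-on-timeout behavior described above:
# restore the default behavior (cancel the job when the timeout is reached)
fab:/$ config set job_cancel_ontimeout true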
Run jobs asynchronously, without waiting for completion.
fab:/$ job start ws1.Workspace/nb1.Notebook
fab:/$ job start ws1.Workspace/pip1.DataPipeline
fab:/$ job start ws1.Workspace/sjd1.SparkJobDefinition
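A common follow-up to an asynchronous start is to poll the run with job run-status (covered below). A minimal sketch, where <job_instance_id> is a placeholder for the instance ID reported when the job was started:
# check on an asynchronously started notebook job
fab:/$ job run-status ws1.Workspace/nb1.Notebook --id <job_instance_id>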
Run .Notebook or .DataPipeline jobs using parameters -P.
Supported parameter types:
.Notebook: string, int, float and bool
.DataPipeline: string, int, float, bool, object, array and secureString
# notebook with custom params
fab:/$ job run ws1.Workspace/nb1.Notebook -P string_param:string=new_value,int_param:int=10,float_param:float=0.1234,bool_param:bool=true
# pipeline with custom params
fab:/$ job run ws1.Workspace/pip1.DataPipeline -P string_param:string=new_value,int_param:int=10,float_param:float=0.1234,bool_param:bool=true
# pipeline with custom params (object, array and secureString)
fab:/$ job run ws1.Workspace/pip1.DataPipeline -P obj_param:object={"key":{"key":2},"key":"value"},array_param:array=[1,2,3],secstr_param:secureString=secret
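Parameters work the same way for asynchronous runs. A minimal sketch combining job start with -P, using the same name:type=value syntax as above:
# start a notebook asynchronously with custom params
fab:/$ job start ws1.Workspace/nb1.Notebook -P string_param:string=new_value,bool_param:bool=false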
Run synchronous or asynchronous notebook jobs with optional config -C, parameters -P, or input -i. -i cannot be used with either -C or -P.
# async passing configuration using JSON file
fab:/$ job start ws1.Workspace/nb1.Notebook -C ./config_file.json
# async passing configuration using inline JSON
fab:/$ job start ws1.Workspace/nb1.Notebook -C { "conf":{"spark.conf1": "value"}, "environment": { "id": "<environment_id>", "name": "<environment_name>" } }
# sync passing configuration using inline JSON
fab:/$ job run ws1.Workspace/nb1.Notebook -C { "defaultLakehouse": { "name": "<lakehouse-name>", "id": "<lakehouse-id>", "workspaceId": "<(optional) workspace-id-that-contains-the-lakehouse>" }}
# async passing configuration using inline JSON
fab:/$ job start ws1.Workspace/nb1.Notebook -C { "defaultLakehouse": { "name": "<lakehouse-name>", "id": "<lakehouse-id>" }, "useStarterPool": false, "useWorkspacePool": "<workspace-pool-name>" }
# sync passing inline configuration and parameters
fab:/$ job run ws1.Workspace/nb1.Notebook -P string_param:string=new_value -C {"environment": { "id": "<environment_id>", "name": "<environment_name>" }}
# sync using inline JSON (raw)
fab:/$ job run ws1.Workspace/nb1.Notebook -i {"parameters": {"string_param": {"type": "string", "value": "new_value"}}, "configuration": {"conf":{"spark.conf1": "value"}}}
# sync using JSON file (raw)
fab:/$ job run ws1.Workspace/nb1.Notebook -i ./input_file.json
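For the file-based variants, the file holds the same JSON that would otherwise be passed inline. A minimal sketch of what ./config_file.json could contain, mirroring the inline -C example above (the environment id and name are placeholders):
# example contents of ./config_file.json
{
  "conf": { "spark.conf1": "value" },
  "environment": { "id": "<environment_id>", "name": "<environment_name>" }
}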
Explore notebook run payload options.
Run synchronous or asynchronous pipeline jobs with optional parameters -P or input -i. -i cannot be used with -P.
# async using inline JSON (raw)
fab:/$ job start ws1.Workspace/pip1.DataPipeline -i {"parameters": {"string_param": "new_value", "int_param": 10}}
# sync using inline JSON (raw)
fab:/$ job run ws1.Workspace/pip1.DataPipeline -i {"parameters": {"float_param": 0.1234, "bool_param": true, "obj_param": {"key": "value"}, "array_param": [1, 2, 3], "secstr_param": "secret"}}
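Asynchronous pipeline runs accept -P as well. A minimal sketch using parameter types listed earlier:
# start a pipeline asynchronously with custom params
fab:/$ job start ws1.Workspace/pip1.DataPipeline -P string_param:string=new_value,array_param:array=[1,2,3]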
Explore pipeline run payload options.
Run synchronous or asynchronous Spark Job Definition jobs with input -i.
# sync SJD job run
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition
# async SJD job run with payload definition
fab:/$ job start ws1.Workspace/sjd1.SparkJobDefinition -i { "commandLineArguments": "param01 TEST param02 1234", "environmentArtifactId": "<environment-id>", "defaultLakehouseArtifactId": "<lakehouse-id>", "additionalLakehouseIds": ["<lakehouse-id>"] }
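The same kind of payload can also be passed to a synchronous run. A minimal sketch that waits for completion, with the lakehouse ID as a placeholder:
# sync SJD job run with payload definition
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition -i { "commandLineArguments": "param01 TEST param02 1234", "defaultLakehouseArtifactId": "<lakehouse-id>" }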
Run synchronous or asynchronous table jobs with input -i.
# run vorder, zorder and vacuum
fab:/$ job run ws1.Workspace/lh1.Lakehouse -i {'tableName': 'orders', 'optimizeSettings': {'vOrder': true, 'zOrderBy': ['account_id']}, 'vacuumSettings': {'retentionPeriod': '7.01:00:00'}}
# run vorder and zorder
fab:/$ job run ws1.Workspace/lh1.Lakehouse -i {'tableName': 'orders', 'optimizeSettings': {'vOrder': true, 'zOrderBy': ['account_id']}}
# run vacuum
fab:/$ job run ws1.Workspace/lh1.Lakehouse -i {'tableName': 'orders', 'vacuumSettings': {'retentionPeriod': '7.01:00:00'}}
Note: see Managing Tables for friendly options (recommended).
List job runs for an item.
Supported items:
.Notebook
.DataPipeline
.SparkJobDefinition
.Lakehouse
fab:/$ job run-list ws1.Workspace/nb1.Notebook
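The same command applies to the other supported item types, for example (sketches reusing item names from earlier sections):
fab:/$ job run-list ws1.Workspace/pip1.DataPipeline
fab:/$ job run-list ws1.Workspace/lh1.Lakehouse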
Get job instance status for an item with --id.
fab:/$ job run-status ws1.Workspace/nb1.Notebook --id 3cf84ce6-3706-4017-a68f-e26b9ca3238c
Cancel a job instance for an item with --id.
fab:/$ job run-cancel ws1.Workspace/nb1.Notebook --id 3cf84ce6-3706-4017-a68f-e26b9ca32300
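A typical sequence is to start a run asynchronously and cancel it using the instance ID it reports. A minimal sketch, with <job_instance_id> as a placeholder:
# start a pipeline run, then cancel it by instance ID
fab:/$ job start ws1.Workspace/pip1.DataPipeline
fab:/$ job run-cancel ws1.Workspace/pip1.DataPipeline --id <job_instance_id>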
Schedule a job for an item (default: disabled).
# run every 10 minutes and enable it
fab:/$ job run-sch pip1.DataPipeline --type cron --interval 10 --start 2024-11-15T09:00:00 --end 2024-12-15T10:00:00 --enable
# run every day at 10:00 and 16:00 (disabled by default)
fab:/$ job run-sch pip1.DataPipeline --type daily --interval 10:00,16:00 --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00
# run every Monday and Friday at 10:00 and 16:00, disabled by default
fab:/$ job run-sch pip1.DataPipeline --type weekly --interval 10:00,16:00 --days Monday,Friday --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00
# set up pipeline schedule with custom input
fab:/$ job run-sch pip1.DataPipeline -i {'enabled': true, 'configuration': {'startDateTime': '2024-04-28T00:00:00', 'endDateTime': '2024-04-30T23:59:00', 'localTimeZoneId': 'Central Standard Time', 'type': 'Cron', 'interval': 10}}
Update a job schedule for an item.
# disable pipeline schedule
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --disable
# update pipeline schedule to run every 10 minutes and enable it
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --type cron --interval 10 --start 2024-11-15T09:00:00 --end 2024-12-15T10:00:00 --enable
# update pipeline schedule to run every day at 10:00 and 16:00 (maintains the existing enabled state)
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --type daily --interval 10:00,16:00 --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00
# update pipeline schedule to run every Monday and Friday at 10:00 and 16:00 and enable it
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --type weekly --interval 10:00,16:00 --days Monday,Friday --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00 --enable
# update pipeline schedule with custom input
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> -i {'enabled': true, 'configuration': {'startDateTime': '2024-04-28T00:00:00', 'endDateTime': '2024-04-30T23:59:00', 'localTimeZoneId': 'Central Standard Time', 'type': 'Cron', 'interval': 10}}
List scheduled job runs with the --schedule flag.
fab:/$ job run-list ws1.Workspace/nb1.Notebook --schedule
Get the status of a scheduled job with the schedule ID --id and the --schedule flag.
fab:/$ job run-status ws1.Workspace/nb1.Notebook --id 2cf34ce6-3706-4347-a68f-e26b9ca3567n --schedule
To disable a job schedule, issue a run-update command with the schedule ID --id and the --disable flag.
# disable pipeline schedule
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --disable
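If the schedule ID is not at hand, one way to find it is to list the item's schedules first and then disable the relevant one. A minimal sketch combining commands shown above:
# look up the schedule id, then disable that schedule
fab:/$ job run-list pip1.DataPipeline --schedule
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --disable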