fabric-cli

job examples

Supported commands: fab job -h

Job instances: job run, job start, job run-list, job run-status, job run-cancel

Job scheduling: job run-sch, job run-update, and run-list / run-status with the --schedule flag

Running jobs synchronously

Run jobs synchronously, waiting for completion. You can specify the CLI wait timeout in seconds using the --timeout flag.

fab:/$ job run ws1.Workspace/nb1.Notebook 
fab:/$ job run ws1.Workspace/pip1.DataPipeline

# Run a job with a 120 second (2 minute) timeout
# (The job will be cancelled by default when timeout is reached)
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition --timeout 120

# Run a job with a 120 second (2 minute) timeout
# (The job will continue running even if timeout is reached)
fab:/$ config set job_cancel_ontimeout false
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition --timeout 120
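
To confirm how a timeout will be handled before launching a run, the setting can be read back with config get (assuming that command is available in your fabric-cli version):

# check whether jobs are cancelled when the timeout is reached
fab:/$ config get job_cancel_ontimeout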

Running jobs asynchronously

Run jobs asynchronously, without waiting for completion.

fab:/$ job start ws1.Workspace/nb1.Notebook 
fab:/$ job start ws1.Workspace/pip1.DataPipeline
fab:/$ job start ws1.Workspace/sjd1.SparkJobDefinition
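
An asynchronous run can be checked later with the run-list and run-status commands covered below; the id below is a placeholder for the job instance id of the started run:

# check on an asynchronous run later
fab:/$ job run-status ws1.Workspace/nb1.Notebook --id <job_instance_id>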

Running jobs with parameters

Run .Notebook or .DataPipeline jobs with custom parameters using the -P flag.

Supported parameter types: string, int, float, bool, object, array and secureString (the last three are shown for pipelines below).

# notebook with custom params
fab:/$ job run ws1.Workspace/nb1.Notebook -P string_param:string=new_value,int_param:int=10,float_param:float=0.1234,bool_param:bool=true

# pipeline with custom params
fab:/$ job run ws1.Workspace/pip1.DataPipeline -P string_param:string=new_value,int_param:int=10,float_param:float=0.1234,bool_param:bool=true

# pipeline with custom params (object, array and secureString)
fab:/$ job run ws1.Workspace/pip1.DataPipeline -P obj_param:object={"key1":{"key2":2},"key3":"value"},array_param:array=[1,2,3],secstr_param:secureString=secret
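
For reference, the -P shorthand carries the same parameters that the raw -i input accepts (see the raw examples in the sections below); a sketch of the raw equivalent of the first pipeline -P example above:

# hypothetical raw -i equivalent of the pipeline -P example
fab:/$ job run ws1.Workspace/pip1.DataPipeline -i {"parameters": {"string_param": "new_value", "int_param": 10, "float_param": 0.1234, "bool_param": true}}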

Running Notebooks

Run synchronous or asynchronous notebook jobs with optional config -C, parameters -P, or input -i. -i cannot be used with either -C or -P.

# async passing configuration using JSON file
fab:/$ job start ws1.Workspace/nb1.Notebook -C ./config_file.json

# async passing configuration using inline JSON
fab:/$ job start ws1.Workspace/nb1.Notebook -C { "conf":{"spark.conf1": "value"}, "environment": { "id": "<environment_id>", "name": "<environment_name>" } }

# sync passing configuration using inline JSON
fab:/$ job run ws1.Workspace/nb1.Notebook -C { "defaultLakehouse": { "name": "<lakehouse-name>", "id": "<lakehouse-id>", "workspaceId": "<(optional) workspace-id-that-contains-the-lakehouse>" }}

# async passing configuration using inline JSON
fab:/$ job start ws1.Workspace/nb1.Notebook -C { "defaultLakehouse": { "name": "<lakehouse-name>", "id": "<lakehouse-id>" }, "useStarterPool": false, "useWorkspacePool": "<workspace-pool-name>" }

# sync passing inline configuration and parameters
fab:/$ job run ws1.Workspace/nb1.Notebook -P string_param:string=new_value -C {"environment": { "id": "<environment_id>", "name": "<environment_name>" }}

# sync using inline JSON (raw)
fab:/$ job run ws1.Workspace/nb1.Notebook -i {"parameters": {"string_param": {"type": "string", "value": "new_value"}}, "configuration": {"conf":{"spark.conf1": "value"}}}

# sync using JSON file (raw)
fab:/$ job run ws1.Workspace/nb1.Notebook -i ./input_file.json
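
The files referenced above are plain JSON holding the same payloads as the inline examples; a minimal sketch of what ./config_file.json and ./input_file.json might contain (contents are illustrative, mirroring the inline examples):

# hypothetical ./config_file.json
{ "conf": {"spark.conf1": "value"}, "environment": { "id": "<environment_id>", "name": "<environment_name>" } }

# hypothetical ./input_file.json
{"parameters": {"string_param": {"type": "string", "value": "new_value"}}, "configuration": {"conf": {"spark.conf1": "value"}}}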

Explore notebook run payload options.

Running Pipelines

Run synchronous or asynchronous pipeline jobs with optional parameters -P, or input -i. -i cannot be used with -P.

# async using inline JSON (raw)
fab:/$ job start ws1.Workspace/pip1.DataPipeline -i {"parameters": {"string_param": "new_value", "int_param": 10}}

# sync using inline JSON (raw)
fab:/$ job run ws1.Workspace/pip1.DataPipeline -i {"parameters": {"float_param": 0.1234, "bool_param": true, "obj_param": {"key": "value"}, "array_param": [1, 2, 3], "secstr_param": "secret"}}
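
Assuming -i accepts a file path for pipelines the same way it does for notebooks above, the same raw payload can be kept in a JSON file (file name is hypothetical):

# sync using JSON file (raw), assuming file input works as it does for notebooks
fab:/$ job run ws1.Workspace/pip1.DataPipeline -i ./pipeline_input.json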

Explore pipeline run payload options.

Running Spark Job Definitions

Run synchronous or asynchronous Spark Job Definition jobs with input -i.

# sync SJD job run
fab:/$ job run ws1.Workspace/sjd1.SparkJobDefinition

# async SJD job run with payload definition
fab:/$ job start ws1.Workspace/sjd1.SparkJobDefinition -i { "commandLineArguments": "param01 TEST param02 1234", "environmentArtifactId": "<environment-id>", "defaultLakehouseArtifactId": "<lakehouse-id>", "additionalLakehouseIds": ["<lakehouse-id>"] }

Running Table Maintenance jobs

Run synchronous or asynchronous table maintenance jobs with input -i.

# run vorder, zorder and vacuum
fab:/$ job run ws1.Workspace/lh1.Lakehouse -i {"tableName": "orders", "optimizeSettings": {"vOrder": true, "zOrderBy": ["account_id"]}, "vacuumSettings": {"retentionPeriod": "7.01:00:00"}}

# run vorder and zorder
fab:/$ job run ws1.Workspace/lh1.Lakehouse -i {"tableName": "orders", "optimizeSettings": {"vOrder": true, "zOrderBy": ["account_id"]}}

# run vacuum (a retentionPeriod of 7.01:00:00 is 7 days and 1 hour)
fab:/$ job run ws1.Workspace/lh1.Lakehouse -i {"tableName": "orders", "vacuumSettings": {"retentionPeriod": "7.01:00:00"}}

Note: see Managing Tables for friendly options (recommended).

Listing job runs

List job runs for an item.

Supported items: .Notebook, .DataPipeline, .SparkJobDefinition and .Lakehouse (table maintenance).

fab:/$ job run-list ws1.Workspace/nb1.Notebook

Getting job instance status

Get job instance status for an item with --id.

fab:/$ job run-status ws1.Workspace/nb1.Notebook --id 3cf84ce6-3706-4017-a68f-e26b9ca3238c

Cancelling a job instance

Cancel a job instance for an item with --id.

fab:/$ job run-cancel ws1.Workspace/nb1.Notebook --id 3cf84ce6-3706-4017-a68f-e26b9ca32300

Scheduling a job

Schedule a job for an item (disabled by default).

# run every 10 minutes and enable it
fab:/$ job run-sch pip1.DataPipeline --type cron --interval 10 --start 2024-11-15T09:00:00 --end 2024-12-15T10:00:00 --enable

# run every day at 10:00 and 16:00 (disabled by default)
fab:/$ job run-sch pip1.DataPipeline --type daily --interval 10:00,16:00 --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00

# run every Monday and Friday at 10:00 and 16:00, disabled by default
fab:/$ job run-sch pip1.DataPipeline --type weekly --interval 10:00,16:00 --days Monday,Friday --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00

# set up pipeline schedule with custom input
fab:/$ job run-sch pip1.DataPipeline -i {"enabled": true, "configuration": {"startDateTime": "2024-04-28T00:00:00", "endDateTime": "2024-04-30T23:59:00", "localTimeZoneId": "Central Standard Time", "type": "Cron", "interval": 10}}
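
Daily and weekly schedules can also be expressed as custom input; a sketch assuming the Fabric job scheduler configuration shape (the times and weekdays field names are an assumption, verify against the scheduler API):

# hypothetical custom input for a weekly schedule
fab:/$ job run-sch pip1.DataPipeline -i {"enabled": true, "configuration": {"startDateTime": "2024-11-15T09:00:00", "endDateTime": "2024-12-16T10:00:00", "localTimeZoneId": "Central Standard Time", "type": "Weekly", "times": ["10:00", "16:00"], "weekdays": ["Monday", "Friday"]}}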

Updating a job schedule

Update a job schedule for an item.

# disable pipeline schedule
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --disable

# update pipeline schedule to run every 10 minutes and enable it
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --type cron --interval 10 --start 2024-11-15T09:00:00 --end 2024-12-15T10:00:00 --enable

# update pipeline schedule to run every day at 10:00 and 16:00 (maintains the existing enabled state)
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --type daily --interval 10:00,16:00 --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00

# update pipeline schedule to run every Monday and Friday at 10:00 and 16:00 and enable it
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --type weekly --interval 10:00,16:00 --days Monday,Friday --start 2024-11-15T09:00:00 --end 2024-12-16T10:00:00 --enable

# update pipeline schedule with custom input
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> -i {"enabled": true, "configuration": {"startDateTime": "2024-04-28T00:00:00", "endDateTime": "2024-04-30T23:59:00", "localTimeZoneId": "Central Standard Time", "type": "Cron", "interval": 10}}

Listing scheduled job runs

List scheduled job runs with the --schedule flag.

fab:/$ job run-list ws1.Workspace/nb1.Notebook --schedule

Getting job schedule status

Get the status of a scheduled job with the schedule id --id and the --schedule flag.

fab:/$ job run-status ws1.Workspace/nb1.Notebook --id 2cf34ce6-3706-4347-a68f-e26b9ca3567n --schedule

Disabling a job schedule

To disable a job schedule, issue a run-update command with the schedule id --id and the --disable flag.

# disable pipeline schedule
fab:/$ job run-update pip1.DataPipeline --id <schedule_id> --disable

See all examples