Add-DatabricksJarJob
NAME Add-DatabricksJarJob
SYNOPSIS
Creates a Jar job in Databricks. The script uses the Databricks API 2.0 create job endpoint:
https://docs.azuredatabricks.net/api/la ... tml#create
SYNTAX
Add-DatabricksJarJob [[-BearerToken] <String>] [[-Region] <String>] [-JobName] <String> [[-ClusterId] <String>]
[[-SparkVersion] <String>] [[-NodeType] <String>] [[-DriverNodeType] <String>] [[-MinNumberOfWorkers] <Int32>]
[[-MaxNumberOfWorkers] <Int32>] [[-Timeout] <Int32>] [[-MaxRetries] <Int32>] [[-ScheduleCronExpression] <String>]
[[-Timezone] <String>] [-JarPath] <String> [-JarMainClass] <String> [[-JarParameters] <String[]>] [[-Libraries]
<String[]>] [[-Spark_conf] <Hashtable>] [[-CustomTags] <Hashtable>] [[-InitScripts] <String[]>] [[-SparkEnvVars]
<Hashtable>] [[-ClusterLogPath] <String>] [[-InstancePoolId] <String>] [<CommonParameters>]
DESCRIPTION
Creates a Jar job in Databricks. The script uses the Databricks API 2.0 create job endpoint:
https://docs.azuredatabricks.net/api/la ... tml#create
If a job with this name already exists, it will be updated instead of a new job being created.
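As a minimal sketch, the job can be attached to an existing cluster via -ClusterId; the token, region, and cluster id below are placeholders, not real values. When -ClusterId is supplied, the cluster sizing parameters (SparkVersion, NodeType, worker counts) are ignored:

```powershell
# Placeholder values - substitute your own workspace token, region and cluster id.
$BearerToken = "dapi0000000000000000"
$Region      = "northeurope"

# Reuse an existing cluster instead of creating a new one for each run.
Add-DatabricksJarJob -BearerToken $BearerToken -Region $Region `
    -JobName "MyJarJob" `
    -ClusterId "0301-123456-abcd123" `
    -JarPath "folder/Test.jar" `
    -JarMainClass "com.test.me"
```

If a job named "MyJarJob" already exists, it is updated in place rather than duplicated.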
PARAMETERS
-BearerToken <String>
Your Databricks Bearer token to authenticate to your workspace (see User Settings in the Databricks WebUI)
Required? false
Position? 1
Default value
Accept pipeline input? false
Accept wildcard characters? false
-Region <String>
Azure Region - must match the URL of your Databricks workspace, example: northeurope
Required? false
Position? 2
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobName <String>
Name of the job that will appear in the Job list. If a job with this name exists
it will be updated.
Required? true
Position? 3
Default value
Accept pipeline input? false
Accept wildcard characters? false
-ClusterId <String>
The ClusterId of an existing cluster to use. Optional.
Required? false
Position? 4
Default value
Accept pipeline input? false
Accept wildcard characters? false
-SparkVersion <String>
Spark version for cluster that will run the job. Example: 5.3.x-scala2.11
Note: Ignored if ClusterId is populated.
Required? false
Position? 5
Default value
Accept pipeline input? false
Accept wildcard characters? false
-NodeType <String>
Type of worker for cluster that will run the job. Example: Standard_D3_v2.
Note: Ignored if ClusterId is populated.
Required? false
Position? 6
Default value
Accept pipeline input? false
Accept wildcard characters? false
-DriverNodeType <String>
Type of driver for cluster that will run the job. Example: Standard_D3_v2.
If not provided the NodeType will be used.
Note: Ignored if ClusterId is populated.
Required? false
Position? 7
Default value
Accept pipeline input? false
Accept wildcard characters? false
-MinNumberOfWorkers <Int32>
Minimum number of workers for the cluster that will run the job.
Note: If Min & Max Workers are the same autoscale is disabled.
Note: Ignored if ClusterId is populated.
Required? false
Position? 8
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
-MaxNumberOfWorkers <Int32>
Maximum number of workers for the cluster that will run the job.
Note: If Min & Max Workers are the same autoscale is disabled.
Note: Ignored if ClusterId is populated.
Required? false
Position? 9
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
-Timeout <Int32>
Timeout, in seconds, applied to each run of the job. If not set, there will be no timeout.
Required? false
Position? 10
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
-MaxRetries <Int32>
An optional maximum number of times to retry an unsuccessful run. A run is considered to be unsuccessful if it
completes with a FAILED result_state or INTERNAL_ERROR life_cycle_state. The value -1 means to retry
indefinitely and the value 0 means to never retry. If not set, the default behavior will be never retry.
Required? false
Position? 11
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
-ScheduleCronExpression <String>
By default, the job runs only when triggered via the Jobs UI or an API request. You can provide a cron
schedule expression to run the job periodically. How to compose a cron schedule expression:
http://www.quartz-scheduler.org/documen ... on-06.html
Required? false
Position? 12
Default value
Accept pipeline input? false
Accept wildcard characters? false
-Timezone <String>
Timezone for the cron schedule expression. Required if ScheduleCronExpression is provided. See here for all possible
timezones: http://joda-time.sourceforge.net/timezones.html
Example: UTC
Required? false
Position? 13
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JarPath <String>
Path to the Jar in Databricks that will be executed by this Job. Path is relative to dbfs:/FileStore/job-jars
Required? true
Position? 14
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JarMainClass <String>
Class within Jar to execute. Example "org.apache.spark.examples.SparkPi"
Required? true
Position? 15
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JarParameters <String[]>
Optional parameters passed to the main class when the job is executed. Example: "val1", "val2"
Required? false
Position? 16
Default value
Accept pipeline input? false
Accept wildcard characters? false
-Libraries <String[]>
Optional. Array of JSON strings. Example: '{"pypi":{"package":"simplejson"}}', '{"jar":
"dbfs:/mylibraries/test.jar"}'
Required? false
Position? 17
Default value
Accept pipeline input? false
Accept wildcard characters? false
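The JSON strings for -Libraries can be built from hashtables with ConvertTo-Json rather than written by hand, which avoids quoting mistakes; the package name and jar path below are illustrative only:

```powershell
# Build well-formed library specs; ConvertTo-Json handles the quoting.
$libraries = @(
    (@{ pypi = @{ package = "simplejson" } } | ConvertTo-Json -Compress -Depth 3),
    (@{ jar = "dbfs:/mylibraries/test.jar" } | ConvertTo-Json -Compress)
)
# $libraries[0] is '{"pypi":{"package":"simplejson"}}'
# $libraries[1] is '{"jar":"dbfs:/mylibraries/test.jar"}'
```

The resulting array can then be passed directly to -Libraries.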
-Spark_conf <Hashtable>
Hashtable of Spark configuration key-value pairs for the cluster.
Example @{"spark.speculation"=$true; "spark.streaming.ui.retainedBatches"= 5}
Required? false
Position? 18
Default value
Accept pipeline input? false
Accept wildcard characters? false
-CustomTags <Hashtable>
Custom Tags to set, provide hash table of tags. Example: @{CreatedBy="SimonDM";NumOfNodes=2;CanDelete=$true}
Required? false
Position? 19
Default value
Accept pipeline input? false
Accept wildcard characters? false
-InitScripts <String[]>
Init scripts to run after cluster creation. Example: "dbfs:/script/script1", "dbfs:/script/script2"
Required? false
Position? 20
Default value
Accept pipeline input? false
Accept wildcard characters? false
-SparkEnvVars <Hashtable>
An object containing a set of optional, user-specified environment variable key-value pairs. Key-value pairs
of the form (X,Y) are exported as is (i.e., export X='Y') while launching the driver and workers.
Example: @{SPARK_WORKER_MEMORY="29000m";SPARK_LOCAL_DIRS="/local_disk0"}
Required? false
Position? 21
Default value
Accept pipeline input? false
Accept wildcard characters? false
-ClusterLogPath <String>
DBFS Location for Cluster logs - must start with dbfs:/
Example dbfs:/logs/mycluster
Required? false
Position? 22
Default value
Accept pipeline input? false
Accept wildcard characters? false
-InstancePoolId <String>
Optional. The ID of an existing instance pool in which to run the job cluster.
Required? false
Position? 23
Default value
Accept pipeline input? false
Accept wildcard characters? false
<CommonParameters>
This cmdlet supports the common parameters: Verbose, Debug,
ErrorAction, ErrorVariable, WarningAction, WarningVariable,
OutBuffer, PipelineVariable, and OutVariable. For more information, see
about_CommonParameters (https://go.microsoft.com/fwlink/?LinkID=113216).
INPUTS
OUTPUTS
NOTES
Author: Simon D'Morias / Data Thirst Ltd
-------------------------- EXAMPLE 1 --------------------------
PS C:\>Add-DatabricksJarJob -BearerToken $BearerToken -Region $Region -JobName "Job1" -SparkVersion
"5.3.x-scala2.11" -NodeType "Standard_D3_v2" -MinNumberOfWorkers 2 -MaxNumberOfWorkers 2 -Timeout 100 -MaxRetries
3 -ScheduleCronExpression "0 15 22 ? * *" -Timezone "UTC" -JarPath "folder/Test.jar" -JarMainClass 'com.test.me'
-JarParameters "val1", "val2" -Libraries '{"jar": "dbfs:/mylibraries/test.jar"}'
The above example creates a job on a new cluster.
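A second, hypothetical example (the pool id, tags, and log path are placeholders): running the job in an instance pool with cluster logs shipped to DBFS. Because MinNumberOfWorkers and MaxNumberOfWorkers differ, autoscale is enabled:

```powershell
Add-DatabricksJarJob -BearerToken $BearerToken -Region $Region `
    -JobName "PooledJarJob" `
    -InstancePoolId "0123-456789-pool1" `
    -MinNumberOfWorkers 2 -MaxNumberOfWorkers 8 `
    -JarPath "folder/Test.jar" -JarMainClass "com.test.me" `
    -CustomTags @{ CreatedBy = "SimonDM"; CanDelete = $true } `
    -ClusterLogPath "dbfs:/logs/pooledjarjob"
```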
RELATED LINKS