NAME
New-AzureDatabricksJob
SYNOPSIS
Dynamically create a job on an Azure Databricks cluster. Returns an object defining the job and the newly assigned
job ID number.
SYNTAX
New-AzureDatabricksJob -Connection <Object> -JobName <String> -JobType <String> -NotebookPath <String>
[-JobParameters <Hashtable>] [-JobLibraries <Hashtable>] [-UseExistingCluster <String>] [-NodeType <String>]
[-NumWorkers <Int32>] [-SparkVersion <String>] [<CommonParameters>]
DESCRIPTION
You can use this function to create a new defined job on your Azure Databricks cluster. Currently only
Notebook-based jobs are supported. You can also dynamically pass in
libraries to use in the job, as well as pre-defined parameters. Other non-required options let you change the
cluster node and driver types and the total number
of worker nodes, or use an existing defined cluster.
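Because the function returns an object describing the new job, you can capture it to reference the assigned job ID later. A minimal sketch; the job_id property name is an assumption based on the Databricks Jobs API response shape, so confirm it against the object the function actually returns:
$Job = New-AzureDatabricksJob -Connection $Connection -JobName "Nightly ETL" -JobType Notebook -NotebookPath "/Users/Drew/SomeNotebook"
$Job.job_id   # newly assigned job ID (property name assumed from the Databricks Jobs API)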
PARAMETERS
-Connection <Object>
An object that represents an Azure Databricks API connection where you want to create your job.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobName <String>
The name of the new job.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobType <String>
The type of job to run. Currently only supports "Notebook" job types.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-NotebookPath <String>
The path on your Azure Databricks instance where your job's notebook resides.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobParameters <Hashtable>
The parameters to pass into your notebook, supplied as a hashtable (see notes).
Required? false
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobLibraries <Hashtable>
Libraries to attach to the job, supplied as a hashtable of pypi or egg entries (see notes).
Required? false
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-UseExistingCluster <String>
If you want this job to use a predefined Azure Databricks cluster, specify a named cluster here.
Required? false
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-NodeType <String>
For dynamic job clusters, the node type to use (defaults to Standard_DS3_v2).
Required? false
Position? named
Default value Standard_DS3_v2
Accept pipeline input? false
Accept wildcard characters? false
-NumWorkers <Int32>
For dynamic job clusters, the maximum number of workers (defaults to 4).
Required? false
Position? named
Default value 4
Accept pipeline input? false
Accept wildcard characters? false
-SparkVersion <String>
The Spark runtime version the dynamic cluster should use (defaults to 4.2.x-scala2.11).
Required? false
Position? named
Default value 4.2.x-scala2.11
Accept pipeline input? false
Accept wildcard characters? false
<CommonParameters>
This cmdlet supports the common parameters: Verbose, Debug,
ErrorAction, ErrorVariable, WarningAction, WarningVariable,
OutBuffer, PipelineVariable, and OutVariable. For more information, see
about_CommonParameters (https://go.microsoft.com/fwlink/?LinkID=113216).
INPUTS
OUTPUTS
NOTES
A sample of the hashtables needed for this function:

$JobLibraries = @{
    'pypi' = 'simplejson=3.8.0'
}

Each entry in your hashtable should be of type pypi or egg. If egg, specify the path to the egg.

$Parameters = @{
    'Param1' = 'X'
    'Param2' = 2
}

Each entry in your hashtable should be a key/value pair: the name of the parameter in your notebook and the value
you want to pass in.
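Putting the pieces together, here is a minimal sketch of a job that installs libraries and receives parameters on a dynamically created cluster. The egg path and parameter names are illustrative values, not real ones:

$JobLibraries = @{
    'pypi' = 'simplejson=3.8.0'
    'egg'  = 'dbfs:/mnt/libraries/mylib.egg'   # illustrative path; point this at your own egg
}
$Parameters = @{
    'TargetDate' = (Get-Date -Format 'yyyy-MM-dd')   # illustrative notebook parameter
}
New-AzureDatabricksJob -Connection $Connection -JobName "Nightly ETL" -JobType Notebook `
    -NotebookPath "/Users/Drew/SomeNotebook" -JobLibraries $JobLibraries -JobParameters $Parameters `
    -NodeType "Standard_DS3_v2" -NumWorkers 4 -SparkVersion "4.2.x-scala2.11"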
Author: Drew Furgiuele (@pittfurg), http://www.port1433.com
Website: https://www.igs.com
Copyright: (c) 2019 by IGS, licensed under MIT
License: MIT https://opensource.org/licenses/MIT
-------------------------- EXAMPLE 1 --------------------------
PS C:\>New-AzureDatabricksJob -Connection $Connection -JobName "New Job" -JobType Notebook -NotebookPath
"/Users/Drew/SomeNotebook" -UseExistingCluster "DrewsCluster"
Defines a new job called "New Job" that runs the notebook "SomeNotebook" on the existing cluster "DrewsCluster".
-------------------------- EXAMPLE 2 --------------------------
PS C:\>New-AzureDatabricksJob -Connection $Connection -JobName "New Job" -JobType Notebook -NotebookPath
"/Users/Drew/SomeNotebook" -UseExistingCluster "DrewsCluster" -JobParameters $Parameters
Defines a new job called "New Job" that runs the notebook "SomeNotebook" on the existing cluster "DrewsCluster",
passing the parameters in the hashtable $Parameters to the notebook when it runs.
-------------------------- EXAMPLE 3 --------------------------
PS C:\>New-AzureDatabricksJob -Connection $Connection -JobName "New Job" -JobType Notebook -NotebookPath
"/Users/Drew/SomeNotebook"
Defines a new job called "New Job" that runs the notebook "SomeNotebook" on a new cluster with the default node
type, number of workers, and Spark version.
RELATED LINKS