NAME
New-AzureDatabricksJob
SYNOPSIS
Dynamically create a job on an Azure Databricks cluster. Returns an object defining the job and the newly assigned
job ID number.
SYNTAX
New-AzureDatabricksJob -Connection <Object> -JobName <String> -JobType <String> -NotebookPath <String>
[-JobParameters <Hashtable>] [-JobLibraries <Hashtable>] [-UseExistingCluster <String>] [-NodeType <String>]
[-NumWorkers <Int32>] [-SparkVersion <String>] [<CommonParameters>]
DESCRIPTION
You can use this function to create a new defined job on your Azure Databricks cluster. Currently only
Notebook-based jobs are supported. You can also dynamically pass in
libraries to use in the job, as well as pre-defined parameters. Other non-required options let you change the
cluster node and driver types and the total number
of worker nodes, or use an existing defined cluster.
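Because the function returns an object describing the new job, you can capture it to reference the assigned job ID later. A minimal sketch; the job_id property name is an assumption based on the Databricks Jobs API response shape, so confirm it against the object the function actually returns:
$Job = New-AzureDatabricksJob -Connection $Connection -JobName "Nightly ETL" -JobType Notebook -NotebookPath "/Users/Drew/SomeNotebook"
$Job.job_id   # newly assigned job ID (property name assumed from the Databricks Jobs API)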
PARAMETERS
-Connection <Object>
An object that represents an Azure Databricks API connection where you want to create your job.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobName <String>
The name of the new job.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobType <String>
The type of job to run. Currently only supports "Notebook" job types.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-NotebookPath <String>
The path on your Azure Databricks instance where your job's notebook resides.
Required? true
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobParameters <Hashtable>
The parameters to pass into your notebook, supplied as a hashtable (see notes).
Required? false
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-JobLibraries <Hashtable>
Libraries to attach to the job, supplied as a hashtable of pypi or egg entries (see notes).
Required? false
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-UseExistingCluster <String>
If you want this job to use a predefined Azure Databricks cluster, specify a named cluster here.
Required? false
Position? named
Default value
Accept pipeline input? false
Accept wildcard characters? false
-NodeType <String>
For dynamic job clusters, the node type to use (defaults to Standard_DS3_v2).
Required? false
Position? named
Default value Standard_DS3_v2
Accept pipeline input? false
Accept wildcard characters? false
-NumWorkers <Int32>
For dynamic job clusters, the maximum number of workers (defaults to 4).
Required? false
Position? named
Default value 4
Accept pipeline input? false
Accept wildcard characters? false
-SparkVersion <String>
The Spark runtime version the dynamic cluster should use (defaults to 4.2.x-scala2.11).
Required? false
Position? named
Default value 4.2.x-scala2.11
Accept pipeline input? false
Accept wildcard characters? false
<CommonParameters>
This cmdlet supports the common parameters: Verbose, Debug,
ErrorAction, ErrorVariable, WarningAction, WarningVariable,
OutBuffer, PipelineVariable, and OutVariable. For more information, see
about_CommonParameters (https://go.microsoft.com/fwlink/?LinkID=113216).
INPUTS
OUTPUTS
NOTES
A sample of the hashtables needed for this function:

$JobLibraries = @{
    'pypi' = 'simplejson=3.8.0'
}

Each entry in your hashtable should be of type pypi or egg. If egg, specify the path to the egg.

$Parameters = @{
    'Param1' = 'X'
    'Param2' = 2
}

Each entry in your hashtable should be a key/value pair: the name of the parameter in your notebook and the value
you want to pass in.
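Putting the pieces together, here is a minimal sketch of a job that installs libraries and receives parameters on a dynamically created cluster. The egg path and parameter names are illustrative values, not real ones:

$JobLibraries = @{
    'pypi' = 'simplejson=3.8.0'
    'egg'  = 'dbfs:/mnt/libraries/mylib.egg'   # illustrative path; point this at your own egg
}
$Parameters = @{
    'TargetDate' = (Get-Date -Format 'yyyy-MM-dd')   # illustrative notebook parameter
}
New-AzureDatabricksJob -Connection $Connection -JobName "Nightly ETL" -JobType Notebook `
    -NotebookPath "/Users/Drew/SomeNotebook" -JobLibraries $JobLibraries -JobParameters $Parameters `
    -NodeType "Standard_DS3_v2" -NumWorkers 4 -SparkVersion "4.2.x-scala2.11"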
Author: Drew Furgiuele (@pittfurg), http://www.port1433.com
Website: https://www.igs.com
Copyright: (c) 2019 by IGS, licensed under MIT
License: MIT https://opensource.org/licenses/MIT
-------------------------- EXAMPLE 1 --------------------------
PS C:\>New-AzureDatabricksJob -Connection $Connection -JobName "New Job" -JobType Notebook -NotebookPath
"/Users/Drew/SomeNotebook" -UseExistingCluster "DrewsCluster"
Defines a new job called "New Job" that runs the notebook "SomeNotebook" on the existing cluster "DrewsCluster".
-------------------------- EXAMPLE 2 --------------------------
PS C:\>New-AzureDatabricksJob -Connection $Connection -JobName "New Job" -JobType Notebook -NotebookPath
"/Users/Drew/SomeNotebook" -UseExistingCluster "DrewsCluster" -JobParameters $Parameters
Defines a new job called "New Job" that runs the notebook "SomeNotebook" on the existing cluster "DrewsCluster",
passing the parameters in the hashtable $Parameters to the notebook when it runs.
-------------------------- EXAMPLE 3 --------------------------
PS C:\>New-AzureDatabricksJob -Connection $Connection -JobName "New Job" -JobType Notebook -NotebookPath
"/Users/Drew/SomeNotebook"
Defines a new job called "New Job" that runs the notebook "SomeNotebook" on a new cluster with the default node
type, number of workers, and Spark version.
RELATED LINKS