New-DatabricksCluster
NAME
New-DatabricksCluster
SYNOPSIS
Creates a new Databricks cluster
SYNTAX
New-DatabricksCluster [[-BearerToken] <String>] [[-Region] <String>] [-ClusterName] <String> [[-SparkVersion]
<String>] [[-NodeType] <String>] [[-DriverNodeType] <String>] [-MinNumberOfWorkers] <Int32> [-MaxNumberOfWorkers]
<Int32> [[-AutoTerminationMinutes] <Int32>] [[-Spark_conf] <Hashtable>] [[-CustomTags] <Hashtable>]
[[-InitScripts] <String[]>] [[-SparkEnvVars] <Hashtable>] [-UniqueNames] [-Update] [[-PythonVersion] <String>]
[[-ClusterLogPath] <String>] [[-InstancePoolId] <String>] [<CommonParameters>]
DESCRIPTION
Creates a new Databricks cluster in your workspace. Supports fixed-size and autoscaling clusters, and can
optionally update an existing cluster with the same name (see -Update).
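For example, a minimal invocation sketch (the token and other values below are illustrative placeholders, not
real credentials):

PS C:\> New-DatabricksCluster -BearerToken "dapi1234567890abcdef" -Region "northeurope" `
            -ClusterName "MyCluster" -SparkVersion "5.3.x-scala2.11" -NodeType "Standard_D3_v2" `
            -MinNumberOfWorkers 2 -MaxNumberOfWorkers 2 -AutoTerminationMinutes 30

This creates a fixed-size cluster with two workers that terminates after 30 minutes of inactivity.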
PARAMETERS
-BearerToken <String>
Your Databricks Bearer token to authenticate to your workspace (see User Settings in Databricks WebUI)
Required? false
Position? 1
Default value
Accept pipeline input? false
Accept wildcard characters? false
-Region <String>
Azure Region - must match the URL of your Databricks workspace, example northeurope
Required? false
Position? 2
Default value
Accept pipeline input? false
Accept wildcard characters? false
-ClusterName <String>
The name for the new cluster.
Required? true
Position? 3
Default value
Accept pipeline input? false
Accept wildcard characters? false
-SparkVersion <String>
Spark version for cluster. Example: 5.3.x-scala2.11
See Get-DatabricksSparkVersions
Required? false
Position? 4
Default value
Accept pipeline input? false
Accept wildcard characters? false
-NodeType <String>
Type of worker for cluster. Example: Standard_D3_v2
See Get-DatabricksNodeTypes
Required? false
Position? 5
Default value
Accept pipeline input? false
Accept wildcard characters? false
-DriverNodeType <String>
Type of driver node for the cluster. Example: Standard_D3_v2. If not set, it defaults to $NodeType.
See Get-DatabricksNodeTypes
Required? false
Position? 6
Default value
Accept pipeline input? false
Accept wildcard characters? false
-MinNumberOfWorkers <Int32>
Minimum number of workers for the cluster. If this equals $MaxNumberOfWorkers, autoscaling is disabled.
Required? true
Position? 7
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
-MaxNumberOfWorkers <Int32>
Maximum number of workers for the cluster. If this equals $MinNumberOfWorkers, autoscaling is disabled.
Required? true
Position? 8
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
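As a sketch (values illustrative; $token is assumed to hold your bearer token), setting different minimum
and maximum values enables autoscaling between those bounds:

PS C:\> New-DatabricksCluster -BearerToken $token -Region "northeurope" `
            -ClusterName "AutoscaleCluster" -SparkVersion "5.3.x-scala2.11" `
            -NodeType "Standard_D3_v2" -MinNumberOfWorkers 2 -MaxNumberOfWorkers 8

Here the cluster scales between 2 and 8 workers; passing the same value for both pins the cluster at a fixed
size.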
-AutoTerminationMinutes <Int32>
Automatically terminates the cluster after it has been inactive for this many minutes. If not set, the
cluster will not be automatically terminated.
If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to
explicitly disable automatic termination.
Required? false
Position? 9
Default value 0
Accept pipeline input? false
Accept wildcard characters? false
-Spark_conf <Hashtable>
A hashtable of Spark configuration properties to set on the cluster.
Example: @{"spark.speculation"=$true; "spark.streaming.ui.retainedBatches"= 5}
Required? false
Position? 10
Default value
Accept pipeline input? false
Accept wildcard characters? false
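One convenient pattern (a sketch; the configuration keys are taken from the example above, all other values
are placeholders) is to build the hashtable first and pass it in:

PS C:\> $sparkConf = @{ "spark.speculation" = $true; "spark.streaming.ui.retainedBatches" = 5 }
PS C:\> New-DatabricksCluster -BearerToken $token -Region "northeurope" -ClusterName "TunedCluster" `
            -SparkVersion "5.3.x-scala2.11" -NodeType "Standard_D3_v2" `
            -MinNumberOfWorkers 1 -MaxNumberOfWorkers 1 -Spark_conf $sparkConf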
-CustomTags <Hashtable>
Custom Tags to set, provide hash table of tags. Example: @{CreatedBy="SimonDM";NumOfNodes=2;CanDelete=$true}
Required? false
Position? 11
Default value
Accept pipeline input? false
Accept wildcard characters? false
-InitScripts <String[]>
Init scripts to run after cluster creation, as an array of strings. Paths must be full DBFS paths. Example:
"dbfs:/script/script1", "dbfs:/script/script2"
Required? false
Position? 12
Default value
Accept pipeline input? false
Accept wildcard characters? false
-SparkEnvVars <Hashtable>
A hashtable containing a set of optional, user-specified environment variable key-value pairs. Key-value pairs
of the form (X,Y) are exported as is (i.e., export X='Y') while launching the driver and workers.
Example: @{SPARK_WORKER_MEMORY="29000m";SPARK_LOCAL_DIRS="/local_disk0"}
Required? false
Position? 13
Default value
Accept pipeline input? false
Accept wildcard characters? false
-UniqueNames [<SwitchParameter>]
Switch. By default, Databricks allows duplicate cluster names. Setting this switch checks whether a cluster
with this name already exists and throws an error if it does, so repeated deployments cannot create
duplicates. Defaults to False.
Required? false
Position? named
Default value False
Accept pipeline input? false
Accept wildcard characters? false
-Update [<SwitchParameter>]
Switch. If a cluster with this name already exists, its configuration is updated to match the parameters
provided. Defaults to False.
Required? false
Position? named
Default value False
Accept pipeline input? false
Accept wildcard characters? false
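Together with the parameters above, -Update supports a create-or-update deployment pattern (a sketch with
placeholder values):

PS C:\> New-DatabricksCluster -BearerToken $token -Region "northeurope" -ClusterName "ETLCluster" `
            -SparkVersion "5.3.x-scala2.11" -NodeType "Standard_D3_v2" `
            -MinNumberOfWorkers 2 -MaxNumberOfWorkers 4 -Update

The first run creates the cluster; later runs update the existing cluster's configuration rather than
creating a duplicate.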
-PythonVersion <String>
2 or 3 - defaults to 3.
Required? false
Position? 14
Default value 3
Accept pipeline input? false
Accept wildcard characters? false
-ClusterLogPath <String>
DBFS location for cluster logs; must start with dbfs:/.
Example: dbfs:/logs/mycluster
Required? false
Position? 15
Default value
Accept pipeline input? false
Accept wildcard characters? false
-InstancePoolId <String>
If you would like to use nodes from an instance pool, set the pool id. See:
https://docs.azuredatabricks.net/user-g ... ance-pools
Required? false
Position? 16
Default value
Accept pipeline input? false
Accept wildcard characters? false
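A sketch of attaching the cluster to an existing pool (the pool id below is a hypothetical placeholder):

PS C:\> New-DatabricksCluster -BearerToken $token -Region "northeurope" -ClusterName "PooledCluster" `
            -SparkVersion "5.3.x-scala2.11" -MinNumberOfWorkers 1 -MaxNumberOfWorkers 4 `
            -InstancePoolId "0101-120000-pool-abcd1234"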
<CommonParameters>
This cmdlet supports the common parameters: Verbose, Debug,
ErrorAction, ErrorVariable, WarningAction, WarningVariable,
OutBuffer, PipelineVariable, and OutVariable. For more information, see
about_CommonParameters (https://go.microsoft.com/fwlink/?LinkID=113216).
INPUTS
OUTPUTS
NOTES
Author: Simon D'Morias / Data Thirst Ltd
RELATED LINKS