Databricks Azure Managed Identity
Unoffical use of the databricks user assigned managed identity in Azure
Summary
- Azure Databricks workspaces come with a built-in User Assigned Managed Identity called
dbmanagedidentity
- The Identity is located in the workspace’s managed resource group
- Leveraging the Identity simplifies access control for certain use cases and enhances security.
Please note that using the Databricks Managed Identity is not officially supported in the documentation. However, due to the architecture of Databricks in Azure, it’s unlikely to change in the immediate future. Use this feature at your own risk, as it comes with no warranty.
Understanding Databricks Managed Identity
A User-Assigned Managed Identity is an Azure Entra ID identity that can be used to authenticate and authorize access to Azure resources. It is created as a standalone Azure resource and can be assigned to one or more Azure services.
In Azure Databricks, each workspace has a built-in User-Assigned Managed Identity named dbmanagedidentity
. This identity is located in the workspace’s managed resource group.
A Managed Resource Group is an Azure Resource Group that is automatically created and managed by Azure Databricks. It contains the necessary resources for the workspace, such as the Managed Identity, storage accounts, and virtual network.
Use Cases for the Databricks Identity
The Databricks User Assigned Managed Identity dbmanagedidentity
can be particularly useful in scenarios where:
- You need to authenticate to a specific Azure resource but cannot use a secret scope to handle credentials (e.g., when too many secrets would be exposed to users via a Key Vault).
- You want to assign more granular RBAC roles to the Managed Identity for fine-grained access control vs an SPN.
- You want to simplify the authentication process and avoid managing separate credentials for each Azure service.
Connecting to Azure Services
To access the Managed Identity in your Databricks notebook, use the following code snippet:
from azure.identity import ManagedIdentityCredential
= ManagedIdentityCredential() credential
Azure Service Integration Examples
Azure Key Vault:
from azure.keyvault.secrets import SecretClient = SecretClient(vault_url="https://your-vault.vault.azure.net/", credential=credential) secret_client = secret_client.get_secret("your-secret-name") secret
Azure Storage:
from azure.storage.blob import BlobServiceClient = BlobServiceClient(account_url="https://your-account.blob.core.windows.net", credential=credential) blob_service_client
Azure Cosmos Database:
from azure.cosmos import CosmosClient = CosmosClient("https://your-account.documents.azure.com:443/", credential=credential) client
Azure Event Hubs:
from azure.eventhub import EventHubProducerClient = EventHubProducerClient("your-event-hub-namespace.servicebus.windows.net", "your-event-hub", credential=credential) producer
Assigning Permissions to the Managed Identity
Before using the Managed Identity to access Azure services, you must assign the appropriate permissions to the identity in the target service.
To find your Databricks Managed Identity:
- Navigate to your Databricks workspace in the Azure portal
- Click on the “Resource group” link to open the managed resource group
- Locate the “User Assigned Identities” resource with the name
dbmanagedidentity
Once you have identified the Managed Identity, follow the documentation for each Azure service to assign the necessary permissions.
Conclusion
By leveraging the built-in Managed Identity in your Azure Databricks workspace, you can simplify access control and enhance security when connecting to various Azure services. This approach eliminates the need for managing separate credentials and provides a seamless authentication experience.
Remember to assign the appropriate permissions to the Managed Identity in each target Azure service before using it in your Databricks notebooks.