Databricks Azure Managed Identity

Unoffical use of the databricks user assigned managed identity in Azure

managedidentity
Handle edgecase access control by utilizing the user assigned Managed Identity in your Azure Databricks workspace.
Modified

05/12/2024

Summary

  • Azure Databricks workspaces come with a built-in User Assigned Managed Identity called dbmanagedidentity
  • The Identity is located in the workspace’s managed resource group
  • Leveraging the Identity simplifies access control for certain use cases and enhances security.
Unofficial Feature

Please note that using the Databricks Managed Identity is not officially supported in the documentation. However, due to the architecture of Databricks in Azure, it’s unlikely to change in the immediate future. Use this feature at your own risk, as it comes with no warranty.

Understanding Databricks Managed Identity

A User-Assigned Managed Identity is an Azure Entra ID identity that can be used to authenticate and authorize access to Azure resources. It is created as a standalone Azure resource and can be assigned to one or more Azure services.

In Azure Databricks, each workspace has a built-in User-Assigned Managed Identity named dbmanagedidentity. This identity is located in the workspace’s managed resource group.

A Managed Resource Group is an Azure Resource Group that is automatically created and managed by Azure Databricks. It contains the necessary resources for the workspace, such as the Managed Identity, storage accounts, and virtual network.

Use Cases for the Databricks Identity

The Databricks User Assigned Managed Identity dbmanagedidentity can be particularly useful in scenarios where:

  1. You need to authenticate to a specific Azure resource but cannot use a secret scope to handle credentials (e.g., when too many secrets would be exposed to users via a Key Vault).
  2. You want to assign more granular RBAC roles to the Managed Identity for fine-grained access control vs an SPN.
  3. You want to simplify the authentication process and avoid managing separate credentials for each Azure service.

Connecting to Azure Services

To access the Managed Identity in your Databricks notebook, use the following code snippet:

from azure.identity import ManagedIdentityCredential

credential = ManagedIdentityCredential()

Azure Service Integration Examples

  1. Azure Key Vault:

    from azure.keyvault.secrets import SecretClient
    secret_client = SecretClient(vault_url="https://your-vault.vault.azure.net/", credential=credential)
    secret = secret_client.get_secret("your-secret-name")
  2. Azure Storage:

    from azure.storage.blob import BlobServiceClient
    blob_service_client = BlobServiceClient(account_url="https://your-account.blob.core.windows.net", credential=credential)
  3. Azure Cosmos Database:

    from azure.cosmos import CosmosClient
    client = CosmosClient("https://your-account.documents.azure.com:443/", credential=credential)
  4. Azure Event Hubs:

    from azure.eventhub import EventHubProducerClient
    producer = EventHubProducerClient("your-event-hub-namespace.servicebus.windows.net", "your-event-hub", credential=credential)

    Assigning Permissions to the Managed Identity

Before using the Managed Identity to access Azure services, you must assign the appropriate permissions to the identity in the target service.

To find your Databricks Managed Identity:

  1. Navigate to your Databricks workspace in the Azure portal
  2. Click on the “Resource group” link to open the managed resource group
  3. Locate the “User Assigned Identities” resource with the name dbmanagedidentity

Once you have identified the Managed Identity, follow the documentation for each Azure service to assign the necessary permissions.

Conclusion

By leveraging the built-in Managed Identity in your Azure Databricks workspace, you can simplify access control and enhance security when connecting to various Azure services. This approach eliminates the need for managing separate credentials and provides a seamless authentication experience.

Remember to assign the appropriate permissions to the Managed Identity in each target Azure service before using it in your Databricks notebooks.

Back to top