Setting Up Databricks PrivateLink on AWS Cloud
This document provides a step-by-step guide to configuring AWS PrivateLink for Databricks so that Chaos Genius can connect securely to your Databricks warehouse and query its metadata.
Requirements
Before proceeding, ensure you have:
- Administrative access to your Databricks Account.
- Databricks workspace deployed in a customer-managed VPC.
- Databricks workspace in an AWS region that supports the E2 version of the platform, and not the us-west-1 region.
Configuration Steps
Step 1: Get the Databricks Workspace Details
- Sign in to your Databricks workspace
- Navigate to Admin Settings > Workspace Settings
- Note down the following information:
  - Databricks workspace URL
  - AWS Region where your workspace is deployed
Step 2: Share Configuration with Chaos Genius Team
- Provide the following information to the Chaos Genius team:
  - Databricks workspace URL
  - AWS Region
  - AWS Account ID
- The Chaos Genius team will:
  - Create the AWS PrivateLink endpoint
  - Configure necessary network settings
  - Set up required DNS configurations
  - Notify you of the VPC Endpoint ID once the setup is complete
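Before sending these details, it can help to sanity-check them locally. The following is a minimal sketch, not a Chaos Genius tool: the PrivateLinkRequest class and validate function are hypothetical names introduced here for illustration. It assumes only that the workspace URL is a full https:// URL, that AWS account IDs are 12 digits, and that us-west-1 is excluded as stated in the requirements.

```python
import re
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

# Regions excluded by the requirements section of this guide.
UNSUPPORTED_REGIONS = {"us-west-1"}


@dataclass(frozen=True)
class PrivateLinkRequest:
    """Hypothetical container for the three values shared in Step 2."""
    workspace_url: str
    aws_region: str
    aws_account_id: str


def validate(req: PrivateLinkRequest) -> List[str]:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []

    parsed = urlparse(req.workspace_url)
    if parsed.scheme != "https" or not parsed.hostname:
        problems.append("workspace URL should be a full https:// URL")

    # Shape check only (e.g. us-east-1); it does not confirm the region exists.
    if not re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d", req.aws_region):
        problems.append("AWS region does not look like a region code")
    elif req.aws_region in UNSUPPORTED_REGIONS:
        problems.append(f"{req.aws_region} is not supported for this setup")

    if not re.fullmatch(r"\d{12}", req.aws_account_id):
        problems.append("AWS account ID should be exactly 12 digits")

    return problems
```

For example, validate(PrivateLinkRequest("https://example.cloud.databricks.com", "us-east-1", "123456789012")) returns an empty list, while a us-west-1 region or a short account ID produces one message per problem. The example URL and account ID are placeholders.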
Step 3: Register the PrivateLink Connection
- Sign in to the Databricks account console as an account admin
- Navigate to Cloud Resources > Network > VPC Endpoints
- Register the VPC Endpoint ID provided by the Chaos Genius team to establish the secure connection
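If you prefer to script the registration instead of using the UI, the request can be sketched as below. This is an illustration under assumptions, not a verified procedure: it assumes the Databricks Account API endpoint POST /api/2.0/accounts/{account_id}/vpc-endpoints on accounts.cloud.databricks.com with the fields vpc_endpoint_name, region, and aws_vpc_endpoint_id, so check the current Databricks Account API reference before relying on it. The function name build_register_request and all argument values are hypothetical. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed host for the Databricks account-level API on AWS.
ACCOUNTS_HOST = "https://accounts.cloud.databricks.com"


def build_register_request(account_id, token, endpoint_name, region, aws_vpc_endpoint_id):
    """Build (but do not send) a request that registers a VPC endpoint."""
    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/vpc-endpoints"
    body = {
        "vpc_endpoint_name": endpoint_name,          # display name of your choice
        "region": region,                            # AWS region of the endpoint
        "aws_vpc_endpoint_id": aws_vpc_endpoint_id,  # ID provided by Chaos Genius
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen with a valid account-level token. Keeping construction separate from sending makes it easy to inspect the URL and payload first.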
Create Data Source Connection in Chaos Genius
To connect your Databricks warehouse to Chaos Genius, navigate to Data Sources in Chaos Genius, select Databricks as your data source, and follow the Databricks connection setup guide.
With this setup, Chaos Genius connects to your Databricks warehouse over AWS PrivateLink. Traffic between the two stays on the AWS network instead of crossing the public internet, which helps you meet security and compliance requirements.