Setting Up Databricks Private Link on AWS Cloud

This guide walks through configuring AWS PrivateLink for Databricks so that Chaos Genius can connect privately to your Databricks warehouse and query its metadata.

Requirements

Before proceeding, ensure you have:

  • Administrative access to your Databricks account
  • A Databricks workspace deployed in a customer-managed VPC
  • A Databricks workspace in an AWS region that supports the E2 version of the platform (the us-west-1 region is not supported)

Configuration Steps

Step 1: Get the Databricks Workspace Details

  1. Sign in to your Databricks workspace
  2. Navigate to Admin Settings > Workspace Settings
  3. Note down the following information:
    • Databricks workspace URL
    • AWS Region where your workspace is deployed
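Before passing these details on, it can help to normalize them so both sides work from the same values. The following is a minimal sketch, not part of Chaos Genius or Databricks tooling: the function names and the example hostname are hypothetical, and the only rule it encodes is the us-west-1 restriction listed in the requirements.

```python
from urllib.parse import urlparse

# Region named in the requirements above as unsupported for this setup.
UNSUPPORTED_REGIONS = {"us-west-1"}


def normalize_workspace_url(raw: str) -> str:
    """Return the workspace URL as 'https://<host>', dropping any path or query."""
    raw = raw.strip()
    if "://" not in raw:
        # Assume HTTPS when the scheme was left out while copying the URL.
        raw = "https://" + raw
    parsed = urlparse(raw)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"not a valid HTTPS workspace URL: {raw!r}")
    return f"https://{parsed.hostname}"


def check_region(region: str) -> str:
    """Return the lower-cased region name, rejecting unsupported regions."""
    region = region.strip().lower()
    if region in UNSUPPORTED_REGIONS:
        raise ValueError(f"region {region!r} is not supported for this setup")
    return region
```

For example, `normalize_workspace_url("dbc-example.cloud.databricks.com/?o=123")` returns the bare `https://` host, which is the form that is easiest to share unambiguously.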

Step 2: Share Configuration with Chaos Genius Team

  1. Provide the following information to the Chaos Genius team:
    • Databricks workspace URL
    • AWS Region
    • AWS Account ID
  2. The Chaos Genius team will:
    • Create the AWS PrivateLink endpoint
    • Configure necessary network settings
    • Set up required DNS configurations
    • Notify you with the VPC Endpoint ID once the setup is complete
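
The three values shared in this step can be gathered into one small record so nothing is omitted or mistyped. This is an illustrative sketch only: the `PrivateLinkRequest` class and its field names are hypothetical and do not reflect a format required by Chaos Genius. The one check it performs relies on the fact that AWS account IDs are 12-digit numbers.

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PrivateLinkRequest:
    """Details to share with the Chaos Genius team (hypothetical field names)."""

    workspace_url: str
    aws_region: str
    aws_account_id: str

    def __post_init__(self) -> None:
        # AWS account IDs are exactly 12 digits; keep them as strings so
        # leading zeros are preserved.
        if not re.fullmatch(r"\d{12}", self.aws_account_id):
            raise ValueError("AWS account ID must be exactly 12 digits")


def to_json(request: PrivateLinkRequest) -> str:
    """Serialize the request details as JSON for sharing."""
    return json.dumps(asdict(request), indent=2)
```

Storing the account ID as a string rather than an integer is deliberate, since an ID that begins with zero would otherwise lose digits.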

Step 3: Register the VPC Endpoint in Databricks

  1. Sign in to the Databricks account console
  2. Navigate to Cloud Resources > VPC Endpoints
  3. Register the VPC Endpoint ID provided by the Chaos Genius team to establish the secure connection
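
Registering the VPC Endpoint ID can also be scripted instead of done through the UI. The sketch below only builds the HTTP request and is based on assumptions to verify against the current Databricks documentation: it targets the Databricks Account API VPC endpoint registration route as we understand it, it assumes bearer-token authentication, and the function name and every argument value are placeholders.

```python
import json
import urllib.request

# Assumed host of the Databricks account console API on AWS.
ACCOUNTS_HOST = "https://accounts.cloud.databricks.com"


def build_register_request(
    account_id: str,
    token: str,
    endpoint_name: str,
    aws_vpc_endpoint_id: str,
    region: str,
) -> urllib.request.Request:
    """Build (but do not send) a POST that registers a VPC endpoint."""
    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/vpc-endpoints"
    body = json.dumps(
        {
            "vpc_endpoint_name": endpoint_name,  # display name of your choice
            "aws_vpc_endpoint_id": aws_vpc_endpoint_id,  # ID from Chaos Genius
            "region": region,  # AWS region of the workspace
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# To actually send the request (requires valid credentials):
# with urllib.request.urlopen(build_register_request(...)) as response:
#     print(json.load(response))
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload before anything is submitted to the account.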

Create Data Source Connection in Chaos Genius

To connect your Databricks warehouse to Chaos Genius, navigate to Data Sources in Chaos Genius, select Databricks as your data source, and follow the Databricks connection setup guide.

With this setup, the connection between your Databricks warehouse and Chaos Genius runs over AWS PrivateLink. All data traffic remains within the AWS network rather than crossing the public internet, which supports security and compliance requirements.