Gradient Terraform Integration

Integrating Gradient into your Terraform process typically involves the following steps:

  1. Include Workspace and Job Configuration in your Terraform Plan

  2. Configure Terraform to ignore recommendation fields when detecting drift

  3. Let Gradient “auto-apply” recommendations directly to your Databricks Jobs via the Databricks API

Databricks Workspace and Job Configurations Used by Gradient

Gradient uses a Databricks webhook notification destination to be notified when a managed Databricks Job starts. This notification destination should be incorporated into your infrastructure management process so that the Gradient configuration is maintained within your Databricks workspace definition. See the example below.

Databricks Workspace Webhook Notification Destination

resource "databricks_notification_destination" "sync_webhook" {
  display_name = "Notification Destination"
  config {
    generic_webhook {
      url      = "https://example.com/webhook"
      username = "username" // Optional
      password = "password" // Optional
    }
  }
}

Additionally, each workflow cluster being managed by Gradient should reference this webhook.

Databricks Job Notifications

resource "databricks_job" "example_job" {
  name = "example job"
  ...
  webhook_notifications {
    on_start {
      id = databricks_notification_destination.sync_webhook.id
    }
  }

Managing Terraform Drift

If you use terraform plan to detect configuration drift in resources created by Terraform, we recommend one of the following methods to exclude the Databricks Job cluster configurations generated by Gradient from drift detection. This prevents Terraform from overwriting the most recent cluster configuration applied by Gradient.

Ignore the Entire Cluster Configuration

Specifying ‘ignore_changes = all’ under the ‘lifecycle’ definition causes the entire cluster configuration to be ignored by the drift detection process.

resource "databricks_cluster" "single_node" {
  cluster_name            = "Single Node"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20

  spark_conf = {
    # Single-node
    "spark.databricks.cluster.profile" : "singleNode"
    "spark.master" : "local[*]"
  }

  custom_tags = {
    "ResourceClass" = "SingleNode"
  }

  lifecycle {
    ignore_changes = all
  }
}

Ignore only the Cluster Configurations Managed by Gradient

Explicitly specifying which configurations to ignore allows the configurations not managed by Gradient to still be evaluated by the drift detection process. Note, however, that the set of fields Gradient manages may change as new features are added.

resource "databricks_cluster" "example_cluster" {
  cluster_name  = "example-cluster"
  spark_version = "7.3.x-scala2.12"
  node_type_id  = "i3.xlarge"
  num_workers   = 2


  custom_tags = {
    "sync:project-id" = "<insert-project-id>" # customer needs to add their project id tag
    # ...other tags
  }
  # other configurations...

  lifecycle {
    ignore_changes = [ # Fields sync modifies
      num_workers,
      node_type_id
      # Note: Gradient also modifies EBS volumes
    ]
  }
}

Apply All Recommendations to Terraform

If you choose not to "ignore changes" and want to reintegrate the recommendations back into their Terraform resource, you can retrieve the latest recommendation using the following function in the Sync Python Library:

sync.api.projects.get_latest_project_config_recommendation(project_id: str) → Optional[sync.api.projects.Response[dict]]

Get Latest Project Configuration Recommendation.

Parameters
project_id (str) – project ID

Returns
Project Configuration Recommendation object

Return type
Response object or None

On success, the Response contains a Python dictionary with the recommended cluster configuration for the project. Parse and persist this data in the format required by your infrastructure management process.
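
For example, here is a minimal sketch of retrieving and persisting the latest recommendation, assuming the Sync Python Library is installed and authenticated with a Sync API key; the output file name, and the result field accessed on the Response object, are illustrative assumptions, not a definitive implementation:

import json

from sync.api.projects import get_latest_project_config_recommendation

# Fetch the latest recommendation for the project; returns a Response object or None.
response = get_latest_project_config_recommendation("<insert-project-id>")

if response is not None and response.result is not None:
    # Persist the recommended cluster configuration so your Terraform pipeline
    # can consume it (e.g. translate it into a variables file).
    with open("gradient_recommendation.json", "w") as f:
        json.dump(response.result, f, indent=2)
else:
    print("No recommendation available for this project yet.")

From there, you could map fields such as num_workers and node_type_id from the persisted file back into the corresponding values in your Terraform configuration.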

Auto-Apply Recommendations

To avoid manually applying recommendations, you can enable Auto-Apply via the "Edit settings" button on the Gradient project page. When this option is enabled, recommendations are automatically applied after each run of your job.

The Auto-Apply setting is applicable only to Databricks Workflows. Auto-Apply is not applicable if you submit runs using the DatabricksSubmitRunOperator or the /api/2.1/jobs/runs/submit Databricks API.
