Databricks

Prerequisites

Make sure your Databricks deployment allows access to LogicLoop; see the Deployment options page for details.

Pro tip: When you pull connection credentials, decide whether to connect to your Databricks SQL Warehouse or your All-Purpose Clusters. You should see the same data in either place through the Unity Catalog.

If you can, connect to a SQL Serverless Warehouse since it's faster and less expensive. Serverless has built-in SQL optimizations for running repeated queries, and will use Databricks-optimized compute resources. If you don't have a Serverless Warehouse, you can use a Pro warehouse.

Setup

LogicLoop can connect to both Databricks clusters and SQL endpoints. Consult the Databricks Documentation for how to obtain the Host, HTTP Path, and an Access Token for your endpoint.

Databricks Data Source Setup Screen
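For reference, the three connection fields typically look something like the following. The values here are illustrative, not real credentials; for a SQL warehouse, the HTTP Path usually begins with /sql/1.0/warehouses/, and personal access tokens generated in Databricks start with dapi.

Host:         dbc-a1b2c3d4-e5f6.cloud.databricks.com
HTTP Path:    /sql/1.0/warehouses/1234567890abcdef
Access Token: dapi... (a personal access token created in your Databricks user settings)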

Schema Browser

The Databricks query runner uses a custom-built schema browser that lets you switch between databases on the endpoint and see column types for each field.

Unlike other query runners, the Databricks schema browser fetches table and column names on demand as you navigate from one database to another. If you mostly use one database, this works well; but if you explore the schema across multiple databases, you may experience delays as each database is fetched separately.

Schemas are cached for one hour. You may wish to schedule an hourly job to warm those caches.

You can do this with any REST API tool as follows:

# Refresh the cached schema for one database (requires an admin API key)
curl --request GET \
  --url 'http://<logicloop host>/api/databricks/databases/<data-source-id>/<database-name>/tables?refresh' \
  --header 'Authorization: Key <admin-api-key>'

Auto Limit

The Databricks query runner also includes a checkbox beneath the query editor which, when enabled (it is by default), automatically appends LIMIT 1000 to your query. This helps in case you accidentally run SELECT * against a large table with enough results to crash the front-end.
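For example, with the checkbox enabled, a query like the one below (the table name is illustrative) is executed with the limit appended:

-- What you write:
SELECT * FROM some_db.large_table

-- What is actually executed with Auto Limit enabled:
SELECT * FROM some_db.large_table LIMIT 1000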

Multiple Statement Support

The Databricks query runner allows you to execute multiple statements, each terminated with a semicolon (;), in one query window. This is useful for setting session or cluster configuration variables prior to executing the main query on your cluster.

Note that only one table of query results can be displayed per query.

-- Session setting: do not serve this query from the result cache
set use_cached_result = False;

-- Main query; its results are the table that gets displayed
SELECT count(*) FROM some_db.some_table
