Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

lw2
by New Contributor
  • 458 Views
  • 1 reply
  • 0 kudos

Read SQLite file from S3 bucket into Databricks, creating Delta tables

I have a SQLite database that I want to read into Databricks to create Delta tables/DataFrames in Python that I can export to Power BI and have a live connection. When there is new data added to my SQLite database, the changes will need to reflect i...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @lw2, The approach you have been using (copy SQLite to local, read with sqlite3, export to CSV, then manually create tables) works as a one-shot load but, as you noticed, it does not give you an easy path to keep things in sync. Below is a streaml...
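A minimal sketch of that idea, assuming a hypothetical S3 path, local path, and target catalog/schema: copy the SQLite file down, loop over its tables, and rewrite each one as a Delta table so a scheduled job can keep them refreshed.

import sqlite3
import pandas as pd

s3_sqlite_path = "s3://my-bucket/path/app.db"   # hypothetical source location
local_path = "/tmp/app.db"

# Copy the SQLite file somewhere the driver can open it directly
dbutils.fs.cp(s3_sqlite_path, f"file:{local_path}")

with sqlite3.connect(local_path) as conn:
    # Discover every user table in the SQLite file
    tables = pd.read_sql("SELECT name FROM sqlite_master WHERE type = 'table'", conn)["name"].tolist()
    for tbl in tables:
        pdf = pd.read_sql(f"SELECT * FROM {tbl}", conn)
        # Overwrite the Delta table on every run so reruns keep it in sync
        (spark.createDataFrame(pdf)
              .write.format("delta")
              .mode("overwrite")
              .saveAsTable(f"main.bronze.{tbl}"))   # hypothetical catalog.schema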

Charansai
by New Contributor III
  • 592 Views
  • 3 replies
  • 0 kudos

Notebooks Not Deploying in Development Mode Using Databricks Asset Bundles (Deploying from Workspace)

Hi everyone, I’m using Databricks Asset Bundles and running into an issue when deploying to my dev environment in development mode. Even though my bundle includes sync paths and notebook directories, the deployment only creates the .databricks/artifac...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @Charansai, The behavior you are seeing is actually expected when deploying a bundle from the Databricks Workspace Editor. Here is why. SOURCE-LINKED DEPLOYMENT When you deploy a bundle from within the workspace (as opposed to using the Databricks...

2 More Replies
dtb_usr
by New Contributor III
  • 1073 Views
  • 10 replies
  • 1 kudos

Resolved! SELECT Permission error when reading materialised views associated with a pipeline

I am having to pass ownership of pipelines to users for them to read materialised views associated with any pipeline; otherwise they get a 'User does not have SELECT on table...' error. This is obviously bonkers, as any pipeline can only have one owner...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @dtb_usr, Based on the error message and the fact that the query works in the SQL Editor (which uses a SQL warehouse) but fails on a personal/dedicated cluster in notebooks, this is almost certainly a compute access mode issue rather than a Unity ...

9 More Replies
Seunghyun
by Contributor
  • 1009 Views
  • 2 replies
  • 0 kudos

Resolved! Deploy dashboard with asset bundle

Hello, I have some questions regarding dashboard development using Asset Bundles. I have been following the procedure for developing dashboards by referring to this page: Databricks CI/CD for Dashboard Developers. Here is the workflow I followed: Create...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @Seunghyun, This is a common workflow question when getting started with AI/BI Dashboard deployment through Databricks Asset Bundles. Here is a walkthrough of the recommended approach to maintain a single dashboard and handle ongoing modifications...

1 More Reply
lw2
by New Contributor
  • 513 Views
  • 3 replies
  • 0 kudos

Read SQLite file in to create Delta table/DataFrame with live connection

I have a SQLite database that I want to read into Databricks to create Delta tables/DataFrames in Python that I can export to Power BI and have a live connection. When there is new data added to my SQLite database, the changes will need to reflect i...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @lw2, The key to getting a "live connection" end-to-end is replacing the manual CSV export with a scheduled pipeline that writes directly to Delta tables, then connecting Power BI to those Delta tables via DirectQuery. Here is a complete approach....
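A minimal sketch of that scheduled-refresh step, with hypothetical paths and table names: the notebook is run on a job schedule, and Power BI connects to the resulting Unity Catalog table through a SQL warehouse with DirectQuery, so each refresh is visible without re-exporting CSVs.

import sqlite3
import pandas as pd

with sqlite3.connect("/tmp/app.db") as conn:         # SQLite file already copied down from S3
    pdf = pd.read_sql("SELECT * FROM orders", conn)  # hypothetical source table

# Overwrite the Delta table that Power BI queries via DirectQuery
(spark.createDataFrame(pdf)
      .write.format("delta")
      .mode("overwrite")
      .option("overwriteSchema", "true")
      .saveAsTable("main.reporting.orders"))         # hypothetical UC target table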

2 More Replies
ChristianRRL
by Honored Contributor
  • 1034 Views
  • 4 replies
  • 0 kudos

Asset Bundles Overriding Existing Jobs (despite different name_prefix)

Hi there, I'm seeing what seems to be unexpected behavior on Databricks Asset Bundle deployment and I'm hoping I can get clarification on this. Basically, what I'm trying to do is to deploy the same asset bundle twice (two different variations), with ...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @ChristianRRL, This behavior comes down to how Databricks Asset Bundles track deployed resources using Terraform state, and specifically where that state is stored locally. HOW BUNDLE STATE TRACKING WORKS When you run "databricks bundle deploy", t...

3 More Replies
RutujaKadam
by New Contributor II
  • 436 Views
  • 2 replies
  • 1 kudos

Getting error when connecting Azure Databricks to Azure SQL Server using Lakeflow Connect

Hi, can anyone please let me know how to resolve this error? I am trying to connect Azure SQL Server to Azure Databricks using Lakeflow Connect data ingestion. I am able to create the connection, but afterwards it gives me the error: Error starting ga...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @RutujaKadam, The error you are seeing, "Error starting gateway compute resources" with a message about VM quota exhaustion, is related to your Azure subscription's vCPU quota rather than a misconfiguration in Databricks itself. Here is what is ha...

1 More Reply
bts136
by Databricks Partner
  • 1712 Views
  • 2 replies
  • 1 kudos

Reading Excel files with Spark returns formula values instead of computed values

Hi, I'm seeing inconsistent behavior when reading Excel files using the built-in Lakeflow connector with spark.read.format("excel") (doc: https://docs.databricks.com/aws/en/query/formats/excel). I read an .xlsx file from S3 using this functi...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @bts136, This behavior is related to how Excel files store formula results internally, and it is something you can work around. BACKGROUND: HOW EXCEL STORES FORMULAS Excel files (.xlsx) store both the formula text and a cached computed result for ...
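The rest of the reply is truncated, but one common workaround, shown here as a rough sketch with a hypothetical file path and sheet name, is to read the cached results with openpyxl, which returns the value Excel stored on last save rather than the formula string, and then hand the result to Spark.

import pandas as pd
from openpyxl import load_workbook

# data_only=True returns the cached computed value Excel saved for each formula cell.
# Note: if the file was produced by a tool that never evaluated the formulas,
# the cache is empty and these cells come back as None.
wb = load_workbook("/Volumes/main/raw/files/report.xlsx", data_only=True)  # hypothetical path
ws = wb["Sheet1"]                                                          # hypothetical sheet name

rows = list(ws.values)
pdf = pd.DataFrame(rows[1:], columns=rows[0])   # first row assumed to be the header
df = spark.createDataFrame(pdf)
display(df)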

1 More Reply
kenmyers-8451
by Contributor II
  • 346 Views
  • 2 replies
  • 1 kudos

Need for additional flow control (shortcomings with "run if dependencies")

Maybe there is a way to do this that my team can't figure out, but we have a process that looks roughly like this: the main focus of this is job1 and job2, but in theory let's extend this issue to any number of linear jobs. So the idea behind this is: job2 ...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @kenmyers-8451, You are describing a real gap in how "Run if dependencies" interacts with If/else condition task outcomes, and I want to walk through the current tooling so you can find a workable pattern. UNDERSTANDING THE BEHAVIOR When an If/els...

1 More Reply
Smriti2
by New Contributor II
  • 433 Views
  • 3 replies
  • 0 kudos

Can we add column comments for a materialized view on Azure Databricks?

I want to understand whether it’s possible to add or update column comments on an existing materialized view in Azure Databricks, and if so, what command should be used—especially when updating comments for multiple columns at once. Here’s my situati...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @Smriti2, Yes, you can absolutely add column comments to a materialized view on Azure Databricks. There are two approaches you can use. OPTION 1: ALTER MATERIALIZED VIEW (Recommended for Existing Views) Since your materialized view already exists,...
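A rough sketch of that first option, looping so several column comments can be applied in one pass. The view and column names are hypothetical, and the exact ALTER MATERIALIZED VIEW column-comment clause should be verified against the Databricks SQL reference for your runtime.

# Hypothetical mapping of column name -> comment text
column_comments = {
    "customer_id": "Surrogate key from the source system",
    "order_total": "Order amount in EUR, tax included",
}

for col, comment in column_comments.items():
    # Apply one comment per column; escape single quotes in real comment text
    spark.sql(f"ALTER MATERIALIZED VIEW main.gold.sales_mv ALTER COLUMN {col} COMMENT '{comment}'")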

2 More Replies
Subra2025
by New Contributor II
  • 576 Views
  • 3 replies
  • 1 kudos

Databricks Genie dashboard promote from Genie TEST workspace to another

Hi, we have manually migrated Power BI dashboard components to the Databricks Genie TEST workspace. What is the procedure or approach to promote these Genie TEST workspace components to the Databricks Genie PROD workspace? Thanks, Subra

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @Subra2025, There are two separate things to consider here: promoting AI/BI Dashboards and promoting Genie Spaces. They have different promotion paths, so I will cover both. PROMOTING AI/BI DASHBOARDS ACROSS WORKSPACES AI/BI Dashboards have mature...

2 More Replies
theunwoke
by New Contributor
  • 189 Views
  • 2 replies
  • 0 kudos

Data Load from S3 Frankfurt Region to Unity Catalog in AWS US West Region

Hello, I am trying to bring the Parquet data from S3 into Unity Catalog. Currently I am doing a straightforward read and write like this: test_data = spark.read.schema(1_billion_data).parquet(s3_path) test_data.repartition(num_cores*2).write.mode("overw...

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @theunwoke, The 1h45m you are seeing is heavily influenced by cross-region network transfer between eu-central-1 (Frankfurt) and us-west (your workspace region). S3 reads that cross AWS regions go over the public internet backbone, so throughput p...

1 More Reply
IM_01
by Contributor III
  • 524 Views
  • 4 replies
  • 0 kudos

How to use rules dynamically in LDP

Hi, I see there is a way to store rules in a table and use them in Python while implementing LDPs. How can I generate/read the rules dynamically in the SQL way of implementing LDPs? Could you please help me with this? #DLT

Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @IM_01, The feature you are looking for, storing data quality rules in a table and applying them dynamically, is fully supported in Lakeflow Spark Declarative Pipelines (SDP) through the Python API. Unfortunately, there is currently no equivalent ...
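A minimal sketch of that table-driven pattern in the Python API. The rules table and dataset names are hypothetical; it assumes a Delta table with columns name (rule name), constraint (a SQL boolean expression), and tag (which dataset the rule applies to).

import dlt

def get_rules(tag):
    # Load expectations for one dataset from a (hypothetical) rules table
    rules = {}
    for row in spark.read.table("main.config.dq_rules").filter(f"tag = '{tag}'").collect():
        rules[row["name"]] = row["constraint"]
    return rules

@dlt.table(name="orders_clean")
@dlt.expect_all_or_drop(get_rules("orders"))   # drop rows that violate any loaded rule
def orders_clean():
    return spark.read.table("main.bronze.orders")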

3 More Replies
smpa01
by Contributor
  • 706 Views
  • 3 replies
  • 3 kudos

Configure SAS Token for ADLS Access in Databricks Job (Works on Classic Cluster, Fails on Serverless)

I am running a Databricks job that reads from a Delta table and writes to an ADLS Gen2 location using a SAS token for authentication: from pyspark.sql import SparkSession spark = SparkSession.builder.getOrCreate() sas_token = dbutils.secrets.get(scop...

Latest Reply
SteveOstrowski
Databricks Employee
  • 3 kudos

Hi @smpa01, The reason this works on a classic cluster but fails on serverless is that serverless compute only supports a very limited set of Spark configuration properties. The fs.azure.sas.* Hadoop configurations you are setting via spark.conf.set ...
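A rough sketch of the usual alternative on serverless, assuming a Unity Catalog external location (or volume) has already been created over the ADLS Gen2 container, so authentication comes from the UC storage credential and no session-level fs.azure.sas.* setting is needed. The table name and storage path below are hypothetical.

# Read the source Delta table and write to an ADLS path governed by a UC external location
df = spark.read.table("main.silver.events")   # hypothetical source table

(df.write.format("delta")
   .mode("overwrite")
   .save("abfss://exports@mystorageacct.dfs.core.windows.net/events/"))  # hypothetical UC-governed path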

2 More Replies
Garybary
by New Contributor III
  • 853 Views
  • 3 replies
  • 2 kudos

Scheduling jobs with table update triggers

Hi all, lately I've been experimenting with the newish feature of scheduling jobs on a table update trigger. There's one thing that's blocking me from implementing it, however, and I was hoping someone had found a solution to it. We occasionally perform a vac...

Latest Reply
SteveOstrowski
Databricks Employee
  • 2 kudos

Hi @Garybary, This is a common scenario when using table update triggers. Currently, table update triggers do not support filtering by operation type. The trigger fires on any commit to the Delta transaction log, and VACUUM does write a commit entry ...
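One workaround pattern, sketched here with a hypothetical table name: make the first task of the triggered job inspect the Delta history and exit early when the most recent commit is only a maintenance operation, so downstream tasks do not run after a VACUUM.

# Guard task: skip the run if the latest commit was maintenance-only
history = (spark.sql("DESCRIBE HISTORY main.bronze.orders LIMIT 5")
                .select("version", "operation")
                .collect())

maintenance_ops = {"VACUUM START", "VACUUM END", "OPTIMIZE"}
latest_op = history[0]["operation"] if history else None

if latest_op in maintenance_ops:
    dbutils.notebook.exit("Skipping: latest commit was a maintenance operation")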

2 More Replies