Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

JameDavi_51481
by Contributor
  • 9853 Views
  • 11 replies
  • 13 kudos

Can we add tags to Unity Catalog through Terraform?

We use Terraform to manage most of our infrastructure, and I would like to extend this to Unity Catalog. However, we are extensive users of tagging to categorize our datasets, and the only programmatic method I can find for adding tags is to use SQL ...

Latest Reply
jlieow
Databricks Employee
  • 13 kudos

In case anyone comes across this, have a look at databricks_entity_tag_assignment and see if it suits your needs.

  • 13 kudos
10 More Replies
DataGirl
by New Contributor
  • 16328 Views
  • 7 replies
  • 2 kudos

Multi value parameter on Power BI Paginated / SSRS connected to databricks using ODBC

Hi all, I'm wondering if anyone has had any luck setting up multi-value parameters in SSRS using an ODBC connection to Databricks? I'm getting a "Cannot add multi value query parameter" error every time I change my parameter to multi-value. In the query s...

Latest Reply
kashti123
New Contributor
  • 2 kudos

Hi, I am also trying to set multi-value parameters using a dynamic SQL expression. However, the report gives an error saying that multi-value parameters are not supported by the data extension. Any help on this would be highly appreciated. Thanks, Drishti

  • 2 kudos
6 More Replies
kcyugesh
by New Contributor
  • 63 Views
  • 1 replies
  • 1 kudos

Resolved! Delta live table not showing in workspace (Azure databricks with premium plan)

- I have a premium plan and owner level access 

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @kcyugesh, they changed the name from DLT to Lakeflow Declarative Pipelines, so you won't find the DLT name in the UI. Click Jobs & Pipelines and then ETL pipeline to access the declarative pipeline editor.

  • 1 kudos
DE5
by New Contributor
  • 41 Views
  • 1 replies
  • 1 kudos

Unable to see the Assistant suggested code and current code side by side

Hi, I'm unable to see the Assistant-suggested code and my current code side by side. Previously I was able to see my code and the Assistant-suggested code side by side, which helped me understand the changes. Please suggest if there is any way to do this. T...

Latest Reply
ManojkMohan
Honored Contributor
  • 1 kudos

@DE5 Some recent updates moved comparison features into the SQL Editor side panel or rely on “Cell Actions,” where you can generate code or format it and then see differences before applying changes. https://www.databricks.com/blog/introducing-new-sql-...

  • 1 kudos
Dhruv-22
by Contributor II
  • 335 Views
  • 7 replies
  • 6 kudos

Reading empty json file in serverless gives error

I ran a Databricks notebook to do incremental loads from files in the raw layer to bronze layer tables. Today, I encountered a case where the delta file was empty. I tried running it manually on the serverless compute and encountered an error. df = spark....

Latest Reply
K_Anudeep
Databricks Employee
  • 6 kudos

Hello @Dhruv-22, can you share the schema of the df? Do you have a _corrupt_record column in your dataframe? If yes, where are you getting it from, given that you said it's an empty file? By design, Spark blocks queries that only referen...
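One way to sidestep the empty-file case before it reaches the reader is to filter out zero-byte files up front. The following is only a minimal sketch, not something proposed in the thread: the helper name `non_empty_files` is hypothetical, and it assumes the raw files are reachable through local-style paths (for example a mounted volume path) so that `os.path` works on them.

```python
import os
import tempfile


def non_empty_files(paths):
    """Return only the paths that exist as files and contain at least one byte."""
    return [p for p in paths if os.path.isfile(p) and os.path.getsize(p) > 0]


# Small demonstration with two temporary files, one of them empty.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty.json")
    data_path = os.path.join(tmp, "data.json")
    open(empty_path, "w").close()          # zero-byte file
    with open(data_path, "w") as fh:
        fh.write('{"id": 1}\n')            # one JSON record
    readable = non_empty_files([empty_path, data_path])
    kept_names = [os.path.basename(p) for p in readable]
```

Only the paths in `readable` would then be handed to the JSON reader; if the list is empty, the incremental load for that run can be skipped.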

  • 6 kudos
6 More Replies
dbdev
by Contributor
  • 141 Views
  • 3 replies
  • 1 kudos

Lakehouse Federation - fetch size parameter for optimization

Hi, we use Lakehouse Federation to connect to a database. A performance recommendation is to use 'fetchSize': Lakehouse Federation performance recommendations - Azure Databricks | Microsoft Learn SELECT * FROM mySqlCatalog.schema.table WITH ('fetchSiz...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hello @dbdev , I did some digging and here are some suggestions. The `fetchSize` parameter in Lakehouse Federation is currently only available through SQL syntax using the `WITH` clause, as documented in the performance recommendations. Unfortunately...

  • 1 kudos
2 More Replies
databricksero
by New Contributor II
  • 111 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks Bundle Validation Error After CLI Upgrade (0.274.0 → 0.276.0)

After upgrading the Databricks CLI from version 0.274.0 to 0.276.0, bundle validation is failing with an error indicating that my configuration is formatted for "open-source Spark Declarative Pipelines" while the CLI now only supports "Lakeflow Decla...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @databricksero, it's a bug. I've checked, and the PR fixing it has already been merged to the main branch. Check the GitHub thread below, and once they build a new release, just update the Databricks CLI (they should release a version without the bug soon). Fix oss...

  • 3 kudos
1 More Replies
Y_WANG
by New Contributor
  • 67 Views
  • 1 replies
  • 0 kudos

Want to use DataFrame equality functions but also Numpy >= 2.0

In my team, we have a lot of data science workflows using Spark and Pandas. To ensure the stability of these workflows, we need to implement unit tests. Recently, I found the DataFrame equality test functions introduced in Spark 3.5, which se...

Latest Reply
ManojkMohan
Honored Contributor
  • 0 kudos

@Y_WANG The root cause of the AttributeError you face when importing assertDataFrameEqual from pyspark.testing in Spark 3.5 is Spark's code using the deprecated np.NaN attribute, which was removed in NumPy 2.0 (replaced by np.nan). This break...
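A common stopgap for this kind of removed alias is to re-create it before the failing import runs. The sketch below is an assumption on my part rather than the fix given in the reply: `restore_alias` is a hypothetical helper, and a stand-in module is used so the example runs without NumPy or PySpark installed.

```python
import math
import types


def restore_alias(module, removed_name, current_name):
    """Re-create `removed_name` on `module` as an alias of `current_name` if it is missing."""
    if not hasattr(module, removed_name):
        setattr(module, removed_name, getattr(module, current_name))
    return getattr(module, removed_name)


# Stand-in for a NumPy 2.x module: it has `nan` but no `NaN`.
fake_np = types.ModuleType("fake_np")
fake_np.nan = float("nan")

restore_alias(fake_np, "NaN", "nan")
alias_is_nan = math.isnan(fake_np.NaN)
```

With the real library, the equivalent call would be `restore_alias(numpy, "NaN", "nan")`, placed before `from pyspark.testing import assertDataFrameEqual`. Patching a third-party module this way is fragile, so moving to a PySpark release that supports NumPy 2.x is likely the cleaner route where it is available.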

  • 0 kudos
der
by Contributor II
  • 150 Views
  • 5 replies
  • 1 kudos

EXCEL_DATA_SOURCE_NOT_ENABLED Excel data source is not enabled in this cluster

I want to read an Excel xlsx file on DBR 17.3. The library dev.mauch:spark-excel_2.13:4.0.0_0.31.2 is installed on the cluster. The V1 implementation works fine: df = spark.read.format("dev.mauch.spark.excel").schema(schema).load(excel_file) display(df) V2...

Latest Reply
mmayorga
Databricks Employee
  • 1 kudos

Hi @der, first of all, thank you for your patience and for providing more information about your case. Use of ".format("excel")": I replicated your cluster config exactly in Azure. Without installing any library, I was able to run and load the xlsx fil...

  • 1 kudos
4 More Replies
erigaud
by Honored Contributor
  • 3001 Views
  • 10 replies
  • 8 kudos

Databricks asset bundles and Dashboards - pass parameters depending on bundle target

Hello everyone! Since Databricks Asset Bundles can now be used to deploy dashboards, I'm wondering how to pass parameters so that the queries for the dev dashboard query the dev catalog, the dashboard in stg queries the stg catalog, etc. Is there any...

Latest Reply
Coffee77
Contributor
  • 8 kudos

Here is what I did as a workaround. It works fairly well, but you'll need to duplicate the dashboard JSON code per environment and then replace the catalog names. It is not a perfect solution, but it is the only way I could find to include these deployments in my Databric...
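The duplicate-and-replace step described above could be scripted rather than done by hand. This is a rough sketch under stated assumptions: `retarget_catalog` is a hypothetical helper, the catalog names `dev_catalog` and `stg_catalog` are made up, and the tiny JSON document only imitates a dashboard definition, whose real structure may differ.

```python
import json
import re


def retarget_catalog(dashboard_json, source_catalog, target_catalog):
    """Replace every `source_catalog.` prefix in the JSON text with `target_catalog.`."""
    pattern = re.compile(r"\b" + re.escape(source_catalog) + r"\.")
    # A function replacement avoids any backslash interpretation in the target name.
    return pattern.sub(lambda _match: target_catalog + ".", dashboard_json)


# Hypothetical dev dashboard with a single dataset query.
dev_dashboard = json.dumps(
    {"datasets": [{"name": "sales", "query": "SELECT * FROM dev_catalog.sales.orders"}]}
)

stg_dashboard = retarget_catalog(dev_dashboard, "dev_catalog", "stg_catalog")
stg_query = json.loads(stg_dashboard)["datasets"][0]["query"]
```

Running this once per target before deployment would produce one JSON file per environment from a single source file. It is plain text substitution, so it only works if the catalog name does not also appear in places that must stay unchanged.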

  • 8 kudos
9 More Replies
bidek56
by Contributor
  • 137 Views
  • 3 replies
  • 0 kudos

Location of spark.scheduler.allocation.file

In DBR 16.4 LTS, I am trying to add the following Spark config: spark.scheduler.allocation.file: file:/Workspace/init/fairscheduler.xml But the all-purpose cluster is throwing this error: Spark error: Driver down cause: com.databricks.backend.daemon.dri...
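For readers unfamiliar with the file this setting points at, the sketch below generates a minimal fair scheduler allocation file. It does not address the driver error in the post; it only illustrates the XML layout Spark expects (`allocations` containing `pool` elements with `schedulingMode`, `weight`, and `minShare`). The helper name and the pool name `etl` are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_fair_scheduler_xml(pools):
    """Build the XML text of a Spark fair scheduler allocation file from a dict of pools."""
    root = ET.Element("allocations")
    for name, settings in pools.items():
        pool = ET.SubElement(root, "pool", name=name)
        for key in ("schedulingMode", "weight", "minShare"):
            ET.SubElement(pool, key).text = str(settings[key])
    return ET.tostring(root, encoding="unicode")


# One illustrative pool; the values are placeholders, not recommendations.
xml_text = build_fair_scheduler_xml(
    {"etl": {"schedulingMode": "FAIR", "weight": 2, "minShare": 1}}
)
parsed = ET.fromstring(xml_text)
```

The resulting text would be written to wherever `spark.scheduler.allocation.file` points; which locations the driver can actually read at startup is the open question in the thread.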

Latest Reply
bidek56
Contributor
  • 0 kudos

@mark_ott Setting WSFS_ENABLE=false does not have any effect. Thx

  • 0 kudos
2 More Replies
LBISWAS
by New Contributor
  • 51 Views
  • 1 replies
  • 0 kudos

Search result shows presence of a text in notebook, but it's not present in notebook

Search result shows presence of a text in notebook, but it's not present in notebook

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Ah yes, a classic. The search also looks into hidden/collapsed content that is not visible, e.g. results or metadata.

  • 0 kudos
02CSE33
by New Contributor
  • 105 Views
  • 2 replies
  • 0 kudos

Migrating SQL Server Tables and Views to Databricks using Lakebridge

We have a requirement to migrate a few hundred tables from SQL Server to Databricks Delta tables. We intend to explore Lakebridge's capabilities in a PoC for this. We also want to migrate a few historic records, say las...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Migrating several hundred SQL Server tables to Databricks Delta Lake, using Lakebridge for a Proof of Concept (PoC), can be approached with custom pipelines—especially for filtering by a date/time column to migrate only the last two years of data. Of...
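The date filter mentioned above can be sketched as a small query builder for a custom pipeline. This is not a Lakebridge API; it only produces a plain SQL string. The function names, the table `dbo.orders`, and the column `order_date` are hypothetical placeholders.

```python
from datetime import date


def two_year_cutoff(today):
    """Return the date two years before `today`, falling back to Feb 28 for leap days."""
    try:
        return today.replace(year=today.year - 2)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year.
        return today.replace(year=today.year - 2, day=28)


def build_extract_query(table, date_column, today):
    """Build a SELECT that keeps only rows from the last two years."""
    cutoff = two_year_cutoff(today)
    return f"SELECT * FROM {table} WHERE {date_column} >= '{cutoff.isoformat()}'"


query = build_extract_query("dbo.orders", "order_date", date(2025, 11, 7))
```

A string like `query` could be passed to whatever reader the pipeline uses so that the filter runs on the SQL Server side. Table and column names should come from trusted configuration, since they are interpolated directly into the SQL text.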

  • 0 kudos
1 More Replies
gudurusreddy99
by New Contributor II
  • 82 Views
  • 1 replies
  • 1 kudos

DLT or DP: How to do full refresh of Delta table from DLT Pipeline to consider all records from Tbl

Requirement: I have a Kafka streaming pipeline that ingests Pixels data. For each incoming record, I need to validate the Pixels key against an existing Delta table (pixel_tracking_data), which contains over 2 billion records accumulated over the past ...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

Matching streaming data in real time against a massive, fast-changing Delta table requires careful architectural choices. In your case, latency is high for the most recent records, and the solution only matches against data ≥10 minutes old. This is a...

  • 1 kudos
der
by Contributor II
  • 476 Views
  • 10 replies
  • 0 kudos

Rasterio on shared/standard cluster has no access to proj.db

We are trying to use rasterio on a Databricks shared/standard cluster with DBR 17.1. Rasterio is installed directly on the cluster as a library. Code: import rasterio rasterio.show_versions() Output: rasterio info: rasterio: 1.4.3 GDAL: 3.9.3 PROJ: 9.4.1 GEOS: 3.11...

Latest Reply
der
Contributor II
  • 0 kudos

Current workaround: If you select the "Photon" engine on a Standard/Shared cluster, the access rights of /databricks/native/proj-data are changed and rasterio works fine. The downside: you pay for "Photon" compute to use a Python library, which does not use S...
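To confirm whether access rights are the problem on a given cluster, a quick readability check on the PROJ data directory can help. The sketch below is a generic diagnostic, not part of the thread: `proj_db_readable` is a hypothetical helper, and a temporary directory stands in for the real path so the example runs anywhere.

```python
import os
import tempfile


def proj_db_readable(proj_dir):
    """Return True if `proj_dir` contains a proj.db file the current user can read."""
    db_path = os.path.join(proj_dir, "proj.db")
    return os.path.isfile(db_path) and os.access(db_path, os.R_OK)


# Demonstration against a temporary directory instead of the cluster path.
with tempfile.TemporaryDirectory() as tmp:
    before = proj_db_readable(tmp)                    # no proj.db yet
    open(os.path.join(tmp, "proj.db"), "w").close()   # create an empty placeholder
    after = proj_db_readable(tmp)
```

On the cluster, the call would be `proj_db_readable("/databricks/native/proj-data")`, run once with and once without Photon enabled to compare the results.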

  • 0 kudos
9 More Replies
