Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

souravroy1990
by New Contributor
  • 21 Views
  • 1 reply
  • 0 kudos

Error in Column level tags creation in views via SQL

Hi, I'm trying to run this query using SQL on a DBR 17.3 cluster, but I get a syntax error: ALTER VIEW catalog.schema.view ALTER COLUMN column_name SET TAGS (`METADATA` = `xyz`); But the query below works: SET TAG ON COLUMN catalog.schema.view.column_n...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @souravroy1990, this syntax is just unsupported. Using ALTER VIEW you can set a tag for an entire view. If you need to set a tag on a column, you've already found the proper way, which is to use the SET TAG command: SET TAG - Azure Databricks - Databricks SQL |...
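
To make the reply concrete, here is a minimal sketch of both commands run from a notebook; the catalog, schema, view, and column names are placeholders, and the exact SET TAG shape follows the docs linked above rather than anything confirmed in this thread.

# View-level tag: supported via ALTER VIEW ... SET TAGS (tag keys/values are
# string literals, not backticked identifiers as in the failing query).
spark.sql("ALTER VIEW catalog.schema.view SET TAGS ('METADATA' = 'xyz')")

# Column-level tag on a view: use the separate SET TAG command instead,
# echoing the working query from the question.
spark.sql("SET TAG ON COLUMN catalog.schema.view.column_name `METADATA` = 'xyz'")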

fintech_latency
by New Contributor
  • 208 Views
  • 9 replies
  • 2 kudos

How to guarantee “always-warm” serverless compute for low-latency Jobs workloads?

We’re building a low-latency processing pipeline on Databricks and are running into serverless cold-start constraints. We ingest events (calls) continuously via a Spark Structured Streaming listener. For each event, we trigger a serverless compute tha...

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

@fintech_latency  For streaming: refactor to one long‑running Structured Streaming job with a short trigger interval (for example, 1s) and move “assignment” logic into foreachBatch or a transactional task table updated within the micro‑batch. For per...
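
As a rough illustration of that suggestion, a single always-on stream with a short trigger might look like the sketch below; the table names and the body of process_batch are hypothetical, not from the thread.

from pyspark.sql import DataFrame

def process_batch(batch_df: DataFrame, batch_id: int) -> None:
    # "Assignment" logic runs inside the micro-batch instead of launching a
    # serverless job per event; here it just appends to a task table.
    batch_df.write.mode("append").saveAsTable("catalog.schema.task_queue")

(spark.readStream
    .table("catalog.schema.events")          # continuous event source
    .writeStream
    .foreachBatch(process_batch)
    .trigger(processingTime="1 second")      # short trigger interval, per the reply
    .option("checkpointLocation", "/Volumes/cat/sch/chk/events")
    .start())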

8 More Replies
a_user12
by Contributor
  • 192 Views
  • 1 reply
  • 1 kudos

Resolved! Unity Catalog Schema management

From time to time I read articles, such as here, which suggest using a Unity Catalog schema management tool. All table schema changes should be applied via this tool. Usually SPs (or users) have the "Modify" permission on tables. This allows them t...

Latest Reply
MoJaMa
Databricks Employee
  • 1 kudos

I tend to mostly agree with you. Trying to do table-schema management like I would have done while developing ETL flows in an RDBMS world is quite different from trying to do this in a fast-moving "new-sources-all-the-time" data engineering world.  T...

RIDBX
by Contributor
  • 89 Views
  • 1 reply
  • 1 kudos

Replicating DBX Demo set in Databricks FREE tier?

Thanks for reviewing my threads. I'd like to replicate/port the Databricks Demo artifacts/set into my personal Databricks FREE tier. I am getting some ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hello @RIDBX, I’m not surprised that some of the actions you’re seeing through Vocareum don’t work in the Free Edition. When our training is developed, parts of it are intentionally tied to Vocareum APIs to support specific tasks and workflows. Beca...

sebih
by New Contributor II
  • 105 Views
  • 1 reply
  • 1 kudos

Unable to apply liquid clustering to a materialized view

Hi everyone, I am trying to create a materialized view with liquid clustering using the code in the attached screenshot. However, I realized that the query performance is slower than that of a streaming table with the same data, liquid clustering, and structure. It appear...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sebih, automatic liquid clustering might not select keys for the following reasons:
- The table is too small to benefit from liquid clustering.
- You can apply automatic liquid clustering for all Unity Catalog managed tables, regardless of data an...
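
For reference, enabling automatic key selection on a Unity Catalog managed table is a one-liner; the table name below is a placeholder, and whether keys are actually chosen depends on the conditions listed above.

# Minimal sketch: opt a managed table into automatic liquid clustering.
spark.sql("ALTER TABLE catalog.schema.my_table CLUSTER BY AUTO")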

Dhruv-22
by Contributor II
  • 228 Views
  • 3 replies
  • 0 kudos

Merge with schema evolution fails because of upper case columns

The following is a minimal reproducible example of what I'm facing right now.

%sql
CREATE OR REPLACE TABLE edw_nprd_aen.bronze.test_table ( id INT );
INSERT INTO edw_nprd_aen.bronze.test_table VALUES (1);
SELECT * FROM edw_nprd_aen.bronze.test_tab...

Latest Reply
css-1029
New Contributor II
  • 0 kudos

Hi @Dhruv-22, it's actually not a bug. Let me explain what's happening.

The Root Cause

The issue stems from how schema evolution works with Delta Lake's MERGE statement, combined with Spark SQL's case-insensitivity settings. Here's the key insight: spark...
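
A sketch of one workaround consistent with that explanation, on a recent DBR that supports MERGE WITH SCHEMA EVOLUTION: normalize the source column names to the target's lower-case spelling before the merge, so schema evolution does not treat a differently-cased column as new. The source table name here is hypothetical.

src = spark.table("edw_nprd_aen.bronze.source_table")   # hypothetical source
src = src.toDF(*[c.lower() for c in src.columns])       # e.g. "NAME" -> "name"
src.createOrReplaceTempView("src")

spark.sql("""
    MERGE WITH SCHEMA EVOLUTION INTO edw_nprd_aen.bronze.test_table AS t
    USING src AS s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")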

2 More Replies
JothyGanesan
by New Contributor III
  • 113 Views
  • 1 reply
  • 0 kudos

DLT Continuous Pipeline load

Hi all, in our project we are working on a DLT pipeline with DLT tables as the target, running in continuous mode. These tables are common to multiple countries, and we go live in batches for different countries. So, every time a new change is request...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@JothyGanesan Use dynamic schema handling and selective table updates to apply metadata changes incrementally from the current watermark, preserving history across country go-lives. Replace static @dlt.table definitions with Auto Loader's schema infer...
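
A minimal sketch of that Auto Loader pattern (paths, file format, and table names are placeholders): schema inference plus addNewColumns lets new country-specific columns arrive without hand-editing the table definition.

import dlt

@dlt.table(name="events_bronze")
def events_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/Volumes/cat/sch/schemas/events")
        .option("cloudFiles.schemaEvolutionMode", "addNewColumns")  # evolve in place
        .load("/Volumes/cat/sch/landing/events")
    )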

Malthe
by Contributor III
  • 163 Views
  • 1 reply
  • 1 kudos

Resolved! Unable to update DLT-based materialized view if clustering key is missing

If we set up a materialized view with a clustering key, and then update the definition such that this key is no longer part of the table, Databricks complains: Run ALTER TABLE ... CLUSTER BY ... to repair Delta clustering metadata. But this is not poss...

Latest Reply
K_Anudeep
Databricks Employee
  • 1 kudos

Hello @Malthe, currently there is no supported way to repair broken clustering metadata in Delta materialised views if you remove the clustering key from the definition, other than dropping and recreating the materialised view. Additionally, a full...
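
In practice that looks like the sketch below for a SQL-defined materialized view (names are placeholders; run it somewhere that supports materialized views, such as a SQL warehouse). For a pipeline-defined view, the equivalent is removing it from the pipeline and letting a full refresh recreate it.

spark.sql("DROP MATERIALIZED VIEW IF EXISTS catalog.schema.my_mv")
spark.sql("""
    CREATE MATERIALIZED VIEW catalog.schema.my_mv
    AS SELECT * FROM catalog.schema.base_table
""")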

tonkol
by New Contributor II
  • 161 Views
  • 1 reply
  • 0 kudos

Migrate on-premise delta tables to Databricks (Azure)

Hi there, I have the situation that we've decided to migrate our on-premise Delta Lake to Azure Databricks. Because of networking I can only "push" the data from on-prem to the cloud. What would be the best way to replicate all tables: schema+partitioning i...

Latest Reply
mukul1409
New Contributor III
  • 0 kudos

The correct solution is not SQL-based. Delta tables are defined by the contents of the delta log directory, not by CREATE TABLE statements. That is why SHOW CREATE TABLE cannot reconstruct partitions, properties or constraints. The only reliable migrat...
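
A sketch of the registration step, assuming the table directory (Parquet data files plus the _delta_log/ folder) has already been pushed to ADLS, for example with azcopy; the path and table name are placeholders. The Delta log then supplies the schema, partitioning, properties and constraints.

spark.sql("""
    CREATE TABLE catalog.schema.my_table
    USING DELTA
    LOCATION 'abfss://container@account.dfs.core.windows.net/tables/my_table'
""")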

hnnhhnnh
by New Contributor II
  • 248 Views
  • 1 reply
  • 0 kudos

How to handle type widening (int→bigint) in DLT streaming tables without dropping the table

Setup: Bronze source table (external to DLT, CDF & type widening enabled):
# Source table properties:
# delta.enableChangeDataFeed: "true"
# delta.enableDeletionVectors: "true"
# delta.enableTypeWidening: "true"
# delta.minReaderVersion: "3"
# delta.minWrite...

Latest Reply
mukul1409
New Contributor III
  • 0 kudos

Hi @hnnhhnnh, DLT streaming tables that use apply changes do not support widening the data type of key columns, such as changing an integer to a bigint, after the table is created. Even though Delta and Unity Catalog support type widening in general, DL...
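
One way to end up with the wide type, consistent with that reply, is to cast the key in the source view so the target is created as BIGINT from the start; this sketch assumes a CDF-enabled bronze table and hypothetical names, and implies recreating or fully refreshing the target.

import dlt
from pyspark.sql import functions as F

@dlt.view(name="bronze_changes")
def bronze_changes():
    return (
        spark.readStream.option("readChangeFeed", "true")
        .table("catalog.schema.bronze")
        .withColumn("id", F.col("id").cast("bigint"))  # widen before apply changes
    )

dlt.create_streaming_table("silver")
dlt.apply_changes(
    target="silver",
    source="bronze_changes",
    keys=["id"],
    sequence_by="_commit_version",  # CDF ordering column
)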

JothyGanesan
by New Contributor III
  • 366 Views
  • 2 replies
  • 4 kudos

Resolved! Vacuum on DLT

We are currently using DLT tables as our target tables. The tables are getting loaded by continuous job pipelines. Liquid clustering is enabled on the tables. Will VACUUM work on these tables while they are being loaded in continuous mode? How to run t...

Latest Reply
iyashk-DB
Databricks Employee
  • 4 kudos

VACUUM works fine on DLT tables running in continuous mode. DLT does automatic maintenance (OPTIMIZE + VACUUM) roughly every 24 hours if the pipeline has a maintenance cluster configured. Q: The liquid cluster is enabled in the tables. Will Vacuum wo...
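
For an ad-hoc run outside the automatic maintenance window, a manual VACUUM on the target looks like the sketch below; the table name and retention window are placeholders, and the retention should stay long enough for any time travel or streaming readers you rely on.

spark.sql("VACUUM catalog.schema.my_dlt_table RETAIN 168 HOURS")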

1 More Reply
ismaelhenzel
by Contributor III
  • 387 Views
  • 1 reply
  • 1 kudos

Resolved! Declarative Pipelines - Dynamic Overwrite

Regarding the limitations of declarative pipelines—specifically the inability to use replaceWhere—I discovered through testing that materialized views actually support dynamic overwrites. This handles several scenarios where replaceWhere would typica...

Latest Reply
osingh
Contributor
  • 1 kudos

This is a really interesting find, and honestly not something most people expect from materialized views. Under the hood, MVs in Databricks declarative pipelines are still Delta tables. So when you set partitionOverwriteMode=dynamic and partition by a...
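
A sketch of that write pattern (the staged DataFrame and table names are placeholders): with dynamic partition overwrite, only the partitions present in the incoming data are replaced, which covers many replaceWhere use cases.

df = spark.table("catalog.schema.staged_updates")  # hypothetical staged data

(df.write.format("delta")
    .mode("overwrite")
    .option("partitionOverwriteMode", "dynamic")   # replace only matching partitions
    .saveAsTable("catalog.schema.partitioned_target"))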

jpassaro
by New Contributor
  • 332 Views
  • 1 reply
  • 1 kudos

Does Databricks respect the parallel vacuum setting?

I am trying to run VACUUM on a Delta table that I know has millions of obsolete files. Out of the box, VACUUM runs the deletes in sequence on the driver. That is bad news for me! According to the OSS Delta docs, the setting spark.databricks.delta.vacuum.pa...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @jpassaro, thanks for laying out the context and the links. Let me clarify what’s actually happening here and how I’d recommend moving forward. Short answer: no. On Databricks Runtime, the spark.databricks.delta.vacuum.parallelDelete.enabl...
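
For context, this is the OSS Delta flag under discussion; per the reply above, it is not honored on Databricks Runtime, where VACUUM file deletion is managed by the platform.

# OSS Delta setting from the question; effectively a no-op on Databricks Runtime.
spark.conf.set("spark.databricks.delta.vacuum.parallelDelete.enabled", "true")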

GANAPATI_HEGDE
by New Contributor III
  • 348 Views
  • 3 replies
  • 0 kudos

Unable to configure custom compute for DLT pipeline

I am trying to configure compute for a pipeline as shown in the attached screenshots. However, DLT keeps using the small cluster as usual. How can I resolve this?

Latest Reply
GANAPATI_HEGDE
New Contributor III
  • 0 kudos

I updated my CLI and deployed the job, but I still don't see the cluster updates in the pipeline.

2 More Replies
singhanuj2803
by Contributor
  • 443 Views
  • 4 replies
  • 1 kudos

Troubleshooting Azure Databricks Cluster Pools & spot_bid_max_price Validation Error

Hope you’re doing well! I’m reaching out for some guidance on an issue I’ve encountered while setting up Azure Databricks Cluster Pools to reduce cluster spin-up and scale times for our jobs. Background: To optimize job execution wait times, I’ve create...

Latest Reply
Poorva21
New Contributor III
  • 1 kudos

Possible reasons:
1. Setting spot_bid_max_price = -1 is not accepted by Azure pools. Azure Databricks only accepts:
- 0 → on-demand only
- positive numbers → max spot price
-1 is allowed in cluster policies, but not inside pools, so validation never completes....
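
Following point 1, a pool payload along these lines should validate; the field names follow the Instance Pools API, and all values are placeholders rather than anything confirmed in this thread.

pool_config = {
    "instance_pool_name": "jobs-pool",
    "node_type_id": "Standard_D4ds_v5",
    "azure_attributes": {
        "availability": "SPOT_AZURE",
        "spot_bid_max_price": 0.5,  # a positive cap rather than -1, per the reply
    },
}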

3 More Replies