Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ashok_Vengala
by New Contributor
  • 84 Views
  • 1 reply
  • 0 kudos

Unable to Add Multiple Columns in Single ALTER TABLE Statement on Iceberg Table via Unity REST Catalog

Hello Databricks Team, I have implemented code to integrate the Iceberg Unity REST Catalog with the Teradata OTF engine and successfully performed read and write operations, following the documentation at https://docs.databricks.com/aws/en/external-ac...

Latest Reply
nayan_wylde
Honored Contributor III
  • 0 kudos

This error stems from the Iceberg table metadata update constraints enforced by the Unity Catalog's REST API. Specifically, the Iceberg REST Catalog currently does not support multiple schema changes in a single commit. Each ALTER TABLE operation tha...
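As an illustration of that constraint, a minimal sketch of the one-commit-per-change workaround, assuming a hypothetical table main.demo.events and a Databricks notebook where spark is defined:

# Hypothetical sketch: issue one ALTER TABLE per column so each schema
# change maps to its own Iceberg commit.
new_columns = {"col_a": "STRING", "col_b": "INT"}  # assumed names/types
for name, dtype in new_columns.items():
    spark.sql(f"ALTER TABLE main.demo.events ADD COLUMN {name} {dtype}")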

TejeshS
by Contributor
  • 3300 Views
  • 3 replies
  • 1 kudos

How to identify which columns we need to consider for liquid clustering from a table of 200+ columns

In Databricks, when working with a table that has a large number of columns (e.g., 200), it can be challenging to determine which columns are most important for liquid clustering. Objective: The goal is to determine which columns to select based on th...
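For reference, once candidate columns are chosen, enabling liquid clustering is a single statement; a sketch assuming a hypothetical table main.demo.wide_table and two frequently filtered columns:

# Hypothetical sketch: cluster on the columns most often used in joins/filters.
spark.sql("ALTER TABLE main.demo.wide_table CLUSTER BY (customer_id, event_date)")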

Latest Reply
noorbasha534
Valued Contributor II
  • 1 kudos

@Alberto_Umana is it possible to get, from a system table, the columns used in the joins & filters of a table being queried?
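One hedged way to approximate this, assuming access to the system.query.history table (availability and columns worth verifying on your workspace) and a hypothetical table name, is to pull recent statements that reference the table and inspect their joins and filters:

# Hypothetical sketch: surface recent SQL touching the table, then eyeball
# (or parse) the WHERE/JOIN clauses for candidate clustering columns.
df = spark.sql("""
    SELECT statement_text
    FROM system.query.history
    WHERE statement_text ILIKE '%my_wide_table%'
      AND start_time >= current_date() - INTERVAL 30 DAYS
""")
df.show(truncate=False)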

2 More Replies
Alby091
by New Contributor
  • 1298 Views
  • 2 replies
  • 0 kudos

Multiple schedules in workflow with different parameters

I have a notebook that takes a file from the landing zone, processes it, and saves a delta table. This notebook takes a parameter (time_prm) that lets it handle the different versions of files that arrive every day. Specifically, for eac...

Data Engineering
parameters
Workflows
Latest Reply
ImranA
Contributor
  • 0 kudos

You can do multiple schedules with a Quartz cron expression if you are using one in a Databricks Asset Bundle YAML, but the limitation is that you can't have one run at 0 minutes past the hour and another at 25 past, i.e.: quartz_cron_expression: 0 45 9,23 ...
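For example, quartz_cron_expression: 0 45 9,23 * * ? fires at 09:45 and 23:45 every day, since comma lists are allowed within a single Quartz field; but two times with different minute offsets, such as 09:00 and 23:25, cannot be expressed in one expression and would need two jobs.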

1 More Replies
Spenyo
by New Contributor II
  • 1615 Views
  • 1 reply
  • 1 kudos

Delta table size not shrinking after Vacuum

Hi team. Every day we overwrite the last X months of data in our tables, so each day generates a large amount of history. We don't use time travel, so we don't need it. What we did: SET spark.databricks.delta.retentionDurationCheck.enabled = false ALT...
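For anyone hitting the same thing, a minimal sketch of the usual pattern, assuming a hypothetical table main.demo.big_table and that 7 days of history is acceptable (shorter retention risks breaking concurrent readers and time travel):

# Hypothetical sketch: shorten retention, then VACUUM to drop stale files.
spark.sql("SET spark.databricks.delta.retentionDurationCheck.enabled = false")
spark.sql("""
    ALTER TABLE main.demo.big_table SET TBLPROPERTIES (
        'delta.logRetentionDuration' = '7 days',
        'delta.deletedFileRetentionDuration' = '7 days')
""")
spark.sql("VACUUM main.demo.big_table RETAIN 168 HOURS")  # 168 h = 7 days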

Latest Reply
pabloaschieri
New Contributor
  • 1 kudos

Hi, any update on this? Thanks

vamsi_simbus
by New Contributor III
  • 183 Views
  • 2 replies
  • 1 kudos

Migrating Talend ETL Jobs to Databricks – Best Practices & Challenges

Hi All, I'm currently working on a Proof of Concept (POC) to migrate existing Talend ETL jobs to Databricks. The goal is to leverage Databricks for data processing and orchestration while moving away from Talend. I'd appreciate insights on the followin...

Data Engineering
migration
Talend
Latest Reply
vamsi_simbus
New Contributor III
  • 1 kudos

@AbhaySingh Thank you for your insights.

1 More Replies
fjrodriguez
by New Contributor III
  • 423 Views
  • 2 replies
  • 1 kudos

Resolved! Ingestion Framework

I would like to update my ingestion framework, which is orchestrated by ADF, running a couple of Databricks notebooks and copying the data to the DB afterwards. I want to rely entirely on Databricks; I thought this could be the design: Step 1. Expose target t...

Latest Reply
fjrodriguez
New Contributor III
  • 1 kudos

Hey @saurabh18cs, it is taking longer than expected to expose Azure SQL tables in UC. I can do that through a Foreign Catalog, but this is not what I want because it is read-only. As far as I can see, external connections are for cloud object storage paths (ADLS...
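For context, the read-only Foreign Catalog mentioned here comes from Lakehouse Federation; a sketch of how one is created, assuming hypothetical connection/catalog names, an Azure SQL host, and credentials stored in a secret scope:

# Hypothetical sketch: federate an Azure SQL database into UC (read-only).
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS azsql_conn TYPE sqlserver
    OPTIONS (host 'myserver.database.windows.net', port '1433',
             user secret('my_scope', 'sql_user'),
             password secret('my_scope', 'sql_pwd'))
""")
spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS azsql_catalog
    USING CONNECTION azsql_conn OPTIONS (database 'mydb')
""")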

1 More Replies
saicharandeepb
by New Contributor III
  • 164 Views
  • 4 replies
  • 1 kudos

How to Retrieve DBU Count per Compute Type for Accurate Cost Calculation?

Hello Everyone, We are currently working on a cost analysis initiative to gain deeper insights into our Databricks usage. As part of this effort, we are trying to calculate the hourly cost of each Databricks compute instance by utilizing the Azure Ret...

Latest Reply
saicharandeepb
New Contributor III
  • 1 kudos

Hi everyone, just to clarify my question: I'm looking for the DBU count per compute type (per instance type), not the total DBU consumption per workload. In other words, I want to know the fixed DBU rate assigned to each compute SKU (for example, DS3...
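One hedged way to approximate this from system tables, assuming usage_metadata.node_type is populated on your account (worth verifying, since the fixed per-SKU rate itself isn't published as a system table), is to aggregate DBUs per SKU and node type, then divide by the corresponding node-hours separately:

# Hypothetical sketch: total DBUs per SKU/node type; divide by node-hours
# afterwards to approximate an effective DBU rate per instance type.
df = spark.sql("""
    SELECT sku_name,
           usage_metadata.node_type AS node_type,
           SUM(usage_quantity) AS total_dbus
    FROM system.billing.usage
    WHERE usage_unit = 'DBU'
    GROUP BY sku_name, usage_metadata.node_type
""")
df.show(truncate=False)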

3 More Replies
Jpeterson
by New Contributor III
  • 5686 Views
  • 8 replies
  • 4 kudos

Databricks SQL Warehouse, Tableau and spark.driver.maxResultSize error

I'm attempting to create a Tableau extract on Tableau Server with a connection to a large Databricks SQL warehouse. The extract process fails due to a spark.driver.maxResultSize error. Using a Databricks interactive cluster in the Data Science & Engineer...

Latest Reply
Oliverarson
New Contributor
  • 4 kudos

It sounds like you're running into quite a frustrating issue with Databricks and Tableau! Adjusting the spark.driver.maxResultSize is a good idea, but if you're still facing challenges, consider streamlining your data selections or aggregating your r...
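If the extract can run on an interactive cluster you control, note that spark.driver.maxResultSize is a driver JVM setting and has to be raised in the cluster's Spark config before startup rather than at runtime; a sketch of the config entry, with the value an assumption to tune:

spark.driver.maxResultSize 8g

SQL warehouses generally don't expose this setting, so reducing the result size (aggregating or filtering before the extract) is the main lever there.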

7 More Replies
Rjdudley
by Honored Contributor
  • 318 Views
  • 3 replies
  • 0 kudos

Resolved! AUTO CDC API and sequence column

The docs for the AUTO CDC API state: "You must specify a column in the source data on which to sequence records, which Lakeflow Declarative Pipelines interprets as a monotonically increasing representation of the proper ordering of the source data." Can this ...
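For readers landing here, a minimal sketch of the API being discussed, assuming the current Python dlt module in which create_auto_cdc_flow supersedes apply_changes, plus hypothetical table and column names; a struct can serve as the sequencing expression when no single column is monotonic:

import dlt
from pyspark.sql.functions import struct

dlt.create_streaming_table("target_table")

# Hypothetical sketch: sequence_by orders CDC events; a struct breaks ties
# across two columns when one alone is not strictly increasing.
dlt.create_auto_cdc_flow(
    target="target_table",
    source="source_cdc_feed",
    keys=["id"],
    sequence_by=struct("commit_ts", "commit_seq"),
    stored_as_scd_type=1,
)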

Latest Reply
Rjdudley
Honored Contributor
  • 0 kudos

Thanks Szymon, I'm familiar with the PostgreSQL implementation and was hoping Databricks would behave the same.

2 More Replies
ankit001mittal
by New Contributor III
  • 1881 Views
  • 1 reply
  • 2 kudos

DLT schema evolution/changes in the logs

Hi all, I want to figure out when schema evolution/changes happen to objects in DLT pipelines by looking through the DLT logs. Could you please share some sample DLT log entries that show schema changes? Thank you for your help.

Latest Reply
mark_ott
Databricks Employee
  • 2 kudos

To find when schema evolution or changes are happening in objects within DLT (Delta Live Tables) pipelines, you need to monitor certain entries within the DLT event logs or Delta transaction logs that signal modifications to the underlying schema of a table...
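As a concrete starting point, a hedged sketch that queries the pipeline's event log via the event_log table-valued function, assuming a UC-managed pipeline table and that schema details surface under the flow_definition event type:

# Hypothetical sketch: list events carrying a flow/schema definition.
df = spark.sql("""
    SELECT timestamp, details
    FROM event_log(TABLE(main.demo.my_dlt_table))
    WHERE event_type = 'flow_definition'
    ORDER BY timestamp DESC
""")
df.show(truncate=False)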

minhhung0507
by Valued Contributor
  • 2543 Views
  • 3 replies
  • 0 kudos

DLT Flow Failed Due to Missing Flow Checkpoints Directory When Using Unity Catalog

I'm encountering an issue while running a Delta Live Tables (DLT) pipeline that is managed using Unity Catalog on Databricks. The pipeline has failed and is not restarting, showing the following error: java.lang.IllegalArgumentException: flow checkpoi...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The best practices for setting up checkpointing in Delta Live Tables (DLT) pipelines when using Unity Catalog are largely centered on leveraging Databricks' managed services, adhering to Unity Catalog's table management conventions, and minimizing th...

2 More Replies
sumitkumar_284
by New Contributor II
  • 353 Views
  • 4 replies
  • 1 kudos

Not able to refresh Power BI dashboard from Databricks jobs

I am trying to refresh a Power BI dashboard using Databricks jobs and constantly get this error, even though I am providing the optional parameters, which include catalog and database. Also of note: I am able to refresh in the Power BI UI using both...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sumitkumar_284, can you provide more details? Are you using Unity Catalog? Which authentication mechanism do you have? In which version of Power BI Desktop did you develop your semantic model/dashboard? Do you meet all the requirements below? Publish...

3 More Replies
hf-databricks
by New Contributor
  • 84 Views
  • 1 reply
  • 0 kudos

Unable to create workspace

Hi Team, we are having trouble creating a workspace in our Databricks account, which was created on top of AWS. Below are the details:
Databricks account name: saichaitanya.vaddadhi@healthfirsttech.com's Lakehouse
AWS Account id: 720016114009
Databricks id: 1ee8765f-b472-4...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor II
  • 0 kudos

@hf-databricks there's a quickstart guide for creating a workspace with AWS: https://docs.databricks.com/aws/en/admin/workspace/quick-start There's a list of requirements. There are more options for creating workspaces; above, I just listed the recommen...

der
by Contributor
  • 201 Views
  • 8 replies
  • 0 kudos

Rasterio on shared/standard cluster has no access to proj.db

We are trying to use rasterio on a Databricks shared/standard cluster with DBR 17.1. Rasterio is installed directly on the cluster as a library. Code: import rasterio; rasterio.show_versions() Output: rasterio: 1.4.3, GDAL: 3.9.3, PROJ: 9.4.1, GEOS: 3.11...

Latest Reply
Chiran-Gajula
New Contributor
  • 0 kudos

Hi @der, can you try adding this to your test script?
import os
os.environ["PROJ_LIB"] = "/databricks/native/proj-data"
Hope users have access to the path /databricks/native/proj-data.
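A slightly fuller version of that suggestion, with a quick check that PROJ can actually resolve a CRS afterwards (the path is the one proposed above and may differ per DBR image):

import os
os.environ["PROJ_LIB"] = "/databricks/native/proj-data"  # assumed path; verify it exists

import rasterio  # import after setting PROJ_LIB so PROJ picks it up
from rasterio.crs import CRS

print(CRS.from_epsg(4326))  # raises if proj.db still cannot be found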

7 More Replies
maninegi05
by New Contributor II
  • 298 Views
  • 3 replies
  • 1 kudos

DLT Pipeline Stopped working

Hello, our DLT pipelines suddenly started failing with: LookupError: Traceback (most recent call last): result_df = result_df.withColumn("input_file_path", col("_metadata.file_path")).withColumn( ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @maninegi05 , I did some digging internally and I believe some recent changes to the DLT image may be to blame. We are aware of a regression issue and are actively working to address it. TL;DR: why you might see "LookupError: ContextVar 'par...

2 More Replies
