Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shrutikatyal
by New Contributor III
  • 4684 Views
  • 9 replies
  • 2 kudos

Resolved! commit time is coming as null in autoloader

As per the new Databricks Autoloader archival-and-move feature, I am trying to use it on Databricks 16.4.x-scala2.12, but the commit time is still coming back null, even though it's mentioned in the documen...

Latest Reply
TheOC
Contributor III
  • 2 kudos

Hey @shrutikatyal, I believe the only current route to get a discount voucher would be the following: https://community.databricks.com/t5/events/dais-2025-virtual-learning-festival-11-june-02-july-2025/ev-p/119323 I think it's the last day of the event ...

MinuN
by New Contributor
  • 2895 Views
  • 1 reply
  • 0 kudos

Handling Merged Heading Rows When Converting Excel to CSV in Databricks

Hi all, I'm working on a process in Databricks to convert multiple Excel files to CSV format. These Excel files follow a similar structure but with some variations. Here's the situation: each file contains two header rows. The first row contains merged ...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi MinuN, how are you doing today? That's a great question, and you're definitely on the right path using BeautifulSoup to extract the table structure from .xls HTML-like files. To generate the repeated first row of main headings for the CSV, one pract...
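A minimal sketch of the heading-repetition logic described in the reply, assuming the two header rows have already been extracted as Python lists, with blank strings where the first row's merged cells leave gaps (function and column names are illustrative):

```python
def fill_merged_headings(top_row, sub_row):
    """Forward-fill merged (blank) cells in the top heading row and
    pair each sub-heading with its main heading."""
    filled = []
    current = ""
    for cell in top_row:
        if cell.strip():          # a new main heading starts here
            current = cell.strip()
        filled.append(current)    # repeat the heading across merged cells
    # Combine into single CSV-friendly column names
    return [f"{main}_{sub}".strip("_") for main, sub in zip(filled, sub_row)]

# Example: "Sales" spans two merged columns, "Costs" spans two more
top = ["Sales", "", "Costs", ""]
sub = ["Q1", "Q2", "Q1", "Q2"]
print(fill_merged_headings(top, sub))
# ['Sales_Q1', 'Sales_Q2', 'Costs_Q1', 'Costs_Q2']
```

The flattened names can then be written as the single header row of the output CSV.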

sridharplv
by Valued Contributor II
  • 1728 Views
  • 1 reply
  • 1 kudos

Need help on "You cannot enable Iceberg reads on materialized views and streaming tables"

Hi all, since we "cannot enable Iceberg reads on materialized views and streaming tables", is there any option in private preview to enable Iceberg reads for materialized views and streaming tables? I tried using the option of the DLT Sink API with table c...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi sridharplv, how are you doing today? As per my understanding, Databricks does not support Iceberg reads for materialized views and streaming tables, and there's no official preview or timeline shared publicly for enabling this support. Your workaro...

ddundovic
by New Contributor III
  • 6537 Views
  • 2 replies
  • 1 kudos

Resolved! Lookup dashboard ID in bundle variables

Hi all, I have an asset bundle that contains the following dashboard_task:

resources:
  jobs:
    my_job:
      name: my_job_name
      tasks:
        - task_key: refresh_my_dashboard
          dashboard_task:
            dashboard_id: ${var.my_dashbo...
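For context, Databricks Asset Bundles support lookup variables that resolve a named resource (including a dashboard) to its ID at deploy time; a hedged sketch of how that could look, with the variable and dashboard names as illustrative assumptions:

```yaml
variables:
  my_dashboard_id:
    description: Resolved at deploy time from the dashboard's display name
    lookup:
      dashboard: "My_Dashboard_Name"

resources:
  jobs:
    my_job:
      name: my_job_name
      tasks:
        - task_key: refresh_my_dashboard
          dashboard_task:
            dashboard_id: ${var.my_dashboard_id}
```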

Latest Reply
ddundovic
New Contributor III
  • 1 kudos

Thanks! That does make sense. When I run `databricks lakeview list` I do get the dashboard I want: [ { "create_time": "2025-06-23T08:09:49.595Z", "dashboard_id": "id000000000000000000000", "display_name": "My_Dashboard_Name", "lifecy...

varni
by New Contributor III
  • 1373 Views
  • 1 reply
  • 0 kudos

Widget value not synchronized after detach/reattach

Hello Databricks Team, I hope you are doing well. I'm working with dbutils.widgets in a Databricks notebook using the Accessed Commands mode, and I have encountered some challenges. Specifically, after detaching and reattaching to the cluster: - the widg...

Latest Reply
Khaja_Zaffer
Contributor III
  • 0 kudos

Hello there, can you please share the code used for the widgets? Also, if you change the value manually, does it work? (Did it work before?) Are you trying to load via some parent notebook? Waiting for your response.

JameDavi_51481
by Contributor
  • 1624 Views
  • 1 reply
  • 0 kudos

making REORG TABLE to enable Iceberg Uniform more efficient and faster

I am upgrading a large number of tables for Iceberg / Uniform compatibility by running REORG TABLE <tablename> APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION=2)); and finding that some tables take several hours to upgrade, presumably because they are ...

Latest Reply
sridharplv
Valued Contributor II
  • 0 kudos

Hi @JameDavi_51481, hope you tried this approach for enabling Iceberg metadata along with the Delta format: ALTER TABLE internal_poc_iceberg.iceberg_poc.clickstream_gold_sink_dlt SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name', 'delta.enableIceberg...
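For reference, the documented table properties for enabling UniForm Iceberg reads on an existing Delta table are along these lines (a hedged sketch; the table name is taken from the reply above, and a REORG ... APPLY (UPGRADE UNIFORM ...) may still be needed to rewrite existing files):

```sql
ALTER TABLE internal_poc_iceberg.iceberg_poc.clickstream_gold_sink_dlt
SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```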

vsam
by New Contributor II
  • 2472 Views
  • 5 replies
  • 2 kudos

OPTIMIZE FULL Taking Longer Time on Clustered Table

Hi everyone, currently we are facing an issue with the OPTIMIZE table_name FULL operation. The dataset consists of 150 billion rows of data and it takes 8 hours to optimize the reloaded clustered table. The table is refreshed every month and it needs cluste...

Latest Reply
sridharplv
Valued Contributor II
  • 2 kudos

Hi @vsam, have you tried automatic liquid clustering with predictive optimization enabled? You don't need to specify CLUSTER BY columns explicitly, and the optimization is handled in the background by predictive optimization. http...
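The two pieces mentioned in the reply can be sketched as follows (catalog, schema, and table names are illustrative; automatic key selection requires predictive optimization to be enabled for the table's schema or catalog):

```sql
-- Let Databricks choose and maintain clustering keys automatically
ALTER TABLE my_catalog.my_schema.my_table CLUSTER BY AUTO;

-- Enable predictive optimization so OPTIMIZE runs are managed in the background
ALTER SCHEMA my_catalog.my_schema ENABLE PREDICTIVE OPTIMIZATION;
```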

Sneeze7432
by New Contributor III
  • 4190 Views
  • 13 replies
  • 2 kudos

File Trigger Not Triggering Multiple Runs

I have a job with one task, which is to run a notebook. The job run is set up with a File arrival trigger with my blob storage as the location. The trigger works and the job will start when a new file arrives in the location, but it does not run for ...

Latest Reply
nayan_wylde
Esteemed Contributor
  • 2 kudos

@Sneeze7432 you can also try editing the max concurrent runs in the workflow. 

glevin1
by New Contributor
  • 2499 Views
  • 1 reply
  • 0 kudos

API response code when running a new job

We are attempting to use the POST /api/2.2/jobs/run-now endpoint using OAuth 2.0 client credentials authentication. We are finding that when sending a request with an expired token, we receive an HTTP code of 400. This contradicts the documentation on ...

Latest Reply
Khaja_Zaffer
Contributor III
  • 0 kudos

Hello glevin, please raise a ticket using this link: https://help.databricks.com/s/contact-us?ReqType=training Please explain the issue clearly so that it will be easy for the support team to help.

carolregatt
by New Contributor II
  • 2814 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks Asset Bundle wrongfully deleting job

Hey, so I've just started to use DAB to automatically manage job configs via CI/CD. I had a previously existing job (let's say ID 123) which was created manually and had this config:

resources:
  jobs:
    My_Job_A:
      name: My Job A

And I wanted to automat...

Latest Reply
carolregatt
New Contributor II
  • 1 kudos

Thanks so much for the response @Advika! That makes sense! Can you explain why the remote config had a different key compared to the local one? I guess that was what threw me off and made me want to change the local key to match the remote.

Hoviedo
by New Contributor III
  • 1462 Views
  • 4 replies
  • 0 kudos

Apply expectations only if column exists

Hi, is there any way to apply an expectation only if that column exists? I am creating multiple DLT tables with the same Python function, so I would like to create different expectations based on the table name; currently I can only create expectations...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

To apply expectations only if a column exists in Delta Live Tables (DLT), you can use the @dlt.expect decorator conditionally within your Python function. Here is a step-by-step approach to achieve this: check if the column exists before applying th...
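The column-existence check itself can be sketched independently of the dlt decorators; a hedged example (the function, rule names, and constraints are illustrative) that filters a dict of candidate expectations down to those whose target column is present in the schema:

```python
def applicable_expectations(columns, expectations):
    """Keep only expectations whose target column exists in the schema.

    `expectations` maps expectation names to (column, constraint) pairs,
    e.g. {"valid_id": ("id", "id IS NOT NULL")}.
    """
    present = set(columns)
    return {
        name: constraint
        for name, (column, constraint) in expectations.items()
        if column in present
    }

# In a real DLT pipeline you would pass df.columns and then apply the
# surviving rules with dlt.expect_all(applicable_expectations(df.columns, rules)).
rules = {
    "valid_id": ("id", "id IS NOT NULL"),
    "valid_email": ("email", "email LIKE '%@%'"),
}
print(applicable_expectations(["id", "name"], rules))
# {'valid_id': 'id IS NOT NULL'}
```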

Arturo_Franco
by New Contributor II
  • 1064 Views
  • 2 replies
  • 0 kudos

I can't find Data Engineering Associate Resources

I'm taking the self-paced Databricks Data Engineering Associate course. Where can I find the link to the repo that is shared throughout the course?

Latest Reply
Arturo_Franco
New Contributor II
  • 0 kudos

Hi @Advika, thanks for the response. I'm currently using the partner Databricks account; don't I have access to the resources with this partner subscription?

eballinger
by Contributor
  • 2268 Views
  • 2 replies
  • 1 kudos

Resolved! Problem with SHOW GROUPS command

We have 3 environments (dev, qc and prod). In DEV and QC I can issue this command: SHOW GROUPS from a SQL notebook, and it will show the group I have created for each workspace. However, in production this group is not displayed. This group in all 3 cas...

Latest Reply
eballinger
Contributor
  • 1 kudos

Thanks SP_6721, that was exactly my issue. All good now. Have a good day.

joao_augusto
by New Contributor III
  • 1201 Views
  • 1 reply
  • 0 kudos

The warehouse fails to start

Hi, everyone! Does anyone know the reason for this problem? It says that I do not need to do anything, but if I don't restart the warehouse manually, it will not start. Is there a way to fix it? Or at least, to create monitoring for it? We have some jobs ...

unnamed.png
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @joao_augusto, maybe you're hitting some quotas on AWS EC2 instances? Could you check that? Regarding monitoring, you can try to use API calls and check the health status of your SQL warehouses: List warehouses | SQL Warehouses API | REST API reference | ...
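For the monitoring side, the JSON returned by the list endpoint (GET /api/2.0/sql/warehouses) includes a state and a health summary per warehouse; a hedged sketch of a check that flags warehouses needing attention (field names follow the public API reference, but verify them against your workspace's actual response):

```python
def unhealthy_warehouses(response):
    """Return (id, state) pairs for warehouses that are stopped or degraded.

    `response` is the parsed JSON body of GET /api/2.0/sql/warehouses.
    """
    flagged = []
    for wh in response.get("warehouses", []):
        state = wh.get("state", "UNKNOWN")
        health = wh.get("health", {}).get("status", "HEALTHY")
        if state not in ("RUNNING", "STARTING") or health != "HEALTHY":
            flagged.append((wh.get("id"), state))
    return flagged

# Example with a mocked response body (IDs are illustrative)
sample = {
    "warehouses": [
        {"id": "abc123", "state": "RUNNING", "health": {"status": "HEALTHY"}},
        {"id": "def456", "state": "STOPPED"},
    ]
}
print(unhealthy_warehouses(sample))
# [('def456', 'STOPPED')]
```

A scheduled job could run this against the API and alert (or auto-start the warehouse) when the list is non-empty.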

lizou1
by New Contributor III
  • 1972 Views
  • 5 replies
  • 1 kudos

serverless job compute error

My general question: does a serverless compute job automatically scale? The reason I tried a serverless job with the performance optimization disabled option is to make the job run effortless and cost-effective. I don't like to do any tuning on Spark at all. I di...

lizou1_0-1751579071982.png
Latest Reply
lizou1
New Contributor III
  • 1 kudos

I found a setting about 16 GB vs 32 GB, but that is part of the memory used by Spark: https://learn.microsoft.com/en-us/azure/databricks/compute/serverless/dependencies#high-memory If you run into out-of-memory errors in your notebook, you can configure the...

