Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

antr
by New Contributor II
  • 661 Views
  • 3 replies
  • 0 kudos

DLT full refresh and resets

When doing a full refresh in DLT, the tables seem to be in a reset/empty state until they're populated again. This can break downstream dependencies if they try to read the data during pipeline execution. How should such a case be handled properly?

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @antr! In DLT, a full refresh on a streaming table resets processing state and checkpoint data, potentially disrupting downstream processes that rely on it. To avoid this, use incremental updates (the default) or append mode instead of a full refresh...
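A related option worth noting: per the Databricks docs, a table can be opted out of resets entirely with the `pipelines.reset.allowed` table property, so a full refresh of the pipeline leaves it intact. A minimal Databricks SQL sketch (table name and source path are placeholders):

```sql
-- Sketch: protect one streaming table from being wiped by a full refresh
CREATE OR REFRESH STREAMING TABLE my_events
TBLPROPERTIES ('pipelines.reset.allowed' = 'false')
AS SELECT * FROM STREAM read_files('/Volumes/main/raw/events/');
```

Downstream consumers then keep seeing the last committed data while the rest of the pipeline is refreshed.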

2 More Replies
Shivap
by New Contributor III
  • 346 Views
  • 3 replies
  • 0 kudos

Need to extract data from Delta tables and move it to on-prem - what's the best approach?

I want to extract data from Databricks Delta tables and move it to an on-prem system. What's the best way to accomplish this?

Latest Reply
Stefan-Koch
Valued Contributor II
  • 0 kudos

An easy way to do this is to use Airbyte. You can run Airbyte locally, connect to Databricks, and copy the data to your on-prem location: https://docs.airbyte.com/integrations/destinations/databricks
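For smaller tables, a lighter-weight alternative is to pull rows over a SQL warehouse endpoint with the `databricks-sql-connector` package and write them locally. A hedged sketch; the connection values are placeholders you would take from your workspace, and the connector import is kept inside the function so the CSV helper works anywhere:

```python
import csv

def rows_to_csv(header, rows, path):
    """Write fetched rows to a local CSV file on-prem."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

def export_delta_table(server_hostname, http_path, access_token, table, path):
    """Pull a Delta table's rows through a SQL warehouse endpoint.

    Requires `pip install databricks-sql-connector`; hostname, HTTP path,
    and token are placeholders from your workspace's connection details.
    """
    from databricks import sql  # imported lazily so the helper above stays usable offline
    with sql.connect(server_hostname=server_hostname,
                     http_path=http_path,
                     access_token=access_token) as conn:
        with conn.cursor() as cur:
            cur.execute(f"SELECT * FROM {table}")
            header = [c[0] for c in cur.description]
            rows_to_csv(header, cur.fetchall(), path)
```

For large tables you would likely page with LIMIT/OFFSET or export to cloud storage instead; this sketch only shows the direct-pull shape.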

2 More Replies
Rasputin312
by New Contributor II
  • 827 Views
  • 1 reply
  • 1 kudos

Resolved! Widgets Not Displaying

I am trying to run this attention visualization in my Databricks notebook. This is my code and this is the error I get:
```
from IPython.display import display, Javascript
import ipywidgets as widgets
from ipywidgets import interact
from transformers im...
```

Latest Reply
koji_kawamura
Databricks Employee
  • 1 kudos

Hi @Rasputin312! I was able to render the visualization with the bertviz library. The default model_view html_action is view, which does not work with Databricks notebooks. Instead, using the returned HTML, we can visualize the model: display(model_view(a...
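The pattern the reply describes can be sketched as a small helper: ask bertviz's model_view for the HTML with html_action="return" and hand it to the notebook's displayHTML. The model_view_fn parameter is an addition of this sketch so the wiring can be exercised without bertviz installed:

```python
def render_model_view(attention, tokens, display_html, model_view_fn=None):
    """Render a bertviz attention visualization in a Databricks notebook.

    attention/tokens come from a HuggingFace model run with
    output_attentions=True; display_html is the notebook's displayHTML.
    model_view_fn exists only so the helper can be tested off-platform.
    """
    if model_view_fn is None:
        from bertviz import model_view as model_view_fn  # pip install bertviz
    # html_action="return" hands back the HTML instead of calling IPython display
    html = model_view_fn(attention, tokens, html_action="return")
    display_html(html.data)
```

In a notebook you would call it as `render_model_view(outputs.attentions, tokens, displayHTML)`.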

Kayla
by Valued Contributor II
  • 515 Views
  • 2 replies
  • 2 kudos

Resolved! Scheduled Workflow options and DST Change

So, I have a workflow that runs at 2:35 AM daily. Is there really no way to configure that so it isn't completely skipped during the spring time change?

Latest Reply
ashraf1395
Honored Contributor
  • 2 kudos

Hi @Kayla, I suggest the best solution would be to use UTC; even Databricks recommends that. Or shift the job by 30 minutes to 1 hour.
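The UTC suggestion maps directly onto the job's schedule settings: Databricks job schedules take a Quartz cron expression plus a `timezone_id`, and pinning the latter to UTC means 2:35 never falls inside a DST gap. A hedged bundle-style sketch; the job name is a placeholder:

```yaml
# Sketch: a daily 02:35 schedule pinned to UTC (job name is a placeholder)
resources:
  jobs:
    nightly_job:
      name: nightly_job
      schedule:
        quartz_cron_expression: "0 35 2 * * ?"   # sec min hour dom month dow
        timezone_id: "UTC"                        # no DST, so 02:35 always exists
```

The trade-off is that the local wall-clock run time shifts by an hour across DST changes.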

1 More Replies
ramy
by New Contributor II
  • 2929 Views
  • 5 replies
  • 2 kudos

Getting JOB-ID dynamically to create another job to refer as job-task

I am trying to create a new job in Databricks Asset Bundles which refers to another job-task and passes parameters to it. However, the previous job is not created yet (or will be created using Databricks Asset Bundles in higher envs when deploying t...

Latest Reply
priya12
New Contributor II
  • 2 kudos

The lookup works. Here is how it can be used for a job existing outside the asset bundle:
```
variables:
  my_jobid:
    description: Enter the Databricks Job name you want to refer.
    lookup:
      job: 'My Job1'
```
In the resources section, refer...
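To round out the lookup pattern: the variable resolves to the existing job's numeric ID at deploy time, which a `run_job_task` can then consume. A hedged sketch along the lines of the documented bundle syntax; variable, job, and task names are placeholders:

```yaml
# Sketch: resolve an existing job's ID by name, then trigger it from a task
variables:
  my_jobid:
    description: Databricks job to trigger
    lookup:
      job: "My Job1"

resources:
  jobs:
    wrapper_job:
      name: wrapper_job
      tasks:
        - task_key: trigger_existing_job
          run_job_task:
            job_id: ${var.my_jobid}
```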

4 More Replies
asisaarav
by New Contributor
  • 317 Views
  • 1 reply
  • 0 kudos

Error : The spark driver has stopped unexpectedly and is restarting

Hi community, I'm getting an error in my code: "The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically restarted." Can you help here in understanding what methods we can use to get it fixed? I tried look...

Latest Reply
saurabh18cs
Honored Contributor
  • 0 kudos

The error message indicates an issue with the Spark driver in your Databricks environment. This can be caused by various factors, such as: Check Cluster Configuration: Ensure that your Databricks cluster has sufficient resources (CPU, memory) to handle...

giladba
by New Contributor III
  • 6679 Views
  • 12 replies
  • 11 kudos

access to event_log TVF

Hi, according to the documentation (https://docs.databricks.com/en/delta-live-tables/observability.html): "The event_log TVF can be called only by the pipeline owner and a view created over the event_log TVF can be queried only by the pipeline owner. The...

Latest Reply
larsbbb
New Contributor III
  • 11 kudos

@LakehouseGuy @mkEngineer @hcjp @neha_ayodhya I just saw the following option in DLT pipelines! I haven't tested it yet, but it looks promising. There also looks to be new documentation: https://learn.microsoft.com/en-us/azure/databricks/dlt/observability#q...

11 More Replies
sandeepmankikar
by New Contributor III
  • 261 Views
  • 1 reply
  • 0 kudos

Complex Embedded Workflows

Can complex embedded workflows be created using Databricks Bundle, where multiple workflows are interconnected in a parent-child format? If Databricks Bundle doesn't support this, what would be the best alternative for creating and deploying such wor...

Latest Reply
ashraf1395
Honored Contributor
  • 0 kudos

Yup, you can create complex workflows in Databricks bundles as well. Some examples: you can have all of them defined from child to parent and call those child workflows as a workflow task in the parent workflows; all different kinds of tasks...
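The parent-child wiring the reply describes can be sketched in one bundle: the parent job uses a `run_job_task` whose `job_id` interpolates the child job's resource ID. A hedged example; all names and the notebook path are placeholders:

```yaml
# Sketch: a parent job triggering a child job defined in the same bundle
resources:
  jobs:
    child_job:
      name: child_job
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./src/ingest.py
    parent_job:
      name: parent_job
      tasks:
        - task_key: run_child
          run_job_task:
            job_id: ${resources.jobs.child_job.id}
```

Deeper nesting works the same way: each level's parent references its children via `run_job_task`.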

ShivangiB
by New Contributor III
  • 539 Views
  • 3 replies
  • 0 kudos

Liquid Clustering limitation: clustering on write does not support source queries that include filters

I have a query:
```
%sql
insert into ucdata.brz.liquidcluster_table_data
select sum(col1) as col1, col2, sum(col3) as col3
from ucdata.brz.liquidcluster_table_data
group by col2
```
This query I am running with runtime version 13.3 and it is still working. But...

Latest Reply
ShivangiB
New Contributor III
  • 0 kudos

Hey team, can you please help on this?

2 More Replies
Datanoob123
by New Contributor II
  • 1003 Views
  • 6 replies
  • 1 kudos

Query to show column names in common between multiple tables

Hi all, I have a large number of tables, and I would like a query to pull the column names that are common between all of them. I know about SHOW COLUMNS, but can't seem to use this or another method to achieve this. This ...

Data Engineering
comparing tables
show columns
sql
Latest Reply
KaranamS
Contributor III
  • 1 kudos

Hi @Datanoob123, I agree with @Stefan-Koch! It could be that you don't have access to the system tables. Please reach out to your Databricks admin to grant you the required permissions to system tables. You can then use the query I shared to get the requ...
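The thread doesn't show the shared query, but the usual shape is a GROUP BY/HAVING over column metadata: on Databricks that would be `system.information_schema.columns` filtered to your catalog and schema (an assumption about the original query). The sketch below demonstrates the same SQL with sqlite3 standing in for the metadata table:

```python
import sqlite3

# Stand-in for system.information_schema.columns: one row per (table, column)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE columns (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO columns VALUES (?, ?)",
    [("orders", "id"), ("orders", "created_at"), ("orders", "amount"),
     ("refunds", "id"), ("refunds", "created_at"),
     ("invoices", "id"), ("invoices", "created_at"), ("invoices", "due")],
)
# A column is "common" when it appears in every distinct table
common = [r[0] for r in conn.execute(
    """
    SELECT column_name
    FROM columns
    GROUP BY column_name
    HAVING COUNT(DISTINCT table_name) =
           (SELECT COUNT(DISTINCT table_name) FROM columns)
    ORDER BY column_name
    """
)]
print(common)  # only id and created_at appear in all three tables
```

On Databricks, swap the stand-in table for `system.information_schema.columns` and add `WHERE table_catalog = '...' AND table_schema = '...'` to scope the comparison.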

5 More Replies
HarryRichard08
by New Contributor II
  • 827 Views
  • 3 replies
  • 0 kudos

Unable to Access S3 from Serverless but Works on Cluster

Hi everyone, I am trying to access data from S3 using an access key and secret. When I run the code through Databricks clusters, it works fine. However, when I try to do the same from a serverless cluster, I am unable to access the data. I have alread...

Latest Reply
KaranamS
Contributor III
  • 0 kudos

Hi @HarryRichard08, Databricks recommends using instance profiles (IAM roles) to connect to AWS S3, as they provide a secure and scalable method without embedding credentials in a notebook. Have you tried this approach? https://docs.databricks.com/aws/...

2 More Replies
Dave_Nithio
by Contributor II
  • 641 Views
  • 0 replies
  • 0 kudos

Transaction Log Failed Integrity Checks

I have started to receive the following error message - that the transaction log has failed integrity checks - when attempting to optimize and run compaction on a table. It also occurs when I attempt to alter this table. This blocks my pipeline from r...

NUKSY
by New Contributor II
  • 560 Views
  • 3 replies
  • 0 kudos

`io.unitycatalog.client.model.TableType`: Unexpected value 'MATERIALIZED_VIEW'

I have been able to set up the JDBC driver with Databricks to connect to my Unity Catalog using local Spark sessions. When I try to retrieve tables in my schema I get this error: An error occurred while calling o43.sql.: io.unitycatalog.client.ApiExcepti...

Latest Reply
Jofes
New Contributor II
  • 0 kudos

I am getting the same but for MANAGED_SHALLOW_CLONE tables: An error occurred while calling o47.sql.: io.unitycatalog.client.ApiException: com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of io.unitycatalog.cl...

2 More Replies
HoussemBL
by New Contributor III
  • 698 Views
  • 2 replies
  • 0 kudos

Databricks asset bundle deploys DLT pipelines as duplicate resources

Dear Community, I have a deployment issue after restructuring my project. Previously, our project was organized with the following structure:
project/src/
project/resources/
project/databricks.yml
As part of an optimization effort, we have transitioned to ...

Latest Reply
HoussemBL
New Contributor III
  • 0 kudos

Hi @ashraf1395, I am creating two separate databricks.yml files, one for each sub-project.

1 More Replies
