Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

manish1987c
by New Contributor III
  • 44 Views
  • 2 replies
  • 0 kudos

Delta Live Table - Flow detected an update or delete to one or more rows in the source table

I have created a pipeline where I am ingesting data from bronze to silver using SCD 1. However, when I try to create the gold table as a DLT table, it gives me the error "Flow 'user_silver' has FAILED fatally. An error occurred because we detected ...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@manish1987c - Streaming does not handle input that is not an append. You can set skipChangeCommits to true.
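
A minimal sketch of the suggestion above, assuming the gold table reads the silver table as a stream. `skipChangeCommits` is the option named in the reply; the helper name `read_skipping_change_commits` is hypothetical, and `spark` is assumed to be the session that a Databricks notebook or pipeline provides.

```python
def read_skipping_change_commits(spark, table_name):
    """Return a streaming reader result over `table_name` that ignores
    commits which update or delete existing rows."""
    return (
        spark.readStream
        .option("skipChangeCommits", "true")  # option named in the reply
        .table(table_name)
    )
```

Inside the gold table definition this might be called as `read_skipping_change_commits(spark, "LIVE.user_silver")`, where the table name is an assumption based on the flow name in the error. Skipped commits are not propagated downstream, so rows changed by SCD 1 updates will not be reflected in the gold table.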

1 More Replies
MR07
by Visitor
  • 69 Views
  • 2 replies
  • 0 kudos

Databricks Managing Materialized Views in Delta Live Tables: Selective Refresh Behavior

Hi Community, I have 200 complex SQL queries and I can't create streaming tables from them. So I have created them as materialized views in Delta Live Tables, and the DLT pipeline should run continuously. My question i...

Latest Reply
steyler-db
New Contributor III
  • 0 kudos

Hello team, thanks for reaching out to us; it will be a pleasure to help you with this. Going with a materialized view is a good call. Regarding the question: if any record of the underlying table is inserted, updated, or deleted, the only re...

1 More Replies
chevichenk
by New Contributor II
  • 65 Views
  • 3 replies
  • 2 kudos

No userid, username, job when making modifications on tables

Hi, everyone! I'm in this situation: I have some jobs that make changes to a particular table. I use only one user to make these modifications, but there is a process I can't identify that also makes changes to my table. The question is, there's a re...

Data Engineering
history
jobs
userid
username
Latest Reply
chevichenk
New Contributor II
  • 2 kudos

Hi @shan_chandra, @LuisRSanchez, I just found that some .jar files are being executed and are writing to this table, but these .jar files are called through batch. So we think this is the cause. Thanks! Ingrid

2 More Replies
avrm91
by New Contributor III
  • 51 Views
  • 1 reply
  • 0 kudos

How to load xlsx Files to Delta Live Tables (DLT)?

I want to load a .xlsx file into DLT but am struggling because it is not supported by Auto Loader. With the Assistant, I tried to load the .xlsx into a DataFrame first and then send it to DLT.
import dlt
from pyspark.sql import SparkSession
# Load xlsx file in...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@avrm91 - You can try converting the xlsx files to CSV as a preprocessing step and ingesting them into a DataFrame using Auto Loader. Alternatively, you can use openpyxl to load them into a DataFrame. Refer to this doc for an example.
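
The preprocessing step suggested above could look roughly like the following sketch, which converts the active sheet of a workbook to a CSV file that Auto Loader can then pick up. The function names and paths are hypothetical, and it assumes the third-party `openpyxl` package is installed and that only the first (active) sheet matters.

```python
import csv


def write_rows_as_csv(rows, csv_path):
    """Write an iterable of row tuples to `csv_path` (empty cells become empty fields)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def xlsx_to_csv(xlsx_path, csv_path):
    """Convert the active sheet of an .xlsx workbook to CSV."""
    # Imported lazily because openpyxl is not part of the standard library.
    from openpyxl import load_workbook

    workbook = load_workbook(xlsx_path, read_only=True, data_only=True)
    try:
        sheet = workbook.active
        # values_only=True yields plain cell values instead of Cell objects.
        write_rows_as_csv(sheet.iter_rows(values_only=True), csv_path)
    finally:
        workbook.close()
```

The resulting CSV files would need to land in the location that the Auto Loader source is watching; how they get there depends on the storage setup and is not shown here.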

JeremyH
by New Contributor II
  • 48 Views
  • 3 replies
  • 0 kudos

CREATE WIDGETS in SQL Notebook attached to SQL Warehouse Doesn't Work.

I'm able to create and use widgets using the UI in my SQL notebooks, but they get lost quite frequently when the notebook is reset. There is documentation suggesting we can create widgets in code in SQL: https://learn.microsoft.com/en-us/azure/databri...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

Hi @JeremyH - Can you please try adding a parameter like the one below to your query and see if the widgets get populated? {{parameter_name }}

2 More Replies
Jackson1111
by New Contributor II
  • 44 Views
  • 1 reply
  • 0 kudos

Databricks job cluster logs

Hello, how can I enable Databricks to generate a separate Spark log for each job run? What parameters should I use in the Spark configuration?

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@Jackson1111 - If you are talking about workflow jobs, you can try running them on a job cluster to generate Spark logs for each of the workflow jobs. But if this is about Spark jobs within the Spark UI and you want to separate out the logs, this is a ...
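
As a rough illustration of the job-cluster route, the sketch below builds a `new_cluster` specification with cluster log delivery enabled through `cluster_log_conf`. This is an assumption about what the reply has in mind, not something stated in the thread. The function name is hypothetical, and the runtime version, node type, and worker count are placeholders to replace with values valid for your workspace.

```python
def job_cluster_with_log_delivery(log_root):
    """Build a job `new_cluster` spec that delivers cluster logs under `log_root`."""
    return {
        "spark_version": "14.3.x-scala2.12",  # placeholder runtime version
        "node_type_id": "i3.xlarge",          # placeholder node type
        "num_workers": 2,                      # placeholder size
        # Cluster log delivery: logs are written beneath the destination,
        # in a subdirectory named after the cluster ID.
        "cluster_log_conf": {"dbfs": {"destination": log_root}},
    }
```

Because a job cluster is created for a run and gets its own cluster ID, the delivered logs should end up in a separate directory per run, for example under `dbfs:/cluster-logs/<cluster-id>/driver` if `log_root` is `dbfs:/cluster-logs`.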

semsim
by New Contributor III
  • 17 Views
  • 0 replies
  • 0 kudos

Init Script Failing

I am getting an error when I try to run the cluster-scoped init script. The script itself is as follows:
#!/bin/bash
sudo apt update && sudo apt upgrade -y
sudo apt install libreoffice-common libreoffice-java-common libreoffice-writer openjdk-8-jre-head...

aozero
by Visitor
  • 48 Views
  • 1 reply
  • 0 kudos

Deleting data programmatically from databricks live delta tables

Hello all, I am relatively new to data engineering and working on a project that requires me to programmatically delete data from Delta Live Tables. However, I found that simply stopping the streaming job and deleting rows from the Delta tables caused th...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@aozero - Can you please try a FULL REFRESH of the Delta Live Tables? https://docs.databricks.com/en/delta-live-tables/updates.html#how-delta-live-tables-updates-tables-and-views
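
Since the question asks for a programmatic approach, here is a hedged sketch of triggering a full refresh through the Pipelines REST API rather than the UI. It only builds the request. The function name is hypothetical, and the workspace URL, pipeline ID, and token are placeholders you would supply.

```python
import json
import urllib.request


def build_full_refresh_request(host, pipeline_id, token):
    """Build (but do not send) a request that starts a full-refresh pipeline update."""
    url = f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    body = json.dumps({"full_refresh": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Note that a full refresh recomputes the pipeline's tables from their sources, so it can be expensive on large datasets.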

Nathant93
by New Contributor II
  • 33 Views
  • 0 replies
  • 0 kudos

Unzipping with Serverless Compute

Hi, I have started using serverless compute but have come across the limitation that I cannot use the local filesystem to temporarily store files and directories before moving them to where they need to be in ADLS. Does anyone have a way of unzip...
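
One possible direction, offered as a sketch rather than an answer from the thread, is to extract the archive entirely in memory so that no local scratch directory is needed. It assumes the archive is small enough to fit in memory. The function name is hypothetical, and reading the ZIP bytes from ADLS and writing the extracted files back are left out.

```python
import io
import zipfile


def unzip_in_memory(zip_bytes):
    """Yield (member_name, member_bytes) for each file in a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            yield info.filename, archive.read(info)
```

Each yielded pair can then be written straight to its destination path, which preserves the archive's directory layout because member names include their relative paths.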

Data Engineering
serverless
unzip
laudhon
by New Contributor
  • 128 Views
  • 2 replies
  • 0 kudos

Why is My MIN MAX Query Still Slow on a 29TB Delta Table After Liquid Clustering and Optimization?

Hello, I have a large Delta table with a size of 29TB. I implemented Liquid Clustering on this table, but running a simple MIN MAX query on the clustering column is still extremely slow. I have already optimized the table. Am I missing something in m...

Latest Reply
LuisRSanchez
New Contributor III
  • 0 kudos

Hi, this operation should take seconds because it uses the precomputed statistics for the table. A few things to verify: if the data type is datetime or integer, it should work; if it is a string data type, it needs to read all the data. Verify the column ...

1 More Replies
thiagoawstest
by New Contributor III
  • 102 Views
  • 6 replies
  • 0 kudos

Resolved! Azure Devops CI/CD - AWS Databricks

Hello, is there documentation for integrating an Azure DevOps CI/CD pipeline with AWS Databricks? Thanks.

Data Engineering
aws devops
Latest Reply
jacovangelder
Contributor II
  • 0 kudos

You'll need to install the AWS Toolkit in Azure DevOps. That way, you can create a service connection inside your Azure DevOps project that authenticates using an AWS Access Key ID/Secret Access Key (the AWS equivalent of Azure Service Principals). Hope...

5 More Replies
naga_databricks
by Contributor
  • 37 Views
  • 1 reply
  • 0 kudos

Overwriting same table

I have a table A that is used in a spark.sql query and joined with multiple other tables to get data. This data is then written back to the same table A, overwriting it. When I tried this, I consistently got the error below: ERROR: An error occurred while calling o382.save...

Latest Reply
naga_databricks
Contributor
  • 0 kudos

I found this to be a transient error. Once I restarted the cluster, the overwrite was successful.

giladba
by New Contributor III
  • 2374 Views
  • 9 replies
  • 4 kudos

access to event_log TVF

Hi, according to the documentation: https://docs.databricks.com/en/delta-live-tables/observability.html "The event_log TVF can be called only by the pipeline owner and a view created over the event_log TVF can be queried only by the pipeline owner. The...

Latest Reply
hcjp
New Contributor II
  • 4 kudos

As per this documentation, https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/unity-catalog, the issue here is documented as a current limitation. Limitations: The following are limitations when using Unity Catalog with Delta Live Tabl...

8 More Replies
Babu_Krishnan
by Contributor
  • 37 Views
  • 1 reply
  • 0 kudos

Why is my DLT not working with UC?

My IAM profile is not working when accessing SQS for file-notification-based ingestion.

Latest Reply
jacovangelder
Contributor II
  • 0 kudos

I'm not sure if I fully understand the question, but what location are you monitoring? Is it a DBFS path or mount? If so, consider using a UC Volume. 
