Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

hari-prasad
by Valued Contributor
  • 126 Views
  • 0 replies
  • 2 kudos

Databricks UniForm - Bridging Delta Lake and Iceberg

Databricks UniForm enables seamless integration between the Delta Lake and Iceberg formats. Key features include: Interoperability: read Delta tables with Iceberg clients without rewriting data. Automatic Metadata Generation: Asynchrono...
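A minimal sketch of how UniForm is typically enabled on an existing Delta table; the catalog/table name below is a placeholder, and property support depends on your runtime version.

# Hedged sketch: turn on Iceberg metadata generation (UniForm) for a Delta table.
# "main.sales.orders" is a hypothetical table name.
spark.sql("""
    ALTER TABLE main.sales.orders SET TBLPROPERTIES (
        'delta.enableIcebergCompatV2'          = 'true',
        'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")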

  • 126 Views
  • 0 replies
  • 2 kudos
LGABI
by New Contributor
  • 123 Views
  • 2 replies
  • 0 kudos

How to connect to Tableau Server FROM within Databricks Notebooks and publish data to Tableau Serv?

My company is having trouble connecting Databricks to Tableau Server. We need to be able to publish Hyper Files that are developed using Python on Databricks Notebooks to our Tableau Server, but it seems impossible to get a connection established des...
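For reference, publishing a Hyper file from a notebook is often done with the tableauserverclient package; the sketch below assumes hypothetical server URL, token, site, project ID, and file path, and that outbound network access to the Tableau Server is allowed.

import tableauserverclient as TSC

# Hypothetical credentials and paths; replace with your own values.
auth = TSC.PersonalAccessTokenAuth("token-name", "token-value", site_id="my-site")
server = TSC.Server("https://tableau.example.com", use_server_version=True)

with server.auth.sign_in(auth):
    datasource = TSC.DatasourceItem(project_id="abc123")
    server.datasources.publish(
        datasource,
        "/dbfs/tmp/extract.hyper",              # Hyper file built in the notebook
        mode=TSC.Server.PublishMode.Overwrite,
    )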

  • 123 Views
  • 2 replies
  • 0 kudos
Latest Reply
pgo
New Contributor II
  • 0 kudos

Please use the netcat command to test the connection.
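An equivalent connectivity check can also be run directly from a notebook cell; the host and port below are placeholders for your Tableau Server.

import socket

host, port = "tableau.example.com", 443   # hypothetical host/port
try:
    with socket.create_connection((host, port), timeout=5):
        print(f"Network path to {host}:{port} is open")
except OSError as exc:
    print(f"Could not reach {host}:{port}: {exc}")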

  • 0 kudos
1 More Replies
TejeshS
by New Contributor II
  • 345 Views
  • 7 replies
  • 0 kudos

How to enable row tracking on Delta Live Tables?

We are encountering a scenario where we need to enable support for incremental processing on materialized views that have DLT base tables. However, we have observed that the refresh is being executed with the COMPLETE_RECOMPUTE mode instead of INCREMENT...
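For context, row tracking is normally switched on through a table property; a minimal sketch for a DLT-managed table follows. All names are placeholders, and other documented limitations (such as CDF, discussed in the reply below) may still prevent incremental refresh.

import dlt

# Hedged sketch: request row tracking on a DLT table via table properties.
@dlt.table(
    name="orders_mv",
    table_properties={"delta.enableRowTracking": "true"},
)
def orders_mv():
    return spark.read.table("main.sales.orders")   # hypothetical source table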

  • 345 Views
  • 7 replies
  • 0 kudos
Latest Reply
TejeshS
New Contributor II
  • 0 kudos

Moreover, we have CDF-enabled DLT tables, but per the documentation there is a limitation: if CDF is enabled, row tracking won't be possible (Use row tracking for Delta tables | Databricks on AWS). But per our use case we need incremental processing ...

  • 0 kudos
6 More Replies
zed
by New Contributor III
  • 389 Views
  • 6 replies
  • 0 kudos

Resolved! ConcurrentAppendException in Feature Engineering write_table

I am using the Feature Engineering client when writing to a time series feature table. I have created two Databricks jobs with the below code. I am running with different run_dates (e.g. '2016-01-07' and '2016-01-08'). When they run concurrently,...

  • 389 Views
  • 6 replies
  • 0 kudos
Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@zed Clustering by your date column can indeed help avoid the ConcurrentAppendException without incurring the strict partitioning constraints that a “time series feature table” normally disallows. Unlike partitioning, CLUSTER BY does not create physi...
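A minimal sketch of that suggestion, assuming a hypothetical feature table and date column:

# Liquid clustering on the date column instead of partitioning; concurrent jobs
# writing disjoint run_date values are then less likely to conflict.
spark.sql("ALTER TABLE main.fs.user_features CLUSTER BY (run_date)")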

  • 0 kudos
5 More Replies
guiferviz
by New Contributor III
  • 481 Views
  • 8 replies
  • 4 kudos

Resolved! How to Determine if Materialized View is Performing Full or Incremental Refresh?

I'm currently testing materialized views and I need some help understanding the refresh behavior. Specifically, I want to know if my materialized view is querying the full table (performing a full refresh) or just doing an incremental refresh.From so...

  • 481 Views
  • 8 replies
  • 4 kudos
Latest Reply
TejeshS
New Contributor II
  • 4 kudos

To validate the status of your materialized view (MV) refresh, run a DESCRIBE EXTENDED command and check the row corresponding to the "last refresh status type". RECOMPUTE indicates a full load execution was completed. NO_OPERATION means no operation w...
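A quick way to inspect this from a notebook; the MV name is a placeholder and the exact row labels may vary by release.

# List the refresh-related rows returned by DESCRIBE EXTENDED for the MV.
info = spark.sql("DESCRIBE EXTENDED main.analytics.daily_sales_mv")
display(info.filter("col_name ILIKE '%refresh%'"))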

  • 4 kudos
7 More Replies
kazinahian
by New Contributor III
  • 2526 Views
  • 2 replies
  • 0 kudos

Low-code ETL in Databricks

Hello everyone, I work as a Business Intelligence practitioner, employing tools like Alteryx or various low-code solutions to construct ETL processes and develop data pipelines for my dashboards and reports. Currently, I'm delving into Azure Databrick...

  • 2526 Views
  • 2 replies
  • 0 kudos
Latest Reply
Nam_Nguyen
Databricks Employee
  • 0 kudos

Hello @kazinahian , Azure Databricks offers several options for building ETL (Extract, Transform, Load) data pipelines, ranging from low-code to more code-centric approaches. Delta Live Tables: Delta Live Tables (DLT) is a declarative framework for bu...
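As an illustration of the declarative style, a minimal two-table DLT pipeline might look like the sketch below; paths and names are placeholders.

import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders ingested with Auto Loader")
def raw_orders():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/landing/orders/")      # hypothetical landing path
    )

@dlt.table(comment="Cleaned orders")
def clean_orders():
    return dlt.read_stream("raw_orders").where(F.col("order_id").isNotNull())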

  • 0 kudos
1 More Replies
NathanSundarara
by Contributor
  • 1803 Views
  • 1 replies
  • 0 kudos

Lakehouse federation bringing data from SQL Server

Has anyone tried to bring in data using the newly announced Lakehouse Federation and ingest it using Delta Live Tables? I'm currently testing using materialized views. First I loaded the full data, and now I'm loading the last 3 days daily and recomputing using Mate...

Data Engineering
dlt
Lake house federation
  • 1803 Views
  • 1 replies
  • 0 kudos
Latest Reply
Nam_Nguyen
Databricks Employee
  • 0 kudos

Hi @NathanSundarara , regarding your current approach, here are the potential solutions and considerations. Deduplication: implement deduplication strategies within your DLT pipeline. For example: clicksDedupDf = ( spark.readStream.table("LIVE.rawCl...
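A minimal sketch of that deduplication pattern, with hypothetical table, key, and watermark values standing in for the truncated example above:

clicksDedupDf = (
    spark.readStream.table("LIVE.rawClicks")        # hypothetical source name
    .withWatermark("eventTime", "10 minutes")       # bound state for streaming dedup
    .dropDuplicates(["clickId"])
)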

  • 0 kudos
RotemBar
by New Contributor II
  • 167 Views
  • 3 replies
  • 1 kudos

Incremental refresh - non-serverless compute

Hey, I read the page about incremental refresh. Will you make it available on more than just serverless compute? If so, when? Thanks. Reference - https://docs.databricks.com/en/optimizations/incremental-refresh.html

  • 167 Views
  • 3 replies
  • 1 kudos
Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Sure thing, will keep you posted in a DM.

  • 1 kudos
2 More Replies
Phani1
by Valued Contributor II
  • 177 Views
  • 3 replies
  • 1 kudos

Databricks+DBT best practices

Hi All, could you provide the best practices for building and optimizing dbt models in Databricks? Regards, Phani

  • 177 Views
  • 3 replies
  • 1 kudos
Latest Reply
VZLA
Databricks Employee
  • 1 kudos

@Phani1 Please let us know if after going through @szymon_dybczak references, you still need some guidance on more specific aspects that we can help with.

  • 1 kudos
2 More Replies
jeremy98
by Contributor
  • 412 Views
  • 11 replies
  • 0 kudos

Resolved! how to read the CDF logs in DLT Pipeline?

Hi Community, how do I read the CDF logs in materialized views created by a DLT pipeline? Thanks for your time.

  • 412 Views
  • 11 replies
  • 0 kudos
Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@jeremy98 Correct. If permissions management is complex, consider using standard Delta tables with CDF enabled and orchestrate changes through Databricks Workflows. This approach simplifies collaboration and avoids issues with restricted internal sch...
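For reference, reading the change feed from a CDF-enabled Delta table usually follows this shape; the table name and starting version are placeholders.

changes = (
    spark.read.option("readChangeFeed", "true")
    .option("startingVersion", 0)
    .table("main.sales.orders")                     # hypothetical CDF-enabled table
)
display(changes.select("_change_type", "_commit_version", "_commit_timestamp"))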

  • 0 kudos
10 More Replies
jdata
by New Contributor II
  • 2631 Views
  • 5 replies
  • 1 kudos

Dashboard Usage

Hi there, my team is developing some SQL dashboards. I would like to know how many people view a dashboard, or at least click on it, and which queries are then triggered. I found out that there is one endpoint provided by Databricks: List Queries | Query Histor...

  • 2631 Views
  • 5 replies
  • 1 kudos
Latest Reply
jdata
New Contributor II
  • 1 kudos

When I click on the dashboard, there are 6 statements in my dashboard -> I receive 6 records in `system.access.audit`. But the event_time differs; I expected event_time to be the same across records. So with the differences in event time, how c...
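One way to approximate a "view" despite the per-statement timestamps is to bucket the audit records, for example by user and minute. The sketch below assumes the dashboard statements appear under the databrickssql service; adapt the filter to the service and action names you actually see in your audit rows.

views = spark.sql("""
    SELECT user_identity.email,
           date_trunc('minute', event_time) AS view_minute,
           count(*)                          AS statements
    FROM system.access.audit
    WHERE service_name = 'databrickssql'
    GROUP BY 1, 2
    ORDER BY view_minute DESC
""")
display(views)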

  • 1 kudos
4 More Replies
Phani1
by Valued Contributor II
  • 227 Views
  • 1 replies
  • 1 kudos

Access the data from cross-cloud.

Hi All, we have a use case where we need to connect AWS Databricks to a GCP storage bucket to access the data. In Databricks we're trying to use external locations and storage credentials, but it seems like AWS Databricks only supports AWS storage b...

  • 227 Views
  • 1 replies
  • 1 kudos
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Phani1 , you can use Delta Sharing. That way you can create a share that allows you to access data stored in GCS, and it's governed by the UC permissions model (What is Delta Sharing? | Databricks on AWS). You can also use the legacy approach, but it doesn'...
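Once the share is attached as a catalog in the receiving AWS workspace, the GCS-backed data can be queried like any other Unity Catalog table; the names below are placeholders.

df = spark.read.table("gcp_share_catalog.analytics.events")   # hypothetical shared table
display(df.limit(10))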

  • 1 kudos
svm_varma
by New Contributor II
  • 261 Views
  • 1 replies
  • 2 kudos

Resolved! Azure Databricks quota restrictions on compute in Azure for students subscription

Hi All, regarding creating clusters in Databricks, I'm getting a quota error. I have tried to increase quotas in the region where the resource is hosted but am still unable to raise the limit. Is there any workaround, or could you help select the right cluster ...

  • 261 Views
  • 1 replies
  • 2 kudos
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @svm_varma , you can try to create a Standard_DS3_v2 cluster. It has 4 cores, and your current subscription limit for the given region is 6 cores. The one you're trying to create needs 8 cores, hence you're getting the quota-exceeded exception. You can also...

  • 2 kudos
dener
by New Contributor
  • 370 Views
  • 1 replies
  • 0 kudos

Infinite load execution

I am experiencing performance issues when loading a table with 50 million rows into Delta Lake on AWS using Databricks. Despite successfully handling other, larger tables, this specific table/process takes hours and doesn't finish. Here's the command...

  • 370 Views
  • 1 replies
  • 0 kudos
Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thank you for your question! To optimize your Delta Lake write process: Disable Overhead Options: Avoid overwriteSchema and mergeSchema unless necessary. Use: df.write.format("delta").mode("overwrite").save(sink) Increase Parallelism: Use repartition...
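Putting that advice together, a sketch of the write follows; the partition count is a placeholder to tune for your data volume and cluster size, and df/sink refer to the DataFrame and target path from the original command.

optimized = df.repartition(64)                      # increase write parallelism
(
    optimized.write.format("delta")
    .mode("overwrite")                              # no overwriteSchema/mergeSchema overhead
    .save(sink)
)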

  • 0 kudos
minhhung0507
by New Contributor III
  • 561 Views
  • 5 replies
  • 3 kudos

Resolved! Handling Dropped Records in Delta Live Tables with Watermark - Need Optimization Strategy

Hi Databricks Community, I'm encountering an issue with watermarks in Delta Live Tables that's causing data loss in my streaming pipeline. Let me explain my specific problem. Current situation: I've implemented watermarks for stateful processing in my De...
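For context, a watermarked streaming aggregation in DLT typically has this shape; the 30-minute delay and all names are placeholders, and records arriving later than the watermark are dropped, which is the trade-off discussed in this thread.

import dlt
from pyspark.sql import functions as F

@dlt.table(name="events_by_window")
def events_by_window():
    return (
        dlt.read_stream("raw_events")               # hypothetical upstream DLT table
        .withWatermark("event_time", "30 minutes")
        .groupBy(F.window("event_time", "10 minutes"), "device_id")
        .count()
    )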

  • 561 Views
  • 5 replies
  • 3 kudos
Latest Reply
minhhung0507
New Contributor III
  • 3 kudos

Dear @VZLA, @Walter_C , I wanted to take a moment to express my sincere gratitude for your incredibly detailed explanation and thoughtful suggestions. Your guidance has been immensely valuable and has provided us with a clear path forward in addressi...

  • 3 kudos
4 More Replies