Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

NehaR
by New Contributor III
  • 221 Views
  • 2 replies
  • 2 kudos

Is there any option in Databricks to estimate the cost of a query before execution?

Hi Team, I want to check if there is any option in Databricks which can help to estimate the cost of a query before execution? I mean, calculate DBUs before actual query execution based on the logical plan? Regards 

Latest Reply
NehaR
New Contributor III

Is there any way to track the progress or ETA? Do we have access to the ideas portal? Where can we search for this reference number, DB-I-5730? 

1 More Replies
jeremy98
by Contributor
  • 389 Views
  • 2 replies
  • 2 kudos

Ways to quickly write millions of rows into a new Delta table

Hello everyone, I am facing an issue with writing 100–500 million rows (partitioned by a column) into a newly created Delta table. I have set up a cluster with 256 GB of memory and 64 cores. However, the following code takes a considerable amount of t...

Latest Reply
radothede
Contributor II

Hi @jeremy98, this is what I would suggest to test: 1) remove the repartition step or reduce the number of partitions (start with the number of cores and then try to increase it x2, x3): repartition(num_partitions*4, partition_col). I know repartitioning helps to di...
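The sizing heuristic in this reply (start from the core count, then scale up) can be sketched as a small helper. This is an illustrative sketch, not the poster's actual code; the PySpark write itself is shown only in a comment since it depends on the cluster, and `partition_col`/`my_table` are assumed names:

```python
def target_partitions(num_cores: int, factor: int = 1) -> int:
    """Heuristic from the reply: start with the number of cores,
    then try scaling it x2, x3 if the write is still slow."""
    return num_cores * factor

# The cluster in the question has 64 cores:
print(target_partitions(64))     # 64  (first attempt)
print(target_partitions(64, 2))  # 128 (second attempt)

# In PySpark (not run here), the result would feed into something like:
#   df.repartition(target_partitions(64, 2), "partition_col") \
#     .write.format("delta").partitionBy("partition_col").saveAsTable("my_table")
```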

1 More Replies
DataGeek_JT
by New Contributor II
  • 1380 Views
  • 2 replies
  • 1 kudos

Is it possible to use Liquid Clustering on Delta Live Tables / Materialised Views?

Is it possible to use Liquid Clustering on Delta Live Tables? If it is available, what is the Python syntax for adding liquid clustering to a Delta Live Table / Materialised View, please? 

Latest Reply
kerem
New Contributor III

Hi @amr, materialised views are not tables; they are views. Liquid clustering is not supported on views, so it will throw an [EXPECT_TABLE_NOT_VIEW.NO_ALTERNATIVE] error. Unfortunately, it will be the same case for the "optimize" command as well. 

1 More Replies
FabianGutierrez
by Contributor
  • 538 Views
  • 9 replies
  • 1 kudos

My DABS CLI deploy call is not generating a .tfstate file

Hi Community, I'm running into an issue: when executing Databricks CLI bundle deploy, I don't get the Terraform state file (.tfstate). I know that I should get one, but even when defining the state_path on my YAML (.yml) DABS file I still do not get it. D...

Latest Reply
FabianGutierrez
Contributor

Forgot to also share this screenshot of the last section in the logs. Somehow the state file keeps getting ignored (not found), so I wonder how the deployment still takes place.

8 More Replies
joeyslaptop
by New Contributor II
  • 237 Views
  • 1 reply
  • 0 kudos

Resolved! How do I use a Databricks SQL query to convert a field value % back into a wildcard?

Hi. If I've posted to the wrong area, please let me know. I am using SQL to join two tables. One table has the wildcard '%' stored as text/string/varchar. I need to join the value of TableA.column1 to TableB.column1 based on the wildcard in the str...

Latest Reply
JAHNAVI
Databricks Employee

Hi, could you please try the query below and let me know if it meets your requirements? SELECT * FROM TableA A LEFT JOIN TableB B ON A.Column1 LIKE REPLACE(B.Column1, '%', '%%'). REPLACE helps us in treating the '%' stored in TableB.Column1 as a wildcar...
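Since the reply is truncated, here is a runnable stand-in for the core idea (using Python's bundled sqlite3 rather than Databricks SQL, with made-up sample values): when the column holding the stored '%' is placed on the right-hand side of LIKE, it is interpreted as the pattern, so the '%' acts as a wildcard in the join condition:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE TableA (Column1 TEXT)")
cur.execute("CREATE TABLE TableB (Column1 TEXT)")  # holds patterns with '%' stored as text
cur.executemany("INSERT INTO TableA VALUES (?)",
                [("INV-1001",), ("INV-2002",), ("PO-3003",)])
cur.execute("INSERT INTO TableB VALUES ('INV-%')")

# TableB.Column1 is the LIKE pattern, not the matched value, so the
# stored '%' matches any suffix.
rows = cur.execute(
    "SELECT A.Column1 FROM TableA A JOIN TableB B "
    "ON A.Column1 LIKE B.Column1 ORDER BY A.Column1"
).fetchall()
print(rows)  # [('INV-1001',), ('INV-2002',)]
```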

swetha
by New Contributor III
  • 3126 Views
  • 4 replies
  • 1 kudos

Error: "no streaming listener attached to the spark app" is the error we are observing after accessing the streaming statistics API. Please help us with this issue ASAP. Thanks.

Issue: Spark structured streaming application. After adding the listener jar file in the cluster init script, the listener is working (from what I see in the stdout/log4j logs). But when I try to hit the 'Content-Type: application/json' http://host:port/...

Latest Reply
INJUSTIC
New Contributor II

Have you found the solution? Thanks

3 More Replies
swetha
by New Contributor III
  • 2729 Views
  • 3 replies
  • 1 kudos

I am unable to attach a streaming listener to a spark streaming job. Error: "no streaming listener attached to the spark application" is the error we are observing after accessing the streaming statistics API. Please help us with this issue ASAP. Thanks.

Issue: After adding the listener jar file in the cluster init script, the listener is working (from what I see in the stdout/log4j logs). But when I try to hit the 'Content-Type: application/json' http://host:port/api/v1/applications/app-id/streaming/st...

Latest Reply
INJUSTIC
New Contributor II

Have you found the solution? Thanks

2 More Replies
dbuschi
by New Contributor II
  • 311 Views
  • 2 replies
  • 0 kudos

Resolved! Delta Live Tables: How does it identify new files?

Hi, I'm importing large numbers of parquet files (ca. 5,200 files per day; they each land in a separate folder) into Azure ADLS storage. I have a DLT streaming table reading from the root folder. I noticed a massive spike in storage account costs due to f...

Latest Reply
dbuschi
New Contributor II

To resolve the issue of excessive directory scanning, I changed the folder structure to separate historical files from current files and reduce the number of folders and files that the Databricks process monitors.

1 More Replies
KuruDev
by New Contributor II
  • 855 Views
  • 3 replies
  • 0 kudos

Databricks Asset Bundle - Not fully deploying in Azure Pipeline

Hello Community, I'm encountering a challenging issue with my Azure Pipeline and I'm hoping someone here might have some insights. I'm attempting to deploy a Databricks bundle that includes both notebooks and workflow YAML files. When deploying the ...

Latest Reply
adfo
New Contributor II

Hello, same issue here: files and wheel are deployed and present in the Databricks workspace, but the jobs are not created.

2 More Replies
TheoDeSo
by New Contributor III
  • 13218 Views
  • 8 replies
  • 5 kudos

Resolved! Error on Azure-Databricks write output to blob storage account

Hello, after implementing the use of a Secret Scope to store secrets in an Azure Key Vault, I faced a problem. When writing an output to the blob I get the following error: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: Unable to access con...

Latest Reply
nguyenthuymo
New Contributor II

Hi all, is it correct that Azure Databricks only supports writing data to Azure Data Lake Gen2 and does not support Azure Storage Blob (StorageV2, general purpose)? In my case, I can read the data from Azure Storage Blob (StorageV2, general purp...

7 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1855 Views
  • 1 reply
  • 5 kudos

Notebook cell output results limit increased: 10,000 rows or 2 MB

Notebook cell output results limit increased: 10,000 rows or 2 MB. Hi all, Databricks now starts showing the first 10,000 rows instead of 1,000 rows. That will reduce the time of re-execution while working on smaller data sets that have rows between 100...

Latest Reply
F_Goudarzi
New Contributor III

Hi Ajay, is there any way to increase this limit? Thanks, Fatima

ac0
by Contributor
  • 5247 Views
  • 3 replies
  • 2 kudos

"Fatal error: The Python kernel is unresponsive." DBR 14.3

Running almost any notebook with a merge statement in Databricks with DBR 14.3, I get the following error and the notebook exits: "Fatal error: The Python kernel is unresponsive." I would provide more code, but like I said, it is pretty much anything w...

Latest Reply
markthepaz
New Contributor II

Same thing, not finding any documentation out there around "spark.databricks.driver.python.pythonHealthCheckTimeoutSec". @ac0 or @Ayushi_Suthar any more details you found on this?
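For anyone experimenting with that setting: it appears to be an undocumented Spark conf (as the reply notes, there is no public documentation), so treat both the key's behavior and the example value below as unverified assumptions. If it applies on your DBR version, it would go in the cluster's Spark config (Advanced options), e.g.:

```
spark.databricks.driver.python.pythonHealthCheckTimeoutSec 3600
```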

2 More Replies
Dharinip
by New Contributor III
  • 972 Views
  • 4 replies
  • 3 kudos

Resolved! How to decide on creating views vs Tables in Gold layer?

We have the following use case: we receive a raw form of data from an application and that is ingested in the Iron layer. The raw data is in the JSON format. The Bronze layer will be the first level of transformation. The flattening of the JSON file happens ...

Latest Reply
madams
Contributor

Oh no! I swear I wrote a reply out, but I must not have submitted it. Setting that aside... I think in your case a materialized view makes the most sense, and it's what I'd go with. It's essentially precomputing the logic, but reusing some of the s...

3 More Replies
thisisadarshsin
by New Contributor II
  • 2731 Views
  • 6 replies
  • 0 kudos

Permission issue in Fundamentals of the Databricks Lakehouse Platform Quiz

Hi, I am getting this error when I am trying to take the exam for Fundamentals of the Databricks Lakehouse Platform: 403 FORBIDDEN. You don't have permission to access this page. 2023-05-20 12:37:41 | Error 403 | https://customer-academy.databricks.com/ I al...

Latest Reply
shubham5
New Contributor II

Hi Team, I've also been getting the same error for the last 2 days while taking the quiz to get the badge for Lakehouse Fundamentals. Error: You are not authorized to access https://customer-academy.databricks.com. Please select a platform you can access fro...

5 More Replies
ChristianRRL
by Valued Contributor
  • 298 Views
  • 2 replies
  • 2 kudos

Resolved! PKEY Upserting Pattern With Older Runtimes

Hi there, I'm aware that newer Databricks runtimes support some great features, including primary and foreign key constraints. I'm wondering: if we have clusters that are running older runtime versions, are there upserting patterns that ha...

Latest Reply
Walter_C
Databricks Employee

For clusters running older Databricks runtime versions, such as 13.3, you can still implement upserting patterns effectively, even though they may not support the latest features like primary and foreign key constraints available in newer runtimes. O...
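The reply is truncated, but the standard pattern it points at is MERGE INTO, which does work on DBR 13.3. As a runnable stand-in (using Python's bundled sqlite3 instead of a Delta table, with hypothetical table and column names), the upsert semantics look like this; the Delta SQL equivalent is sketched in the comment:

```python
import sqlite3

# In Databricks SQL the equivalent would be roughly:
#   MERGE INTO target t USING updates s ON t.id = s.id
#   WHEN MATCHED THEN UPDATE SET *
#   WHEN NOT MATCHED THEN INSERT *
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
cur.executemany("INSERT INTO target VALUES (?, ?)", [(1, "a"), (2, "b")])

# Incoming batch: id 2 already exists (update), id 3 is new (insert).
updates = [(2, "b2"), (3, "c")]
cur.executemany(
    "INSERT INTO target (id, val) VALUES (?, ?) "
    "ON CONFLICT(id) DO UPDATE SET val = excluded.val",
    updates,
)
rows = cur.execute("SELECT id, val FROM target ORDER BY id").fetchall()
print(rows)  # [(1, 'a'), (2, 'b2'), (3, 'c')]
```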

1 More Replies
