Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
The end goal is to apply OPTIMIZE and ZORDER to a table. However, one of the columns to be Z-ordered doesn't have stats collected, and running ANALYZE generates the error below. Query: ANALYZE TABLE <catalog>.<schema>.<table> COMPUTE STATISTICS FOR COLUMNS my_col_1, my_c...
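For reference, a minimal sketch of the intended sequence (catalog, schema, table, and column names are placeholders, not the asker's):

```python
# Collect column-level statistics for the columns to be Z-ordered,
# then compact and co-locate the table on those columns.
spark.sql("""
    ANALYZE TABLE my_catalog.my_schema.my_table
    COMPUTE STATISTICS FOR COLUMNS my_col_1, my_col_2
""")

spark.sql("""
    OPTIMIZE my_catalog.my_schema.my_table
    ZORDER BY (my_col_1, my_col_2)
""")
```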
Hi Databricks Community, we hit a strange error today that was returned while a notebook was running. It only happens on Git-connected notebooks, and on rerun it succeeds. What is the issue?
One of our Databricks workflow jobs fails occasionally with the error below; after a re-run it works fine without any issue. What is the exact reason for the issue, and how can we fix it? Error: Unexpected failure while waiting for the cluster to be ...
These are cloud-provider-related errors, and the error message itself does not give us much detail. Based on the message, and given that you have enough CPU/VM quota available, I think the issue might be due to the storage creation stage in ...
I have a job with multiple tasks, like Task1 -> Task2 -> Task3, and I am trying to trigger the job using the "run now" API. Task details are below. Task1 executes a notebook with some input parameters. Task2 runs "ABC.jar", so it is a JAR-based task ...
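For context, a minimal sketch of the "run now" call against the Jobs 2.1 API, assuming placeholder host, token, and job ID. Note that notebook_params reach notebook tasks and jar_params reach JAR tasks, so parameters are applied per task type across the job rather than per individual task:

```python
import requests

host = "https://<workspace-host>"   # assumption: your workspace URL
token = "<personal-access-token>"   # assumption: a valid token

resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "job_id": 123,                                  # assumption: the job's ID
        "notebook_params": {"run_date": "2024-01-01"},  # delivered to notebook tasks
        "jar_params": ["arg1", "arg2"],                 # delivered to JAR tasks
    },
)
resp.raise_for_status()
print(resp.json()["run_id"])
```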
Hi, it would be a good feature to pass parameters at the task level. We have scenarios where we would like to create a job with multiple tasks (notebook/dbt) and pass parameters to each task individually.
I have been trying to find an alternative to copying a wheel file from my local file system to Databricks and then installing it on the cluster, i.e. doing this: databricks_client.dbutils.fs.cp("file:/local..../..whl", "dbfs:/Workspace/users/..../..whl")...
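For reference, a minimal sketch of the copy-then-install pattern described in the post, assuming placeholder paths and that dbutils is available (i.e. running in a notebook):

```python
# Copy the wheel from the driver-local filesystem into DBFS (or a UC Volume).
dbutils.fs.cp(
    "file:/tmp/my_package-0.1.0-py3-none-any.whl",                 # assumed local path
    "dbfs:/Workspace/users/me/my_package-0.1.0-py3-none-any.whl",  # assumed target
)

# Then, in a notebook cell, install it into the current session:
# %pip install /dbfs/Workspace/users/me/my_package-0.1.0-py3-none-any.whl
```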
Hi All, I have been trying to leverage the system column lineage table to check the overall journey of a column, but I am getting inaccurate results wherever unpivot transformations are used. Instead of showing the results in a way that 20 columns are ...
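For context, a hedged sketch of the kind of query involved, assuming Unity Catalog system tables are enabled; the table and column names are placeholders:

```python
# Trace where a given column's values came from, most recent events first.
lineage = spark.sql("""
    SELECT source_table_full_name,
           source_column_name,
           target_table_full_name,
           target_column_name,
           event_time
    FROM system.access.column_lineage
    WHERE target_table_full_name = 'my_catalog.my_schema.my_table'
      AND target_column_name = 'my_col'
    ORDER BY event_time DESC
""")
lineage.show(truncate=False)
```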
We're implementing a chatbot where documents in SharePoint and pages in Confluence augment the results. We want to adhere to existing RBAC policies in these data sources so that the chatbot doesn't produce results that someone should not see. Are you...
I am writing a file using the snippet below, but the column data types get changed when the file is read back: df.write.format("com.crealytics.spark.excel").option("header", "true").mode("overwrite").save(path) Due to this I have to manually change them every time, as I can't chang...
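One workaround sketch, assuming the com.crealytics:spark-excel library is installed on the cluster: supply an explicit schema on read instead of relying on type inference. The path and column names are placeholders:

```python
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, DateType

path = "/mnt/exports/report.xlsx"   # assumption: output location

# Write as in the original post.
df.write.format("com.crealytics.spark.excel") \
    .option("header", "true") \
    .mode("overwrite") \
    .save(path)

# Pin the types on read so they survive the round trip.
schema = StructType([
    StructField("id", StringType()),        # assumption: example columns
    StructField("amount", DoubleType()),
    StructField("order_date", DateType()),
])

df_back = spark.read.format("com.crealytics.spark.excel") \
    .option("header", "true") \
    .schema(schema) \
    .load(path)
```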
Hi Databricks community, I'm using a Databricks Jobs cluster to run some jobs, with the worker and driver type set to AWS m6gd.large, which has 2 cores and 8 GB of memory each. After seeing that executor memory defaults to 2 GB, I wanted to increase it,...
I think I found the right answer here: https://kb.databricks.com/en_US/clusters/spark-shows-less-memory. It seems a fixed amount of ~4 GB is reserved for internal node services, so depending on the node type, `spark.executor.memory` is fixed by Databric...
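For anyone verifying this on their own cluster, a quick check (not a fix):

```python
# Print the effective executor memory; per the KB article above, Databricks
# sets this per node type after reserving memory for internal node services.
print(spark.conf.get("spark.executor.memory"))
```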
I know that foundation models with pay-per-token are currently available only in the US, not in the EU. In the EU I should instead create a serving endpoint and use a provisioned foundation model. But I even tried creating a serving endpoint with an LLM from the catalog (shared models). I used the...
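For reference, a hedged sketch of creating a provisioned-throughput endpoint through the serving-endpoints REST API; the host, token, endpoint name, model identity, and throughput numbers are all placeholders:

```python
import requests

host = "https://<workspace-host>"   # assumption: your workspace URL
token = "<personal-access-token>"   # assumption: a valid token

resp = requests.post(
    f"{host}/api/2.0/serving-endpoints",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "name": "my-llm-endpoint",                        # assumption
        "config": {
            "served_entities": [{
                "entity_name": "system.ai.<model_name>",  # assumption: a shared UC model
                "entity_version": "1",
                "min_provisioned_throughput": 0,
                "max_provisioned_throughput": 100,
            }]
        },
    },
)
resp.raise_for_status()
```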
Hello, I am trying Databricks Asset Bundles for the first time. I am using the Databricks CLI and am able to validate the bundle, but when I try to run it, it errors out with error="expected a KEY of the resource to run". In the resource YAML file I ha...
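For what it's worth, this error typically appears when `databricks bundle run` is invoked without a resource key. A minimal sketch, assuming a job resource whose key is `my_job` (the identifier is hypothetical):

```yaml
# resources/my_job.yml (sketch): the identifier directly under "jobs"
# is the resource KEY that "bundle run" expects, not the display name.
resources:
  jobs:
    my_job:             # <- pass this key to the CLI
      name: "My Job"
```

Running `databricks bundle run my_job` after `databricks bundle deploy` should then resolve the key.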
Hi, we are working on ingesting multiple files from S3. The file names are fixed by our source system, and files are frequently replaced with a full feed. In DLT, when we process a new file we have to delete the records processed earlier from the same file...
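One pattern that may fit (a hedged sketch, not necessarily the asker's pipeline): treat each full-feed file as a new version of its records and let apply_changes upsert by key, sequencing on the file modification time. The path, table names, and key column are assumptions:

```python
import dlt
from pyspark.sql import functions as F

@dlt.view
def source_files():
    # Auto Loader picks up new/replaced files; _metadata gives their timestamps.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("header", "true")
        .load("s3://my-bucket/feed/")      # assumption: source path
        .withColumn("file_time", F.col("_metadata.file_modification_time"))
    )

dlt.create_streaming_table("target")

# Rows from a newer file version win over earlier rows with the same key.
dlt.apply_changes(
    target="target",
    source="source_files",
    keys=["record_id"],                    # assumption: business key
    sequence_by="file_time",
)
```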
Hi everyone, I am trying to connect to and read data from a Databricks table using a SQL warehouse and return it through an Azure API. However, non-English characters, for example 'Ä', appear in the response as: ��. I am using the databricks...
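A minimal sketch of one place the encoding can break, assuming the databricks-sql-connector is used on the API side; host, path, token, and table are placeholders:

```python
from databricks import sql
import json

with sql.connect(
    server_hostname="<workspace-host>",      # assumption
    http_path="<warehouse-http-path>",       # assumption
    access_token="<personal-access-token>",  # assumption
) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT name FROM my_schema.my_table")  # assumption
        rows = [r[0] for r in cur.fetchall()]

# ensure_ascii=False keeps characters like 'Ä' as UTF-8 rather than escaping
# them; the API response should also declare Content-Type ... charset=utf-8.
payload = json.dumps({"names": rows}, ensure_ascii=False).encode("utf-8")
```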
If Databricks Support/Product Management is following the forum, note that the Simba PDF for 2.6.28 does not discuss the name-value pairs in the above solution. Other errata include PreparedMetadataLimitZero.