Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Let's assume a table contains more than 40 columns. We know it automatically collects stats for the first 32 columns. If we run a Z-order on a particular column (let's say column 1), will the log file collect stats for all 32 columns or wi...
@Shubhadip Ghosh : Hope this helps. In Delta Lake, when you perform Z-Ordering on a particular column, it reorganizes the data within the files based on the values of that column. However, Z-Ordering itself does not directly affect the statistics co...
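As a rough illustration of the point about statistics, here is a minimal sketch in plain Python (no Spark required) of which columns would be covered by file-level stats. It assumes a flat schema and that stats follow schema order up to the `delta.dataSkippingNumIndexedCols` table property (default 32, with -1 meaning all columns). The helper names `stats_columns` and `zorder_has_stats` are hypothetical and exist only for this example.

```python
# Sketch only: models the "first N columns get stats" rule for a flat schema.
# Assumption: N comes from the table property delta.dataSkippingNumIndexedCols
# (default 32; -1 means collect stats for every column).

DEFAULT_NUM_INDEXED_COLS = 32


def stats_columns(schema, num_indexed_cols=DEFAULT_NUM_INDEXED_COLS):
    """Return the columns that would have min/max stats collected."""
    if num_indexed_cols < 0:
        return list(schema)
    return list(schema[:num_indexed_cols])


def zorder_has_stats(schema, zorder_cols, num_indexed_cols=DEFAULT_NUM_INDEXED_COLS):
    """Map each Z-order column to whether it falls inside the stats range."""
    covered = set(stats_columns(schema, num_indexed_cols))
    return {col: col in covered for col in zorder_cols}


# A hypothetical 40-column table: col1 ... col40.
schema = [f"col{i}" for i in range(1, 41)]
coverage = zorder_has_stats(schema, ["col1", "col40"])
print(coverage)
```

In this model, Z-ordering on `col1` benefits from data skipping because `col1` is within the first 32 columns, while a column such as `col40` would need to be moved earlier in the schema or the property raised before Z-ordering on it pays off.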
Let's say there is a Delta table partitioned by a date field. Using a WHERE condition, we delete all the rows in a given partition. New data is then inserted for the same date. If we do a Z-order after inserting the d...
@Shubhadip Ghosh : In Delta Lake, when you perform a delete operation on a table, it doesn't physically remove the data from the files. Instead, it marks the affected rows for deletion by adding a tombstone marker to the Delta transaction log. This e...
Presenting the top 3 members who contributed to the Community last week (11th June to 17th June): @Tyler Heflin, @Werner Stinckens, and @Bharathan K. We would like to express our gratitude for your participation and dedication in the Databricks Commun...
Hi, we have several clusters that keep giving this error: "Failure starting repl. Try detaching and re-attaching the notebook." All the investigation I've done points to this issue being related to the number of concurrent connections but we only have 1 ...
@Aviral Bhardwaj thanks, this seemed to fix the issue. We had an init script that was potentially conflicting with UI-set libraries (in cluster settings).
Azure subscription: disabled. Databricks subscription: free trial, 13 days left. Databricks host: Azure. The cluster is not getting created as my Azure subscription has been disabled after a month of free trial. However, the Databricks subscription has still got...
Hi @Aanchal Soni, we haven't heard from you since the last response from @Tyler Retzlaff, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to o...
Hi @Pavan Kumar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...
Hello, I am trying to serve a model endpoint (using the Databricks GUI) for a model that was successfully logged to the Model Registry. However, the endpoint creation failed with the following errors: Endpoint logs with error messages. Endpoint events with...
Hi @Nikhil Gajghate, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to o...
I attended a Databricks training last year to gain knowledge and help clients, but I later learned that vouchers are also available, for which a survey needs to be completed. I have now completed it. I have already given some of the exam...
Hi @Shishir Shivhare, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...
Hello, I'm developing a DLT pipeline, configured in continuous mode. I'm still in dev mode, so I stop my pipeline when I'm not working on it. My problem is that the pipeline is frequently started by SERVICE_UPGRADE. Example of message: 'Update xxxxx starte...
Hi! I have a problem. I'm using an autoloader to ingest data from raw to a Delta Lake, but when my pipeline starts, I want to apply the pipeline only to the new data. The autoloader ingests data into the Delta Lake, but now, how can I distinguish the...
Hi @Alejandro Piury Pinzón, we haven't heard from you since the last response from @Tyler Retzlaff, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be he...
According to the documentation, the use of external locations is preferred over the use of mount points. Unfortunately, the basic functionality to manipulate files seems to be missing. This is my scenario: create a download folder in an external locatio...
The main problem was related to the network configuration of the storage account: Databricks did not have access. Quite strange that it did manage to create folders... Currently dbutils.fs functionality is working. For the zipfile manipulation: that on...
Hey, I receive data in a wide table format in a CSV file, where each sensor has its own column. I want to store it in a Delta Live streaming table. But this layout is inefficient in both processing and storage space, because the sampling frequency and the number of sensors vary, so I want to tran...
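The question is cut off, so the following is only a guess at the intent: a minimal plain-Python sketch of reshaping a wide sensor CSV into a long layout with one row per reading, skipping empty cells. The column names (`timestamp`, `sensor_a`, `sensor_b`), the sample data, and the helper `wide_to_long` are all hypothetical, and a real pipeline would do the equivalent reshape inside Spark or DLT rather than with the `csv` module.

```python
import csv
import io


def wide_to_long(csv_text, key_column="timestamp"):
    """Turn one-column-per-sensor rows into one row per (key, sensor, value)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    long_rows = []
    for record in reader:
        key = record[key_column]
        for name, value in record.items():
            # Skip the key column itself and any sensor with no reading.
            if name == key_column or value is None or value == "":
                continue
            long_rows.append({key_column: key, "sensor": name, "value": float(value)})
    return long_rows


# Hypothetical sample: sensor_b reports less often than sensor_a.
sample = (
    "timestamp,sensor_a,sensor_b\n"
    "2023-06-01T00:00:00,1.5,\n"
    "2023-06-01T00:00:01,1.6,20.0\n"
)

for row in wide_to_long(sample):
    print(row)
```

The long layout stores nothing for sensors that did not report at a given timestamp, which is where the storage saving comes from when frequencies differ.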
Hi @Simen Småriset, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...
I have successfully used the VSCode extension for Databricks to run a notebook on a cluster from my IDE. However, in order to test effectively without changing the source, I need a way to pass parameters to the workflow job. I have tried various ways ...
When we try to do the above, I am able to get the list of schemas. But when I select one to ingest, we get an error because it tries to access system.lineage.table_lineage. When I look in the System catalog I can only see a schema called inf...