Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
When we try to do the above, I am able to get the list of schemas. But when I select one to ingest, we get an error because it tries to access system.lineage.table_lineage. When I look in the System catalog I can only see a schema called inf...
Hi everyone, I was wondering whether any of you could tell me which kinds of outputs are kept in a notebook after the cluster to which it is attached is terminated... Actually, I am asking especially because I lost some visualization that I ...
I need to update a single row in an on-prem Oracle table via a JDBC connection. Please note, I don't want to append, I just have to update a row. Is it possible?
I was using StringIndexer. While fitting and transforming I didn't get any error, but while running the show function I get an error. The error is below: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 45.0 failed...
When I go to add data, I see that the Starter Warehouse Pro cluster spun up after the first use and has been there for a long time. It does not show in my clusters and I can't find a way to shut it down. Am I being charged for this? If so, how do I s...
Hi All. I have a scenario where there are a few .sql scripts present in my repo. Is there any way we can execute those SQL scripts on Databricks via an Azure DevOps CI/CD pipeline? Please help.
Hi @Divyansh Jain Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...
Here's my use case: I'm migrating out of an old DWH into Databricks. When moving dimension tables into Databricks, I'd like the old SKs (surrogate keys) to be maintained, while creating the SK column as an IDENTITY column, so new dimension values get a...
We are having some issues with merge performance, so I read a bit of the documentation and found this section: https://docs.databricks.com/delta/tune-file-size.html#autotune-file-size-based-on-workload "Databricks recommends setting the table p...
Hello, I have seen readStream and writeStream used in the gold layer in many places. Is it correct to use readStream and writeStream for the gold layer, knowing that a gold table is not valid for streaming? Is there some logic for when to use readStream/ writeStr...
Hi @Ibrahim ISSOUANI Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
While working on my school's Linux server, I encountered an issue while attempting to install and import Ray in my Jupyter Notebook. I successfully installed the package ray==2.4.0, but encountered an error when trying to import it, specifically stat...
Similar issue: https://stackoverflow.com/questions/76220211/create-new-databricks-cluster-from-adf-linked-service-with-initscripts-from-abfs I am trying to create clusters using an ADF linked service where the cluster is configured with an init script. As...
According to the alert docs (here), HTML tags should work to format messages in a custom template. When I tried using them, however, the alert does not seem to recognize them and just returns the whole text, i.e.
Hi, I tried to deploy a Feature Store packaged model into Delta Live Table using mlflow.pyfunc.spark_udf in Azure Databricks. This model was built by Databricks AutoML with a joined Feature Table inside it. And I'm trying to make prediction using the fol...