Hi @prasad_vaze, thank you for your interest in starting a Databricks user group in Richmond, VA! It's a great initiative to foster collaboration and knowledge sharing among Databricks enthusiasts. I will have my team reach out to you about this.
I have created an ETL pipeline with DLT. My first step is to ingest into a raw Delta table using Auto Loader file notifications. When I have 20k notifications, the pipeline runs well across all stages. But when we have a surge in the number of messages, the pipeline waits...
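A minimal Auto Loader file-notification sketch for this kind of raw ingestion step; the `cloudFiles.useNotifications` and `cloudFiles.maxFilesPerTrigger` option names are documented Auto Loader options, while the paths, table names, and the specific surge cap are placeholder assumptions, not from the post:

```python
# Sketch: ingest raw files into a Delta table with Auto Loader file notifications.
# Paths, table names, and the per-trigger cap are illustrative placeholders.

def autoloader_options(fmt: str = "json") -> dict:
    """Build the cloudFiles options for file-notification mode."""
    return {
        "cloudFiles.format": fmt,
        # Use file-notification mode instead of directory listing.
        "cloudFiles.useNotifications": "true",
        # Cap how many files each micro-batch pulls in during a surge.
        "cloudFiles.maxFilesPerTrigger": "1000",
    }

# On a Databricks cluster (where `spark` is predefined):
# (spark.readStream.format("cloudFiles")
#      .options(**autoloader_options("json"))
#      .load("abfss://landing@myaccount.dfs.core.windows.net/events")
#      .writeStream
#      .option("checkpointLocation", "/checkpoints/raw_events")
#      .toTable("raw.events"))
```

Capping files per trigger is one common way to keep micro-batches bounded when the notification queue surges.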
In Azure Databricks jobs, {{run_id}} and {{parent_run_id}} serve as variables. In jobs with multiple tasks, {{run_id}} aligns with task_run_id, while {{parent_run_id}} matches job_run_id. In single-task jobs, {{parent_run_id}} aligns with task_run_...
I am using a job with a single task and multiple retries. Upon job retry the run_id changes. I tried using {{parent_run_id}} but it never worked, so I switched to val parentRunId = dbutils.notebook.getContext.tags("jobRunOriginalAttempt")
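A Python equivalent of that workaround can be sketched as follows. The tag name `jobRunOriginalAttempt` comes from the post; the fallback logic, the `runId` tag fallback, and the helper name are illustrative assumptions, and the exact conversion of the JVM tag map to a Python dict may vary by runtime:

```python
# Sketch: pick a run id that stays stable across job retries.
# "jobRunOriginalAttempt" is the context tag mentioned in the post above.

def stable_run_id(tags: dict) -> str:
    """Return the original attempt's run id if present, else the current run id."""
    return tags.get("jobRunOriginalAttempt") or tags.get("runId", "")

# In a Databricks Python notebook (where `dbutils` is predefined),
# the context tags can be read roughly like this:
# jvm_tags = dbutils.notebook.entry_point.getDbutils().notebook().getContext().tags()
# parent_run_id = stable_run_id(dict(jvm_tags))
```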
Hi! When I run a notebook on Databricks, it throws the error "'JavaPackage' object is not callable", which points to the pydeequ library: /local_disk0/.ephemeral_nfs/envs/pythonEnv-3abbb1aa-ee5b-48da-aaf2-18f273299f52/lib/python3.8/site-packages/pydeequ/che...
Hi. If you are struggling like I was, these were the steps I followed to make it work:
1. Created a cluster with Runtime 10.4 LTS, which has Spark version 3.2.1 (it should work with more recent runtimes, but be aware of the Spark version).
2. When cre...
Hello, at my company we design an application to analyze data, and we can do so on top of external databases such as Databricks. Our application caches some data in-memory, and to avoid synchronization issues with the data on Databricks we rely heavil...
@NathanE As you said, based on the article below it may not be supported currently (https://docs.databricks.com/en/sql/user/materialized-views.html), but at the same time it looks as though a materialized view is built on top of a table and it is a synchronous operation (when...
Building upon a previous post/topic from one year ago, I am looking for best practices/examples on how to pull data from Azure Boards, specifically from 'Analytics Views', into Databricks for analysis. I have succeeded in doing so with 'Work Items...
Hi @DatabricksNIN , To pull data from Azure Boards and specifically from ‘Analytics Views’ into Databricks for analysis, you can use the Azure DevOps REST API.
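As a hedged sketch of that approach: the data behind Analytics Views is exposed through the Azure DevOps Analytics OData endpoint. The endpoint shape below follows the documented `analytics.dev.azure.com` pattern, but the organization, project, entity, and field names are placeholder assumptions:

```python
import base64

# Sketch: query Azure DevOps Analytics (the data behind Analytics Views) via OData.
# Organization, project, entity, and field names are illustrative placeholders.

ANALYTICS_BASE = "https://analytics.dev.azure.com"

def odata_url(organization: str, project: str, entity: str, select: str) -> str:
    """Build an Analytics OData v3.0-preview query URL."""
    return (f"{ANALYTICS_BASE}/{organization}/{project}"
            f"/_odata/v3.0-preview/{entity}?$select={select}")

def auth_header(pat: str) -> dict:
    """Basic-auth header built from an Azure DevOps personal access token."""
    token = base64.b64encode(f":{pat}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

# On Databricks you could then fetch the JSON with `requests` and load it
# into a DataFrame, e.g.:
# import requests
# resp = requests.get(
#     odata_url("myorg", "myproject", "WorkItems", "WorkItemId,State,CreatedDate"),
#     headers=auth_header(dbutils.secrets.get("my-scope", "ado-pat")))
# df = spark.createDataFrame(resp.json()["value"])
```

Storing the personal access token in a Databricks secret scope (rather than in the notebook) is the usual pattern here.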
Hello everyone! I currently have a DLT pipeline that loads into several Delta Live tables (both streaming tables and materialized views). The end table of my DLT pipeline is a materialized view called "silver.my_view". In a later step I need to join/union/merg...
Hi @erigaud, to read a table produced by a DLT pipeline from a regular (non-shared) cluster, read the published table directly from the pipeline's target schema, for example with spark.read.table.
The pipeline publishes its tables to the target schema in the metastore, so they can be queried like any other table.
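A minimal sketch of reading the published table from a regular cluster; the "silver.my_view" name comes from the thread, while the helper and the union step are illustrative:

```python
# Sketch: read a table published by a DLT pipeline from a regular cluster.
# Once published to the target schema, it is queryable like any other table.

def full_table_name(schema: str, table: str) -> str:
    """Qualify a table name with its schema."""
    return f"{schema}.{table}"

# On a Databricks cluster (where `spark` is predefined):
# df = spark.read.table(full_table_name("silver", "my_view"))
# result = df.unionByName(other_df)  # then join/union/merge as needed
```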
Hi there,I want to add custom JARs to an SQL warehouse (Pro if that matters) like I can in an interactive cluster, yet I don't see a way.Is that a degraded functionality when transitioning to a SQL warehouse, or have I missed something? Thank you.
Hi @JonLaRose,
The ADD JAR command adds a JAR file to the session's list of resources, but it applies to clusters running Databricks Runtime; SQL warehouses do not currently expose a way to install custom JARs, so an interactive cluster is needed for this.
Here’s an example of how to use the ADD JAR command:
ADD JAR /tmp/test.jar;
Th...
Hello, I am trying to connect Power BI Desktop to Azure Databricks (source: a Delta table) by downloading a connection file from Databricks. I see an error message like the one below when I open the connection file with Power BI. Repeated attempts have given th...
Hi @chari, to resolve this issue, I would recommend checking the following:
- Ensure that the connection file you downloaded from Databricks is correct and up-to-date.
- Check if the Databricks server is up and running.
- Verify that the Databricks server...
Hashes are commonly used in SCD2 merges to determine whether data has changed, by comparing the hashes of the new rows in the source with the hashes of the existing rows in the target table. PySpark offers multiple hashing functions, such as: MD5...
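As an illustration of the comparison itself, hash-based change detection can be sketched in plain Python. In PySpark the equivalent would be something like `md5(concat_ws("||", *cols))` or `xxhash64(*cols)`; the separator, column names, and helper names below are assumptions:

```python
import hashlib

# Sketch: hash-based change detection for SCD2, in plain Python.
# Mirrors what md5(concat_ws("||", *cols)) would compute in PySpark.

def row_hash(row: dict, cols: list) -> str:
    """MD5 over the tracked columns, joined with an explicit separator."""
    payload = "||".join(str(row[c]) for c in cols)
    return hashlib.md5(payload.encode()).hexdigest()

def has_changed(source_row: dict, target_row: dict, cols: list) -> bool:
    """A row needs a new SCD2 version when its tracked-column hash differs."""
    return row_hash(source_row, cols) != row_hash(target_row, cols)

# Example:
# old = {"id": 1, "city": "Richmond", "tier": "gold"}
# new = {"id": 1, "city": "Norfolk",  "tier": "gold"}
# has_changed(new, old, ["city", "tier"])  -> True
```

An explicit separator matters: without one, ("ab", "c") and ("a", "bc") would hash identically.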
Hi @Kaniz, thank you for your comprehensive answer. What is your opinion on the trade-off between using a hash like xxHASH64, which returns a LongType column and thus would offer good performance when there is a need to join on the hash column, versus ...
We are using the Databricks Visual Studio plugin to write our Python/Spark code. We are using the upload-file-to-Databricks functionality because our organisation has turned Unity Catalog off. We are now running into a weird bug with custom modules....
Hi, I have a requirement. Databricks is hosted in AWS, and I need to read a Delta table from Power BI. I tried a push dataset but it is not working. Is there any way to connect? We are using Active Directory company-wide.
Hi @alj_a, it is possible to connect Power BI to Delta Lake tables hosted on Databricks on AWS. You can use the Power BI Databricks connector to connect Power BI Desktop to your Databricks clusters and Databricks SQL warehouses.
Here...
How are locks maintained within a Delta Lake? For instance, let's say there are two simple tables, customer_details and orders. Say I am running a job that will insert an order into the orders table for $100 for a specific customerId; it ...
Hi @sriradh,
In Delta Lake, ACID transaction guarantees are provided between reads and writes. This means multiple writers across multiple clusters can modify a table partition simultaneously. Writers see a consistent snapshot view of the table, and...
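Under Delta's optimistic concurrency control, a conflicting commit surfaces as an exception that the writer can retry. A generic retry loop can be sketched as follows; in delta-spark the relevant exception would be something like `ConcurrentAppendException` from `delta.exceptions`, which is an assumption here, as are the helper name and backoff values:

```python
import time

# Sketch: retry a write when an optimistic-concurrency conflict is detected.
# The write function and conflict exception type are supplied by the caller.

def write_with_retry(write_fn, conflict_exc, attempts: int = 3, backoff: float = 1.0):
    """Run write_fn, retrying on the given conflict exception with backoff."""
    for attempt in range(attempts):
        try:
            return write_fn()
        except conflict_exc:
            if attempt == attempts - 1:
                raise  # out of retries: surface the conflict to the caller
            time.sleep(backoff * (2 ** attempt))  # exponential backoff

# With delta-spark, this might look like:
# from delta.exceptions import ConcurrentAppendException
# write_with_retry(
#     lambda: df.write.format("delta").mode("append").saveAsTable("orders"),
#     ConcurrentAppendException)
```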
I need to retrieve the account billable usage from the audit logs. I have enabled diagnostic logs, and it's been 36 hours. While enabling the logs, I selected every possible log in this image. But still I am not able to see the containers for account-level...
Hi @Kaniz, I checked the Azure Monitoring and log delivery documentation; the log delivery is the same as at the workspace level. What is the procedure to enable account-level services in audit logs for Azure?
Hi, I am running a data pipeline in Databricks using the Matillion architecture. I am facing inconsistent events in the silver-to-gold layer when any row is deleted/updated in a partition. Let me explain with an example: I have data in the silver layer with partit...
Thank you Kaniz. Further queries on this:
1. If I have nested partitions, e.g. on department & date (finance->09, finance->10), and I am updating one record in finance->09, will it then update partition finance->10 as well?
2. Is it a good idea to have sm...