Data Engineering

Forum Posts

prasadvaze
by Valued Contributor
  • 2670 Views
  • 2 replies
  • 1 kudos

Resolved! How to start a local/city Databricks user group?

Hello Lindsey, I would like to start a Richmond, VA Databricks user group (chapter). How do I go about doing this?

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @prasad_vaze, Thank you for your interest in starting a Databricks user group in Richmond, VA! It’s a great initiative to foster collaboration and knowledge sharing among Databricks enthusiasts. I will let my team reach out to you on the same.

1 More Replies
Rdipak
by New Contributor II
  • 671 Views
  • 2 replies
  • 0 kudos

Delta live table blocks pipeline autoloader rate limit

I have created an ETL pipeline with DLT. My first step is to ingest into a raw Delta table using Auto Loader file notification. When I have 20k notifications, the pipeline runs well across all stages. But when we have a surge in the number of messages, the pipeline waits...

Latest Reply
kulkpd
Contributor
  • 0 kudos

Did you try the following options: .option('cloudFiles.maxFilesPerTrigger', 10000) or maxBytesPerTrigger?
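A minimal sketch of where the suggested rate-limit options would sit on an Auto Loader stream. `cloudFiles.maxFilesPerTrigger` and `cloudFiles.maxBytesPerTrigger` are Auto Loader options; the helper names, the JSON format, the limit values, and the source path are assumptions made for illustration, not details from the original post.

```python
def autoloader_rate_limit_options(max_files=10000, max_bytes=None):
    """Build Auto Loader options that cap how much each micro-batch ingests."""
    options = {
        "cloudFiles.format": "json",            # assumption: the source files are JSON
        "cloudFiles.useNotifications": "true",  # file notification mode, as in the post
        "cloudFiles.maxFilesPerTrigger": str(max_files),
    }
    if max_bytes is not None:
        # Optional cap on bytes per micro-batch, e.g. "10g"
        options["cloudFiles.maxBytesPerTrigger"] = max_bytes
    return options


def read_raw_stream(spark, source_path, **limits):
    """Return a streaming DataFrame; `spark` is an existing SparkSession.

    Outside DLT you would normally also supply a schema or a
    cloudFiles.schemaLocation; that is omitted here for brevity.
    """
    reader = spark.readStream.format("cloudFiles")
    for key, value in autoloader_rate_limit_options(**limits).items():
        reader = reader.option(key, value)
    return reader.load(source_path)
```

In a DLT pipeline the call to `read_raw_stream` would typically be the body of a function decorated with `@dlt.table`; lowering `max_files` trades ingestion throughput for smaller, more predictable micro-batches.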

1 More Replies
AndrewSilver
by New Contributor II
  • 504 Views
  • 1 replies
  • 1 kudos

Uncertainty on Databricks job variables: {{run_id}}, {{parent_run_id}}.

In Azure's Databricks jobs, {{run_id}} and {{parent_run_id}} serve as variables. In jobs with multiple tasks, {{run_id}} aligns with task_run_id, while {{parent_run_id}} matches job_run_id. In single-task jobs, {{parent_run_id}} aligns with task_run_...

Latest Reply
kulkpd
Contributor
  • 1 kudos

I am using a job with a single task and multiple retries. Upon job retry the run_id changes. I tried using {{parent_run_id}} but it never worked, so I switched to val parentRunId = dbutils.notebook.getContext.tags("jobRunOriginalAttempt")

Direo
by Contributor
  • 5562 Views
  • 3 replies
  • 1 kudos

Resolved! JavaPackage object is not callable - pydeequ

Hi! When I run a notebook on Databricks, it throws the error "'JavaPackage' object is not callable", which points to the pydeequ library: /local_disk0/.ephemeral_nfs/envs/pythonEnv-3abbb1aa-ee5b-48da-aaf2-18f273299f52/lib/python3.8/site-packages/pydeequ/che...

Latest Reply
JSatiro
New Contributor II
  • 1 kudos

Hi. If you are struggling like I was, these were the steps I followed to make it work:
1 - Created a cluster with Runtime 10.4 LTS, which has spark version 3.2.1 (it should work with more recent runtimes, but be aware of the spark version)
2 - When cre...

2 More Replies
NathanE
by New Contributor II
  • 735 Views
  • 1 replies
  • 1 kudos

Time travel on views

Hello, At my company, we design an application to analyze data, and we can do so on top of external databases such as Databricks. Our application caches some data in memory, and to avoid synchronization issues with the data on Databricks, we rely heavil...

Latest Reply
karthik_p
Esteemed Contributor
  • 1 kudos

@NathanE As you said, based on the article below it may not be supported currently https://docs.databricks.com/en/sql/user/materialized-views.html, but at the same time it looks like a Materialized View is built on top of a table and it is a synchronous operation (when...

DatabricksNIN
by New Contributor II
  • 704 Views
  • 2 replies
  • 0 kudos

Pulling data from Azure Boards (specifically 'Analytics Views') into Databricks

Building upon a previous post/topic from one year ago, I am looking for best practices/examples on how to pull data from Azure Boards, and specifically from 'Analytics Views', into Databricks for analysis. I have succeeded in doing so with 'Work Items...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @DatabricksNIN , To pull data from Azure Boards and specifically from ‘Analytics Views’ into Databricks for analysis, you can use the Azure DevOps REST API.

1 More Replies
erigaud
by Honored Contributor
  • 1505 Views
  • 3 replies
  • 0 kudos

Combining DLT and workflow - MATERIALIZED_VIEW_OPERATION_NOT_ALLOWED

Hello everyone! I currently have a DLT pipeline that loads into several Delta Live Tables (both streaming and materialized view). The end table of my DLT pipeline is a materialized view called "silver.my_view". In a later step I need to join/union/merg...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @erigaud , To read a table from a DLT pipeline with a regular non-shared cluster, you can use the dlt.table function in Databricks.  This function reads data from a table registered in the Hive metastore.

2 More Replies
JonLaRose
by New Contributor III
  • 1063 Views
  • 2 replies
  • 0 kudos

Adding custom Jars to SQL Warehouses

Hi there, I want to add custom JARs to a SQL warehouse (Pro, if that matters) like I can in an interactive cluster, yet I don't see a way. Is that a degraded functionality when transitioning to a SQL warehouse, or have I missed something? Thank you.

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @JonLaRose, You can add custom JARs to an SQL warehouse in Databricks. The ADD JAR command is used to add a JAR file to the list of resources in Databricks Runtime. Here’s an example of how to use the ADD JAR command: ADD JAR /tmp/test.jar; Th...

1 More Replies
chari
by Contributor
  • 2745 Views
  • 3 replies
  • 1 kudos

Can't connect Power BI Desktop to Azure Databricks

Hello, I am trying to connect Power BI Desktop to Azure Databricks (source: Delta table) by downloading a connection file from Databricks. I see an error message like the one below when I open the connection file with Power BI. Repeated attempts have given th...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @chari, To resolve this issue, I would recommend checking the following: Ensure that the connection file you downloaded from Databricks is correct and up-to-date. Check if the Databricks server is up and running. Verify that the Databricks server...

2 More Replies
Michael_Appiah
by New Contributor III
  • 4445 Views
  • 3 replies
  • 1 kudos

Resolved! Hashing Functions in PySpark

Hashes are commonly used in SCD2 merges to determine whether data has changed by comparing the hashes of the new rows in the source with the hashes of the existing rows in the target table. PySpark offers multiple different hashing functions like: MD5...
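The hash comparison described above can be sketched in plain Python with `hashlib`, so that it runs without a cluster. This is an illustration under stated assumptions, not the poster's code: the helper names, the sample columns, and the choice of SHA-256 are hypothetical. In PySpark the same idea is usually expressed with `concat_ws` combined with `md5` or `sha2`, or with `xxhash64`.

```python
import hashlib

SEPARATOR = "\x1f"   # unit separator, so ("a", "bc") and ("ab", "c") hash differently
NULL_TOKEN = "\x00"  # stands in for None, so None and "" hash differently


def row_hash(row, columns):
    """Hash the tracked columns of one row (a dict) into a hex digest."""
    parts = [NULL_TOKEN if row.get(c) is None else str(row[c]) for c in columns]
    return hashlib.sha256(SEPARATOR.join(parts).encode("utf-8")).hexdigest()


def changed_keys(source_rows, target_hashes, key, columns):
    """Return keys that are new or whose tracked columns no longer match the stored hash."""
    return [
        r[key] for r in source_rows
        if target_hashes.get(r[key]) != row_hash(r, columns)
    ]


tracked = ["name", "city"]
target = {1: row_hash({"id": 1, "name": "Ann", "city": "Oslo"}, tracked)}
source = [
    {"id": 1, "name": "Ann", "city": "Bergen"},  # city changed
    {"id": 2, "name": "Bo", "city": None},       # not in the target yet
]
print(changed_keys(source, target, "id", tracked))  # [1, 2]
```

The explicit separator and null token matter more than the choice of hash function: without them, different rows can concatenate to the same string and be treated as unchanged.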

Latest Reply
Michael_Appiah
New Contributor III
  • 1 kudos

Hi @Kaniz, thank you for your comprehensive answer. What is your opinion on the trade-off between using a hash like xxHASH64, which returns a LongType column and thus would offer good performance when there is a need to join on the hash column, versus ...

2 More Replies
VtotheG
by New Contributor
  • 752 Views
  • 0 replies
  • 0 kudos

Problem Visual Studio Plugin with custom modules

We are using the Databricks Visual Studio plugin to write our Python/Spark code. We are using the upload-file-to-Databricks functionality because our organisation has turned Unity Catalog off. We are now running into a weird bug with custom modules....

Data Engineering
databricks visual studio plug in
visual studio code
alj_a
by New Contributor III
  • 736 Views
  • 1 replies
  • 1 kudos

Connect to Databricks Delta Lake hosted in AWS from Power BI - conn str/push dataset

Hi, I have a requirement. Databricks is hosted in AWS, and I need to read a Delta table from Power BI. I tried a push dataset, but it is not working. Is there any way to connect? We are using Active Directory company-wide.

Data Engineering
aws databrics
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @alj_a, it is possible to connect Power BI to Delta Lake tables hosted on Databricks on AWS. You can use the Azure Databricks Power BI connector to connect Power BI Desktop to your Azure Databricks clusters and Databricks SQL warehouses. Here...

sriradh
by New Contributor
  • 719 Views
  • 1 replies
  • 0 kudos

Resolved! ACID properties in delta?

How are locks maintained within a Delta Lake? For instance, let's say there are 2 simple tables, customer_details and orders. Let's say I am running a job that will insert an order in the orders table for, say, $100 for a specific customerId, it ...

Data Engineering
acid
delta
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @sriradh, In Delta Lake, ACID transaction guarantees are provided between reads and writes. This means multiple writers across multiple clusters can modify a table partition simultaneously. Writers see a consistent snapshot view of the table, and...
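Delta Lake relies on optimistic concurrency control over a versioned transaction log rather than on row or table locks. The toy below only illustrates that general idea of snapshot reads plus validate-on-commit; the class and method names are hypothetical, and it is deliberately stricter than Delta Lake, which checks whether concurrent commits actually conflict and normally lets independent appends both succeed.

```python
class CommitConflict(Exception):
    """Raised when a writer's snapshot is older than the table's current version."""


class ToyVersionedTable:
    """Toy table that keeps every committed version (illustration only)."""

    def __init__(self):
        self.versions = [[]]  # version 0 is the empty table

    def snapshot(self):
        """Return (version number, copy of rows); readers never see partial writes."""
        version = len(self.versions) - 1
        return version, list(self.versions[version])

    def commit(self, read_version, new_rows):
        """Publish a new version only if nobody committed since `read_version`."""
        current = len(self.versions) - 1
        if read_version != current:
            raise CommitConflict(f"read v{read_version}, table is at v{current}")
        self.versions.append(self.versions[current] + list(new_rows))
        return current + 1


table = ToyVersionedTable()
v_a, _ = table.snapshot()  # writer A reads version 0
v_b, _ = table.snapshot()  # writer B reads version 0
table.commit(v_a, [{"customerId": 1, "amount": 100}])  # A commits version 1
try:
    table.commit(v_b, [{"customerId": 1, "amount": 50}])  # B's snapshot is stale
except CommitConflict:
    v_b, _ = table.snapshot()  # B re-reads the latest snapshot and retries
    table.commit(v_b, [{"customerId": 1, "amount": 50}])
```

No writer ever blocks another here: the cost of a conflict is a retry against a fresher snapshot, which is the trade-off optimistic schemes make in exchange for lock-free reads.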

RiyuLite
by New Contributor III
  • 1380 Views
  • 7 replies
  • 4 kudos

Where do I get Account level logs after enabling diagnostic logs for Azure databricks?

I need to retrieve the accountBillage usage from Audit logs. I have enabled Diagnostic logs, and it's been 36 hours. While enabling the logs, I selected every possible log in this image. But I am still not able to see the containers for account level...

RiyuLite_3-1696492947786.png RiyuLite_2-1696492911080.png
Latest Reply
RiyuLite
New Contributor III
  • 4 kudos

Hi @Kaniz, I checked the Azure Monitoring and log delivery documentation. The log delivery is the same as at workspace level. What is the procedure to enable account-level services in audit logs for Azure?

6 More Replies
sanjay
by Valued Contributor II
  • 1082 Views
  • 3 replies
  • 4 kudos

Resolved! Trigger Events in data pipeline

Hi, I am running a data pipeline in Databricks using matillion architecture. I am facing inconsistent events in the silver-to-gold layer when any row is deleted/updated from a partition. Let me explain with an example, e.g. I have data in the silver layer with partit...

Latest Reply
sanjay
Valued Contributor II
  • 4 kudos

Thank you Kaniz. Further queries on this. 1. If I have nested partitions, e.g. on department & date, finance->09, finance->10, and I am updating one record in finance->09, will it update partition finance->10 as well? 2. Is it a good idea to have sm...

2 More Replies