Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Starki
by New Contributor III
  • 2535 Views
  • 3 replies
  • 2 kudos

StreamingQueryListener onQueryTerminated in Databricks Job

I am defining a StreamingQueryListener that collects metrics on my Spark Structured Streaming tasks and sends them to a Prometheus Pushgateway. When the job is terminated, I want to use onQueryTerminated to clean up the metrics for each job from th...

Data Engineering
onQueryTerminated
StreamingQueryListener
Latest Reply
shan_chandra
Esteemed Contributor
  • 2 kudos

@Starki - Per the documentation, StreamingQueryListener.onQueryTerminated is called when the query is stopped, e.g., by StreamingQuery.stop(), and each of these Python observable APIs works asynchronously. https://www.databricks.com/blog/2022/05/27/how-to-moni...

2 More Replies
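The cleanup pattern discussed in this thread can be sketched in plain Python. The class below only mirrors the callbacks of pyspark.sql.streaming.StreamingQueryListener (on Databricks you would subclass the real class), and an in-memory dict stands in for the Prometheus Pushgateway, so the name `MetricsCleanupListener` and the event shape are illustrative assumptions:

```python
# Minimal sketch of the onQueryTerminated cleanup pattern. The registry
# dict stands in for metrics pushed to a Prometheus Pushgateway; on
# Databricks you would subclass pyspark.sql.streaming.StreamingQueryListener
# and call prometheus_client's delete_from_gateway() instead.

class MetricsCleanupListener:
    """Mirrors the StreamingQueryListener callbacks we care about."""

    def __init__(self, registry):
        # registry maps query id -> dict of last pushed metrics
        self.registry = registry

    def onQueryProgress(self, event):
        # record the latest metrics for this query
        self.registry[event["id"]] = {"numInputRows": event["numInputRows"]}

    def onQueryTerminated(self, event):
        # drop every metric pushed for the terminated query;
        # pop(..., None) keeps this safe if the id was never recorded
        self.registry.pop(event["id"], None)


registry = {}
listener = MetricsCleanupListener(registry)
listener.onQueryProgress({"id": "q1", "numInputRows": 42})
listener.onQueryTerminated({"id": "q1"})
print(len(registry))  # 0 once the query's metrics are cleaned up
```

Because the listener callbacks run asynchronously (as noted in the reply above), the cleanup in onQueryTerminated should be idempotent, which is why the sketch uses pop with a default.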
User16765131552
by Contributor III
  • 1786 Views
  • 2 replies
  • 0 kudos

Resolved! Disable welcome emails

Is it possible to disable the welcome emails that go to users when they are added to the workspace?

Latest Reply
User16765131552
Contributor III
  • 0 kudos

I have found that it is possible to suppress welcome emails if users are added via the API using flags.

1 More Replies
Ravikumashi
by Contributor
  • 1648 Views
  • 2 replies
  • 1 kudos

maven libraries installation issue on 11.3/12.2 LTS

We've encountered an issue while attempting to install Maven libraries on Databricks clusters running 11.3 LTS. Specifically, we are encountering SSL handshake errors during the installation process. It's worth noting that these same libraries install w...

Latest Reply
" src="" />
This widget could not be displayed.
This widget could not be displayed.
This widget could not be displayed.
  • 1 kudos

This widget could not be displayed.
we've encountered issue while attempting to install Maven libraries on Databricks clusters version 11.3 LTS. Specifically, we are encountering SSL handshake errors during the installation process. It's worth noting that these same libraries install w...

This widget could not be displayed.
  • 1 kudos
This widget could not be displayed.
1 More Replies
Loki
by New Contributor III
  • 2924 Views
  • 4 replies
  • 1 kudos

Resolved! Accessing ADLS Gen 2 Raw Files with UC ?

We are using a service principal to access data from raw files such as JSON and CSV. I saw a video suggesting that it could be done via Unity Catalog as well. Could someone comment on this, please?

Latest Reply
donkyhotes
New Contributor II
  • 1 kudos

@Loki wrote: We are using a service principal to access data from raw files such as JSON and CSV. I saw a video suggesting that it could be done via Unity Catalog as well. Could someone comment on this please? That's great! Service principals are a...

3 More Replies
AndLuffman
by New Contributor II
  • 2044 Views
  • 2 replies
  • 1 kudos

QRY Results incorrect but Exported data is OK

I ran the query "Select * from fact_Orders". This presented a lot of garbage: the correct column headers, but the contents were extremely random, e.g. blanks in the key column, VAT rates of 12282384234E-45. When I export to CSV, it presents fi...

Latest Reply
" src="" />
This widget could not be displayed.
This widget could not be displayed.
This widget could not be displayed.
  • 1 kudos

This widget could not be displayed.
I ran a query "Select * from fact_Orders".     This presented a lot of garbage,  The correct column headers, but the contents were extremely random, e.g.  blanks in the key column, VAT rates of 12282384234E-45  . When I export to CSV , it presents fi...

This widget could not be displayed.
  • 1 kudos
This widget could not be displayed.
1 More Replies
romangehrn
by New Contributor II
  • 682 Views
  • 0 replies
  • 0 kudos

speed issue DBR 13+ for R

I got a notebook running on DBR 12.2 with the following R code:

install.packages("microbenchmark")
install.packages("furrr")
library(microbenchmark)
library(tidyverse)

# example tibble
df_test <- tibble(id = 1:100000, street_raw = rep("Bahnhofs...

Data Engineering
DBR 13
performance slow
R
speed error
210573
by New Contributor
  • 2496 Views
  • 3 replies
  • 2 kudos

Unable to stream from google pub/sub

I am trying to run the code below to subscribe to a Pub/Sub topic, but it is throwing this exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/DataSourceV2. I have tried using all versions of https://mvnrepository.com/artifact/com.google...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 2 kudos

Hi @210573, Databricks now supports Pub/Sub streaming natively, so you can start using it for your use case. For more info, visit the official URL: PUB/SUB with Databricks

2 More Replies
vonjack
by New Contributor II
  • 1750 Views
  • 2 replies
  • 0 kudos

How to unload a Jar for a UDF without restarting the Spark context?

In a Scala notebook on Databricks, I created a temporary function with a certain Jar and class name. Then I want to update the Jar. But without restarting the context, I cannot reload the new Jar; the temporary function always reuses the old classes....

Latest Reply
" src="" />
This widget could not be displayed.
This widget could not be displayed.
This widget could not be displayed.
  • 0 kudos

This widget could not be displayed.
In the scala notebook of databricks, I created a temporary function with a certain Jar and class name. Then I want to update the Jar. But without restart the context, I can not reload the new Jar, the temporary function always reuses the old classes....

This widget could not be displayed.
  • 0 kudos
This widget could not be displayed.
1 More Replies
sparkrookie
by New Contributor II
  • 1551 Views
  • 2 replies
  • 0 kudos

Structured Streaming Delta Table - Reading and writing from same table

Hi, I have a structured streaming job that reads from a delta table "A" and pushes to another delta table "B". A schema: group_key, id, timestamp, value. B schema: group_key, watermark_timestamp, derived_value. One requirement is that I need to get the m...

Latest Reply
KarenGalvez
New Contributor III
  • 0 kudos

Navigating the intricacies of structured streaming and Delta table operations on the same platform has been a stimulating yet demanding task. The community at Databricks has been instrumental in clarifying nuances. As I delve deeper, I'm reminded of ...

1 More Replies
shraddharane
by New Contributor
  • 22803 Views
  • 1 reply
  • 1 kudos

Migrating legacy SSAS cube to databricks

We have a SQL database designed in a star schema. We are migrating data from SQL to Databricks. There are cubes designed using SSAS. These cubes are used by end users in Excel for analysis purposes. We are now looking for a solution for: 1) Can...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Databricks itself does not deliver semantic models like SSAS cubes, so Databricks cannot migrate them because there is nothing to migrate to. However, there are some options: - use PowerBI instead of SSAS (there might even be a migrate option?). W...

140015
by New Contributor III
  • 1962 Views
  • 3 replies
  • 1 kudos

Resolved! Using DLT pipeline with non-incremental data

Hi, I would like to know what you think about using Delta Live Tables when the source for the pipeline is not incremental. What I mean by that is: suppose the data provider creates a new folder with files for me each time it has an update to the...

Latest Reply
Joe_Suarez
New Contributor III
  • 1 kudos

When dealing with B2B data building, the process of updating and managing your data can present unique challenges. Since your data updates involve new folders with files and you need to process the entire new folder, the concept of incremental proces...

2 More Replies
GNarain
by New Contributor II
  • 5912 Views
  • 7 replies
  • 4 kudos

Resolved! Is there an API call to set the "Table access control" workspace config?

Is there an API call to set the "Table access control" workspace config?

Latest Reply
SvenPeeters
New Contributor III
  • 4 kudos

Facing the same issue; I tried to fetch the current value via /api/2.0/workspace-conf?keys=enableTableAccessControl. Unfortunately this returns a 400:

{ "error_code": "BAD_REQUEST", "message": "Invalid keys: [\"enableTableAccessControl\"]" }

6 More Replies
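The workspace-conf endpoint discussed in this thread is a plain key/value PATCH. The sketch below only assembles the request pieces without sending anything; the host name is a placeholder, and, as the reply above shows, whether "enableTableAccessControl" is an accepted key varies by workspace, so treat the key as an assumption:

```python
import json

# Sketch: assemble a PATCH to /api/2.0/workspace-conf. The host and key
# are illustrative; the thread above shows some workspaces reject
# "enableTableAccessControl" with a 400 "Invalid keys" error.

def workspace_conf_request(host: str, key: str, value: str):
    """Return the (url, json_body) pair for a workspace-conf PATCH."""
    url = f"https://{host}/api/2.0/workspace-conf"
    body = json.dumps({key: value})
    return url, body


url, body = workspace_conf_request(
    "adb-1234567890123456.7.azuredatabricks.net",  # placeholder host
    "enableTableAccessControl",
    "true",
)
print(url)
print(body)
# Actually sending it (not done here) would look like:
#   requests.patch(url, headers={"Authorization": f"Bearer {token}"}, data=body)
```

Fetching the current value first with GET /api/2.0/workspace-conf?keys=<key>, as the reply above does, is a reasonable way to probe whether a key is supported before patching it.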
Eldar_Dragomir
by New Contributor II
  • 1592 Views
  • 1 reply
  • 2 kudos

Resolved! Reprocessing the data with Auto Loader

Could you please give me an idea of how I can start reprocessing my data? Imagine I have a folder in ADLS Gen2, "/test", with binaryFiles that were already processed with the current pipeline. I want to reprocess the data and continue receiving new data. What t...

Latest Reply
Tharun-Kumar
Honored Contributor II
  • 2 kudos

@Eldar_Dragomir In order to re-process the data, we have to change the checkpoint directory. This will start processing the files from the beginning. You can use cloudFiles.maxFilesPerTrigger to limit the number of files processed per micro-...

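The recipe in this thread boils down to a fresh checkpoint location plus a per-batch throttle. As a hedged sketch, the helper below just collects the relevant Auto Loader options into a dict (the format and limit values are examples, and the paths in the comments are placeholders), with the streaming calls themselves left as comments since they need a Databricks cluster:

```python
# Sketch: options for re-processing a folder with Auto Loader. Pointing
# the stream at a NEW checkpoint directory makes it start from the
# beginning; cloudFiles.maxFilesPerTrigger throttles each micro-batch.

def reprocess_options(fmt: str, max_files_per_trigger: int) -> dict:
    """Collect Auto Loader options for a full re-read of a source folder."""
    return {
        "cloudFiles.format": fmt,
        # Spark option values are passed as strings
        "cloudFiles.maxFilesPerTrigger": str(max_files_per_trigger),
    }


opts = reprocess_options("binaryFile", 100)
print(opts)

# On Databricks (untested here; paths are placeholders):
#   (spark.readStream.format("cloudFiles").options(**opts).load("/test")
#        .writeStream
#        .option("checkpointLocation", "/checkpoints/test_v2")  # fresh dir
#        .start("/target"))
```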
anarad429
by New Contributor
  • 1351 Views
  • 1 reply
  • 1 kudos

Resolved! Unity Catalog + Reading variable from external notebook

I am trying to run a notebook which reads some of its variables from an external notebook (I used the %run command for that purpose), but it keeps giving me an error that these variables are not defined. These sequences of notebooks run perfectly fine on a...

Latest Reply
Atanu
Esteemed Contributor
  • 1 kudos

I think the issue here is that the variable is not created until a value is assigned to it. So, you may need to assign a value to get_sql_schema.

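The %run pitfall described in this thread is ordinary Python name resolution: a name only exists after an assignment has actually executed, and a %run'd notebook that never runs its defining cell surfaces this as a NameError. A plain-Python illustration, with no notebooks involved (`get_sql_schema` is the name from the reply above):

```python
# A name referenced before any assignment raises NameError, which is what
# a notebook sees when the %run'd notebook never executed the defining cell.

try:
    print(get_sql_schema)  # not assigned anywhere yet
except NameError as exc:
    print("NameError:", exc)

# Once the "external notebook" code has actually executed an assignment,
# the lookup succeeds in the shared namespace:
get_sql_schema = "SELECT * FROM information_schema.tables"
print(get_sql_schema)
```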
