Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AcrobaticMonkey
by New Contributor III
  • 1586 Views
  • 2 replies
  • 0 kudos

Alerts for Failed Queries in Databricks

How can we set up automated alerts to notify us when queries executed by a specific service principal fail in Databricks?

Latest Reply
AcrobaticMonkey
New Contributor III
  • 0 kudos

@Alberto_Umana Our service principal uses the SQL Statement API to execute queries. We want to receive a notification for each query failure. While SQL Alerts are an option, they do not provide immediate responses. Is there a better solution to achieve...
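
One possible direction, sketched under assumptions rather than taken from the thread: because the service principal already calls the SQL Statement Execution API, the calling code could inspect the `status.state` field of each response and send a notification as soon as a statement ends in a failure state. The `notify` callback and the helper names below are hypothetical; the response fields (`statement_id`, `status.state`, `status.error`) are assumed to follow the Statement Execution API response shape.

```python
from typing import Callable, Optional

# Terminal states treated as failures in this sketch.
FAILURE_STATES = {"FAILED", "CANCELED"}


def failure_reason(response: dict) -> Optional[str]:
    """Return a short failure description, or None if the statement did not fail."""
    status = response.get("status", {})
    state = status.get("state")
    if state not in FAILURE_STATES:
        return None
    error = status.get("error") or {}
    code = error.get("error_code", "UNKNOWN")
    message = error.get("message", "no message provided")
    return f"{state}: {code}: {message}"


def check_and_notify(response: dict, notify: Callable[[str], None]) -> bool:
    """Call notify() once if the statement failed; return True if a notification was sent."""
    reason = failure_reason(response)
    if reason is None:
        return False
    statement_id = response.get("statement_id", "<unknown>")
    notify(f"Statement {statement_id} {reason}")
    return True
```

The `notify` callback could post to a webhook, a chat channel, or an email relay; that choice is left open here.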

  • 0 kudos
1 More Replies
seanstachff
by New Contributor II
  • 1783 Views
  • 5 replies
  • 0 kudos

Resolved! Using FROM_CSV giving unexpected results

Hello, I am trying to use from_csv in the SQL warehouse, but I am getting unexpected results. As a small example, I am running: WITH your_table AS ( SELECT 'a,b,c\n1,"hello, world",3.14\n2,"goodbye, world",2.71' AS csv_column ) SELECT from_csv(csv_c...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

@seanstachff Here is the code I used to produce the results shown in the image I shared earlier. It's a bit verbose, so I'm not entirely satisfied with it, but I hope it provides some helpful insights. %sql WITH your_table AS ( -- Examp...

  • 0 kudos
4 More Replies
martkev
by New Contributor III
  • 4751 Views
  • 1 replies
  • 0 kudos

Networking Setup in Standard Tier – VNet Integration and Proxy Issues

Hi everyone, we are working on an order forecasting model using Azure Databricks and an ML model from Hugging Face, and we are running into an issue where the connection over SSL (port 443) fails during the handshake (EOF Error SSL 992). We suspect that a...

Latest Reply
arjun_kr
Databricks Employee
  • 0 kudos

It may depend on your UDR setup. If you have a UDR rule routing the traffic to a firewall appliance, the firewall may not be allowing the traffic. If there is no UDR, or the UDR rule routes this traffic to the Internet, it wou...

  • 0 kudos
Ulman
by New Contributor II
  • 7332 Views
  • 9 replies
  • 1 kudos

Switching to File Notification Mode with ADLS Gen2 - Encountering StorageException

Hello, we are currently using Auto Loader in file listing mode for a stream, which is experiencing significant latency due to the non-incremental naming of files in the directory, a condition that cannot be altered. In an effort to mitigate this...

Data Engineering
ADLS gen2
autoloader
file notification mode
Latest Reply
Rah_Cencora
New Contributor II
  • 1 kudos

You should also reevaluate your use of premium storage for your landing area files. Typically, storage for raw files does not need to be the fastest, most resilient, and most expensive tier. Unless you have a compelling reason for premium storage for la...

  • 1 kudos
8 More Replies
angelop
by New Contributor
  • 1099 Views
  • 1 replies
  • 0 kudos

Databricks Clean Rooms creation

I am trying to create a Databricks Clean Rooms instance, following the video from the Databricks YouTube channel. As I only have one workspace, I added my own Clean Room sharing identifier to create the clean room. When I do that, I get the ...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

@angelop I tried it as well and encountered the same error. A new collaborator needs to be set up. If that's not feasible, I would advise reaching out to Databricks support. By the way, the following video provides a more detailed explanation a...

  • 0 kudos
The_Demigorgan
by New Contributor
  • 2088 Views
  • 1 replies
  • 0 kudos

Autoloader issue

I'm trying to ingest data from Parquet files using Auto Loader. I have my own custom schema and don't want to infer the schema from the Parquet files. During readStream everything is fine. But during writeStream, it is somehow inferring the schema from...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

In this case, please make sure you specify the schema explicitly when reading the Parquet files and do not specify any inference options. Something like spark.readStream.format("cloudFiles").schema(schema)... If you want to more easily grab the schem...
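
The advice above could look roughly like the following minimal sketch. The function name `build_parquet_stream`, the path, and the example schema string are illustrative assumptions, not taken from the thread; the schema is passed as a DDL string and no inference options are set.

```python
def build_parquet_stream(spark, source_path: str, schema_ddl: str):
    """Build an Auto Loader reader with a fixed schema and no inference options.

    `spark` is an existing SparkSession (for example, the one provided in a
    Databricks notebook). `schema_ddl` is a DDL string such as
    "id BIGINT, name STRING" (hypothetical columns).
    """
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "parquet")  # source files are Parquet
        .schema(schema_ddl)                      # explicit schema, nothing inferred
        .load(source_path)
    )
```

Usage, assuming a notebook-provided `spark` session: `df = build_parquet_stream(spark, "/path/to/landing", "id BIGINT, name STRING")`.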

  • 0 kudos
vmpmreistad
by New Contributor II
  • 9632 Views
  • 4 replies
  • 0 kudos

How to make structured streaming with autoloader efficiently and incrementally read files

TL;DR: How do I make a structured streaming job using Auto Loader read files using InMemoryFileIndex instead of DeltaFileOperations? I'm running a structured streaming job from an external storage account (ADLS Gen2, abfss://) which has avro fil...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

@Wundermobility  The best way to debug this is to look at the Spark UI to see if a job has been launched. One thing to call out is that trigger.Once is deprecated - we recommend using trigger.availableNow instead to avoid overwhelming the cluster.
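
As a minimal sketch of the recommended trigger, assuming a streaming DataFrame `df` already exists and that the sink is a Delta path (the function name, target path, and checkpoint path are hypothetical):

```python
def start_available_now(df, target_path: str, checkpoint_path: str):
    """Start a streaming write that processes all available data, then stops.

    Uses trigger(availableNow=True) instead of the deprecated trigger(once=True).
    """
    return (
        df.writeStream
        .format("delta")
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)
        .start(target_path)
    )
```

The returned streaming query can be awaited with `awaitTermination()` if the caller needs to block until the backlog has been processed.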

  • 0 kudos
3 More Replies
jonathan-dufaul
by Valued Contributor
  • 813 Views
  • 1 replies
  • 0 kudos

where to report a bug in the sql formatter?

I was wondering where I go to report a bug in the sql formatter. I tried sending an email to the helpdesk but they think I'm asking for support. I'm not. I just want to report a bug in the application because I think they should know about it. I don'...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

Hi @jonathan-dufaul, for such reports, I think it would be appropriate to click your profile icon in the top-right corner of the workspace and use the "Send Feedback" option.

  • 0 kudos
jonathan-dufaul
by Valued Contributor
  • 3850 Views
  • 2 replies
  • 1 kudos

How do I specify column types when writing to a MSSQL server using the JDBC driver (

I have a PySpark DataFrame that I'm writing to an on-prem MSSQL server. It's a stopgap while we convert data warehousing jobs over to Databricks. The processes that use those tables in the on-prem server rely on the tables maintaining the identical s...

Latest Reply
dasanro
Databricks Partner
  • 1 kudos

It's happening to me too! Did you find any solution, @jonathan-dufaul? Thanks!

  • 1 kudos
1 More Replies
Yaadhudbe
by New Contributor II
  • 1004 Views
  • 1 replies
  • 0 kudos

AWS Databricks: Out of Memory issue in Delta Live Tables

I have been using Delta Live Tables for more than a year and have implemented a good number of DLT pipelines that ingest data from an S3 bucket using SQS. One of my pipelines processes a large volume of data. The DLT pipeline reads the data using CloudFiles...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Yaadhudbe, we would need to review your DLT setup, cluster settings, and Spark processing to better understand the OOM errors and suggest ways to mitigate the issue. I suggest filing a case with us so we can conduct a proper investigation. http...

  • 0 kudos
Maatari
by New Contributor III
  • 3762 Views
  • 3 replies
  • 0 kudos

AvailableNow Trigger and failure

Hi, I wonder what the expected behavior of Spark Structured Streaming is when using the AvailableNow trigger and a failure occurs during the query. More specifically, what happens to the initial end offset that was set? Does it change? Wh...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The AvailableNow trigger processes all available data as a single batch and then stops. This is different from continuous or micro-batch processing where the system continuously checks for new data. When a query starts with the AvailableNow trigger, ...

  • 0 kudos
2 More Replies
Balram-snaplogi
by New Contributor II
  • 2128 Views
  • 1 replies
  • 1 kudos

How can we customize the access token expiry duration?

Hi, I am using OAuth machine-to-machine (M2M) authentication. I created a service principal and wrote a Java application that allows me to connect to the Databricks warehouse. My question is regarding the code below: String url = "jdbc:databricks://<se...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

I would say that your token should be manually refreshed as mentioned in the following statement in docs: Databricks tools and SDKs that implement the Databricks client unified authentication standard will automatically generate, refresh, and use Dat...
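
When a client does not use one of those tools, the manual refresh mentioned above could be sketched as a small cache that requests a new token shortly before the current one expires. This is an illustrative sketch only: the `TokenCache` class, the `fetch` callback, and the 60-second safety margin are assumptions, and how `fetch` actually obtains a token (for example, a client-credentials request for the service principal) is deliberately left out.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.time,
        margin_seconds: float = 60.0,
    ):
        # fetch is a hypothetical callback returning (access_token, expires_in_seconds).
        self._fetch = fetch
        self._clock = clock
        self._margin = margin_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a new one if the cached one is near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

The application would call `cache.get()` each time it opens a connection, so a stale token is never reused after its lifetime has elapsed.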

  • 1 kudos
RateVan
by New Contributor II
  • 5685 Views
  • 4 replies
  • 0 kudos

Spark last window doesn't flush in append mode

The problem is very simple: when you use a TUMBLING window with append mode, the window is closed only when the next message arrives (plus watermark logic). In the current implementation, if you stop incoming streaming data, the last window will NEVER...
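
The behavior described above can be illustrated with a small pure-Python simulation. This is a simplified model, not Spark's implementation: it assumes the watermark is the maximum event time seen minus a fixed delay, that a window is emitted once its end is at or below the watermark, and it ignores the dropping of late events. All names and numbers are illustrative.

```python
from typing import Dict, List, Tuple

Window = Tuple[int, int, int]  # (start, end, event_count)


def simulate_append_mode(
    event_times: List[int], window_size: int, watermark_delay: int
) -> Tuple[List[Window], List[Window]]:
    """Return (emitted, still_open) tumbling windows after consuming all events."""
    open_windows: Dict[int, int] = {}
    emitted: List[Window] = []
    max_event_time = None
    for t in event_times:
        # Assign the event to its tumbling window.
        start = (t // window_size) * window_size
        open_windows[start] = open_windows.get(start, 0) + 1
        # The watermark only advances when a new event arrives.
        max_event_time = t if max_event_time is None else max(max_event_time, t)
        watermark = max_event_time - watermark_delay
        # Emit every window whose end is at or below the watermark.
        for s in sorted(open_windows):
            if s + window_size <= watermark:
                emitted.append((s, s + window_size, open_windows.pop(s)))
    still_open = [(s, s + window_size, c) for s, c in sorted(open_windows.items())]
    return emitted, still_open
```

For example, with `event_times=[1, 4, 12, 15, 23]`, `window_size=10`, and `watermark_delay=5`, only the window `(0, 10)` is emitted; the windows `(10, 20)` and `(20, 30)` remain open because no later event arrives to advance the watermark.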

Latest Reply
Dtank
New Contributor II
  • 0 kudos

Do you have any solution for this?

  • 0 kudos
3 More Replies
varunjaincse
by New Contributor III
  • 2286 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks JDBC Driver making "List Column SQL" Query Every Time

I am trying to use the Databricks JDBC Spark Driver to run SQL queries on the SQL Warehouse. Sample connection string: String TOKEN = "<token>"; String HTTP_PATH = "/sql/1.0/warehouses/<sql-warehouse-id>"; final String connStr = "jdbc:spark://discover.clou...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

Hello, thank you for your question. The initial metadata query (the "Listing Columns" query) is tied to the SparkGetColumnsOperation class, which is part of the Apache Hive ThriftServer and Spark's handling of JDBC metadata operations. Can you please...

  • 1 kudos
1 More Replies
roshanjoebenny
by New Contributor III
  • 2612 Views
  • 7 replies
  • 1 kudos

Unity Catalog

When I try to connect my local Postgres database with Databricks Unity Catalog, I run into issues. Could you please explain the steps for doing that?

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Databricks does not have connectivity to your local network out of the box. You should set up a VNet and VNet peering (and also firewall rules). Connect your Azure Databricks workspace to your on-premises network - Azure Databricks | Microsoft Learn Netw...

  • 1 kudos
6 More Replies