Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ajbush
by New Contributor III
  • 10337 Views
  • 6 replies
  • 2 kudos

Connecting to Snowflake using an SSO user from Azure Databricks

Hi all, I'm just reaching out to see if anyone has information or can point me in a useful direction. I need to connect to Snowflake from Azure Databricks using the connector: https://learn.microsoft.com/en-us/azure/databricks/external-data/snowflake T...

Latest Reply
aagarwal
New Contributor II
  • 2 kudos

@ludgervisser We are trying to connect to Snowflake as an Azure AD user through the externalbrowser method, but the browser window doesn't open. Could you please share example code showing how you managed to achieve this, or point us to some documentation? @BobGeo...

5 More Replies
Madman
by New Contributor II
  • 9730 Views
  • 7 replies
  • 6 kudos

Resolved! Snowflake connection to Databricks error

When I try to read a Snowflake table from my Databricks notebook, I get an error. Code: df1.read.format("snowflake").options(**options).option("query", "select * from abc").save() Error: java.sql.SQLException: No suitable dri...

Latest Reply
pdiegop
New Contributor II
  • 6 kudos

@anurag2192 did you manage to solve it?

6 More Replies
alexisjohnson
by New Contributor III
  • 5457 Views
  • 7 replies
  • 6 kudos

Resolved! Window function using last/last_value with PARTITION BY/ORDER BY has unexpected results

Hi, I'm wondering if this is the expected behavior when using last or last_value in a window function? I've written a query like this: select col1, col2, last_value(col2) over (partition by col1 order by col2) as column2_last from values ...

Latest Reply
Carv
Visitor II
  • 6 kudos

For those stumbling across this: it seems LAST_VALUE behaves the same way it does in SQL Server, which does not, in most people's minds, have a proper row/range frame for the window. You can adjust it with the syntax below. I understand l...
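
The behavior discussed in this thread follows from the SQL-standard default frame: when a window has an ORDER BY but no explicit frame, the frame ends at the current row, so last_value returns the current row's value (or that of its last peer). A minimal sketch follows. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, which is an assumption made only to keep the example self-contained (it needs SQLite 3.25 or newer). The table name t and the sample rows are made up, and the explicit frame shown is one common way to adjust the result, not necessarily the syntax the reply goes on to give.

```python
import sqlite3

# In-memory database with a tiny, made-up sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [("a", 1), ("a", 2), ("a", 3)])

# Default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
# so last_value() only ever sees rows up to the current one.
default_frame = conn.execute("""
    SELECT col2, last_value(col2) OVER (PARTITION BY col1 ORDER BY col2)
    FROM t ORDER BY col2
""").fetchall()

# Explicit frame covering the whole partition, so last_value() sees every row.
full_frame = conn.execute("""
    SELECT col2, last_value(col2) OVER (
        PARTITION BY col1 ORDER BY col2
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    FROM t ORDER BY col2
""").fetchall()

print(default_frame)  # [(1, 1), (2, 2), (3, 3)]
print(full_frame)     # [(1, 3), (2, 3), (3, 3)]
```

The same OVER clause with the explicit ROWS frame should carry over to Spark SQL, since the frame syntax is standard.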

6 More Replies
Khalil
by Contributor
  • 1392 Views
  • 0 replies
  • 0 kudos

Snowpark vs Spark on Databricks

Why and when should we choose Spark on Databricks over Snowpark if the underlying data we are processing resides in Snowflake?

pvignesh92
by Honored Contributor
  • 4720 Views
  • 6 replies
  • 2 kudos

Resolved! Optimizing Writes from Databricks to Snowflake

My job, after doing all the processing in the Databricks layer, writes the final output to Snowflake tables using the df.write API and the Spark Snowflake connector. I often see that even a small dataset (16 partitions and 20k rows in each partition) takes a...

Latest Reply
pvignesh92
Honored Contributor
  • 2 kudos

There are a few options I tried that gave me better performance. Cache the intermediate or final results so that the DataFrame computation is not repeated while writing. Coalesce the results into partitions 1x or 0.5x your number...
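
The two suggestions above (cache, then coalesce before writing) can be sketched as follows. This is an illustrative helper, not the poster's code: the function name write_cached_coalesced, the append mode, and the dbtable target are assumptions, and the partition count is left to the caller because the reply's sizing rule is cut off. The short format name "snowflake" is the one used elsewhere on this page for Databricks notebooks.

```python
def write_cached_coalesced(df, options, table, num_partitions):
    """Cache df, materialize it once, then write it to Snowflake in fewer partitions.

    df is expected to be a Spark DataFrame; options is the dict of Snowflake
    connector options (sfUrl, sfUser, ...). All names here are hypothetical.
    """
    cached = df.cache()
    cached.count()  # action that fills the cache so the write does not recompute df
    (
        cached.coalesce(num_partitions)  # reduce partition count without a full shuffle
        .write.format("snowflake")
        .options(**options)
        .option("dbtable", table)
        .mode("append")
        .save()
    )
    cached.unpersist()  # release the cached blocks once the write has finished
```

A call might look like write_cached_coalesced(df, options, "TARGET_TABLE", 8), with the partition count chosen by experiment.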

5 More Replies
pvignesh92
by Honored Contributor
  • 3695 Views
  • 8 replies
  • 0 kudos

Resolved! Multi Statement Writes from Spark to Snowflake

Does Spark support multi-statement writes to Snowflake in a single session? To elaborate, I have a requirement where I need to do a selective deletion of data from a Snowflake table and insert records into a Snowflake table (ranging from around 1 M rows)...

Latest Reply
pvignesh92
Honored Contributor
  • 0 kudos

From my analysis, this is my understanding: if your data is sitting in Snowflake and you have a set of DDL/DML queries that need to be wrapped into a single transaction, you can set the MULTI_STATEMENT option to 0 and use the Snowflake Utils runQuery method t...

7 More Replies
hamzatazib96
by New Contributor III
  • 1387 Views
  • 2 replies
  • 1 kudos

Snowflake/GCP error: Premature end of chunk coded message body: closing chunk expected

Hello all, I've been experiencing the error described below when I try to query a Snowflake table of about 5.5B rows and ~30 columns. It fails almost systematically; specifically, either the Spark job doesn't even start or I get the ...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hey there @hamzatazib96, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies
NicolasEscobar
by New Contributor II
  • 6929 Views
  • 8 replies
  • 5 kudos

Resolved! Job fails after runtime upgrade

I have a job that runs with no issues on Databricks runtime 7.3 LTS. When I upgraded to 8.3, it started failing with the error: An exception was thrown from a UDF: 'pyspark.serializers.SerializationError'... SparkContext should only be created and accessed on the driv...

Latest Reply
User16873042682
New Contributor II
  • 5 kudos

Adding to @Sean Owen's comments: the only reason this is working is that the optimizer is evaluating this locally rather than creating a context on executors and evaluating it.

7 More Replies
SajiD
by New Contributor
  • 815 Views
  • 1 replies
  • 0 kudos

Snowflake Connector for Databricks

Hi everyone, I am working with Databricks notebooks and I am facing an issue with the Snowflake connector: I want to run DDL/DML statements through it. Can someone please help me out with this? Thanks in advance!

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Sajid Shaikh, this article explains how to read data from and write data to Snowflake using the Databricks Snowflake connector.

sgannavaram
by New Contributor III
  • 6555 Views
  • 6 replies
  • 4 kudos

Resolved! How to get the last (previous) Databricks job run time?

How can I get the last Databricks job run time? I have a requirement where I need to pass the last job run time as an argument to a SQL query, and that query fetches records from a Snowflake database based on this timestamp.
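
One possible approach to the question above, sketched under assumptions: the run history is fetched separately (for example from the Databricks Jobs API runs list endpoint, GET /api/2.1/jobs/runs/list, filtered by job_id), and the response contains a runs array whose entries carry start_time in epoch milliseconds. The helper names below are hypothetical, and the fetch itself is not shown.

```python
from datetime import datetime, timezone


def last_run_start(runs_response):
    """Return the UTC start time of the most recent run in a runs-list response.

    runs_response is the parsed JSON body (a dict). Returns None when the job
    has no recorded runs.
    """
    runs = runs_response.get("runs", [])
    if not runs:
        return None
    latest = max(runs, key=lambda run: run["start_time"])
    # start_time is expressed in milliseconds since the Unix epoch.
    return datetime.fromtimestamp(latest["start_time"] / 1000, tz=timezone.utc)


def as_sql_literal(ts):
    """Format a datetime as a quoted timestamp literal for use in a SQL filter."""
    return ts.strftime("'%Y-%m-%d %H:%M:%S'")
```

The resulting literal could then be interpolated into the Snowflake query, for example as the lower bound of a WHERE clause on a timestamp column.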

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey there @Srinivas Gannavaram, hope you are well. Just wanted to see if you were able to find an answer to your question, and whether you would like to mark an answer as best? It would be really helpful for the other members. Cheers!

5 More Replies
Sam
by New Contributor III
  • 770 Views
  • 1 replies
  • 4 kudos

collect_set/ collect_list Pushdown

Hello, I've noticed that Collect_Set and Collect_List are not pushed down to the database. Runtime: DB 9.1 LTS. Spark: 3.1.2. Database: Snowflake. Is there any way to get a distinct set from a group by in a way that will push down the query to the database?

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

Hm, so collect_set does not get translated to listagg. Can you try the following? Use a more recent version of dbrx; use Delta Lake as the Spark source; use the latest version of the Snowflake connector; check if pushdown to Snowflake is enabled.
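
If automatic pushdown still does not cover collect_set, one workaround is to write the aggregation in Snowflake SQL and pass it through the connector's query option, so the grouping runs in Snowflake regardless of what the optimizer translates. This is a sketch, not something proposed in the thread: the helper names are hypothetical, ARRAY_AGG(DISTINCT ...) is Snowflake's array aggregate, and the returned array column may arrive in Spark as a JSON string rather than a native array.

```python
def distinct_set_query(table, group_col, value_col):
    """Build a Snowflake query that collects distinct values per group.

    Identifiers are interpolated as-is, so they must come from trusted input.
    """
    return (
        f"SELECT {group_col}, ARRAY_AGG(DISTINCT {value_col}) AS {value_col}_set "
        f"FROM {table} GROUP BY {group_col}"
    )


def read_distinct_sets(spark, options, table, group_col, value_col):
    """Run the aggregation inside Snowflake and load only the grouped result."""
    return (
        spark.read.format("snowflake")
        .options(**options)
        .option("query", distinct_set_query(table, group_col, value_col))
        .load()
    )
```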

marchello
by New Contributor III
  • 3982 Views
  • 9 replies
  • 3 kudos

Resolved! error on connecting to Snowflake

Hi team, I'm getting a weird error in one of my jobs when connecting to Snowflake. All my other jobs (I've got plenty) work fine. The current one also works fine when I have only one coding step (except installing needed libraries in my very first step...

Latest Reply
Dan_Z
Honored Contributor
  • 3 kudos

@marchello I suggest you contact Snowflake to move forward on this one.

8 More Replies
Sam
by New Contributor III
  • 2319 Views
  • 2 replies
  • 1 kudos

Resolved! Query Pushdown in Snowflake

Hi, I am wondering what documentation exists on query pushdown in Snowflake. I noticed that a single function (monotonically_increasing_id()) prevented the entire query from being pushed down to Snowflake during an ETL process. Is Pushdown coming from the S...

Latest Reply
siddhathPanchal
New Contributor III
  • 1 kudos

Hi Sam, the Spark Connector applies predicate and query pushdown by capturing and analyzing the Spark logical plans for SQL operations. When the data source is Snowflake, the operations are translated into a SQL query and then executed in Snowflake to...

1 More Replies
User16790091296
by Contributor II
  • 686 Views
  • 1 replies
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 1 kudos

The open source spark connector for Snowflake is available by default in the Databricks runtime. To connect you can use the following code: # Use secrets DBUtil to get Snowflake credentials. user = dbutils.secrets.get("<scope>", "<secret key>") passw...
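
Since the reply above is cut off, here is a separate minimal sketch of a typical read through the connector. It is not a reconstruction of the reply's code. The option keys (sfUrl, sfUser, sfPassword, sfDatabase, sfSchema, sfWarehouse) are standard Snowflake connector options; the helper names and argument values are hypothetical, and in a notebook the user and password would normally come from dbutils.secrets.get as the reply describes.

```python
def snowflake_options(user, password, url, database, schema, warehouse):
    """Assemble the connection options expected by the Snowflake Spark connector."""
    return {
        "sfUrl": url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_table(spark, options, table):
    """Load a whole Snowflake table into a Spark DataFrame."""
    return (
        spark.read.format("snowflake")  # short format name available on Databricks
        .options(**options)
        .option("dbtable", table)
        .load()  # reads end with load(); save() belongs to the write path
    )
```

In a notebook this would be called as read_table(spark, snowflake_options(user, password, "<snowflake-url>", "<database>", "<schema>", "<warehouse>"), "<table>").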

Anonymous
by Not applicable
  • 622 Views
  • 0 replies
  • 0 kudos

Append subset of columns to target Snowflake table

I’m using the databricks-snowflake connector to load data into a Snowflake table. Can someone point me to any example of how we can append only a subset of columns to a target Snowflake table (for example some columns in the target snowflake table ar...
