Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

kll
by New Contributor III
  • 8180 Views
  • 5 replies
  • 0 kudos

AnalysisException when attempting to save a Spark DataFrame as a Delta table

I get an `AnalysisException: Failed to merge incompatible data types LongType and StringType` when attempting to run the command below: `df.write.format("delta").saveAsTable("schema.k_adhoc.df", mode="overwrite")`. I am casting the column before saving:...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

The issue seems to be that the job is trying to merge columns with different schemas. Could you please make sure that the schemas of the columns match?

4 More Replies
alexisjohnson
by New Contributor III
  • 13465 Views
  • 5 replies
  • 7 kudos

Resolved! Window function using last/last_value with PARTITION BY/ORDER BY has unexpected results

Hi, I'm wondering if this is the expected behavior when using last or last_value in a window function? I've written a query like this: select col1, col2, last_value(col2) over (partition by col1 order by col2) as column2_last from values ...

Latest Reply
Carv
New Contributor II
  • 7 kudos

For those stumbling across this: it seems LAST_VALUE behaves the same way it does in SQL Server, whose default row/range frame for the window is not what most people expect. You can adjust it with the syntax below. I understand l...

4 More Replies
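
The frame behavior discussed in this thread can be reproduced with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for Spark SQL (an assumption made so the snippet runs without a cluster; it needs SQLite 3.25 or later). Both engines default to `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` when a window has an `ORDER BY` but no explicit frame. The table `t` and its rows are hypothetical, because the values in the original query are elided, and the explicit frame shown is one common adjustment, not necessarily the exact syntax from the reply.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; the table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("b", 10), ("b", 20)],
)

query = """
SELECT col1,
       col2,
       -- Default frame: ends at the current row, so the "last" value
       -- is just the current row's own col2.
       last_value(col2) OVER (PARTITION BY col1 ORDER BY col2) AS default_frame,
       -- Explicit frame: spans the whole partition, so the last value
       -- is the final col2 of the partition.
       last_value(col2) OVER (
           PARTITION BY col1 ORDER BY col2
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS full_frame
FROM t
ORDER BY col1, col2
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

With the default frame the third column simply repeats `col2`, which is the "unexpected" result; with the explicit frame the fourth column holds the last `col2` of each partition.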
Enzo_Bahrami
by New Contributor III
  • 727 Views
  • 0 replies
  • 0 kudos

Connect File Arrival Trigger to on-prem file server

Hello everyone! I was wondering if there is any way to connect a File Arrival Trigger to an on-prem file server. Can I use JDBC or ODBC? Will those connect to an on-prem file server (not a SQL server)? Thank you.

Data Engineering
File Arrival Trigger
Volkan_Gumuskay
by New Contributor III
  • 8909 Views
  • 6 replies
  • 3 kudos

Resolved! Is there a way to run a single line or selected lines in a notebook?

Assume we have a given cell: `print('A') print('B') print('C')`. I want to run only the line below: `print('B')`. Obviously, I can separate the cell into three and run the one I want, but this is time-consuming. This is a feature I use so often (e.g. in PyCharm) and wo...

Latest Reply
Tharun-Kumar
Databricks Employee
  • 3 kudos

@Volkan_Gumuskay This is also available as an option in the notebook run options.

5 More Replies
Hemant
by Valued Contributor II
  • 3807 Views
  • 2 replies
  • 3 kudos

Row_Num function in spark-sql

I have a question: why does row_num with ORDER BY in Spark SQL give a different result (non-deterministic output) every time I execute it? Is it due to parallelism in Spark? Is there any approach to tackle it? I order by a date column and an integer column and take...

Latest Reply
Tharun-Kumar
Databricks Employee
  • 3 kudos

@Hemant If the ORDER BY clause yields a unique ordering, we get deterministic output. For example: if we create a rowID for this dataset with CustomerID used in the ORDER BY clause, then depending on the runtime, we may get non-deterministi...

1 More Replies
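
The point about unique ordering can be illustrated with a minimal sketch in plain Python (not Spark). The rows and the `row_numbers` helper are hypothetical, and reversing the input list is an assumption that stands in for Spark tasks delivering tied rows in a different order from one run to the next. When the sort key has ties, the numbering depends on arrival order; adding a unique tie-breaker column to the key removes that dependence.

```python
def row_numbers(rows, key):
    """Sort rows by key and assign 1-based row numbers, like row_number()."""
    ordered = sorted(rows, key=key)  # stable sort: ties keep arrival order
    return {row: number for number, row in enumerate(ordered, start=1)}


# Hypothetical rows: (date, integer, unique_id). The first two rows tie
# on (date, integer).
rows = [
    ("2023-07-01", 5, "x"),
    ("2023-07-01", 5, "y"),
    ("2023-07-02", 1, "z"),
]
reordered = rows[::-1]  # same data, different arrival order


def partial_key(row):
    # Non-unique ordering: date and integer only.
    return (row[0], row[1])


def full_key(row):
    # Unique ordering: the unique_id column breaks ties.
    return (row[0], row[1], row[2])


partial_a = row_numbers(rows, partial_key)
partial_b = row_numbers(reordered, partial_key)
full_a = row_numbers(rows, full_key)
full_b = row_numbers(reordered, full_key)

print(partial_a == partial_b)  # tied rows swap numbers between "runs"
print(full_a == full_b)        # numbering is stable with the tie-breaker
```

In Spark SQL the corresponding fix is to extend the `ORDER BY` of the window with a column (or combination of columns) that is unique within each partition.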
alexiswl
by Contributor
  • 10666 Views
  • 3 replies
  • 0 kudos

Resolved! Merge Schema Error Message despite setting option to true

Has anyone come across this error before: ```A schema mismatch detected when writing to the Delta table (Table ID: d4b9c839-af0b-4b62-aab5-1072d3a0fa9d). To enable schema migration using DataFrameWriter or DataStreamWriter, please set: '.option("merge...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @alexiswl  Share the wisdom! By marking the best answers, you help others in our community find valuable information quickly and efficiently. Thanks!

2 More Replies
Yogybricks
by New Contributor II
  • 2233 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Yogybricks  Hope you are well. Just wanted to see if you were able to find an answer to your question and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

1 More Replies
zsucic1
by New Contributor III
  • 4038 Views
  • 2 replies
  • 0 kudos

Resolved! Trigger file_arrival of job on Delta Lake table change

Is there a way to avoid having to create an external data location simply to trigger a job when new data arrives in a specific Delta Lake table?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @zsucic1  Hope you are well. Just wanted to see if you were able to find an answer to your question and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

1 More Replies
KalingaSena
by New Contributor II
  • 4209 Views
  • 3 replies
  • 0 kudos

Not able to execute the SQL query below in a Databricks notebook because of a parse error

Hi Team, I am unable to run the command below; it gives me a parse error. Can anyone point out the issue with the code:

[Screenshot: KalingaSena_1-1689140837096.png]
Latest Reply
BkP
Contributor
  • 0 kudos

Hi, from the error, it looks like there is no space between the brackets and the "in" keyword after the where clause. Can you please try again and see if you are facing the same error?

2 More Replies
apiury
by New Contributor III
  • 2461 Views
  • 2 replies
  • 1 kudos

Consume gold data layer from web application

Hello! We are developing a web application in .NET and need to consume data from the gold layer as if it were a relational database. How can we do it? Should we export the data from the gold layer to SQL Server?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @apiury  Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

1 More Replies
NithinTiruveedh
by New Contributor II
  • 24140 Views
  • 12 replies
  • 0 kudos

How can I split a Spark DataFrame into n equal DataFrames (by rows)? I tried to add a Row ID column to achieve this but was unsuccessful.

I have a dataframe that has 5M rows. I need to split it up into 5 dataframes of ~1M rows each. This would be easy if I could create a column that contains Row ID. Is that possible?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @NithinTiruveedh  Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

11 More Replies
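
One way to think about the row-ID idea in this question is to let a window function assign each row a bucket number and then filter per bucket. The sketch below is not the thread's accepted answer; it uses Python's built-in `sqlite3` module as a stand-in for Spark SQL (an assumption so that it runs without a cluster; SQLite 3.25 or later is required), with a hypothetical 10-row table `big` in place of the 5M-row DataFrame. Spark SQL also provides an `ntile` window function, and PySpark offers `DataFrame.randomSplit` for approximately equal random splits.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; `big` and its rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO big VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 11)],
)

# ntile(5) orders the rows by id and assigns bucket numbers 1..5
# so that the buckets are as equal in size as possible.
query = """
SELECT id, payload, ntile(5) OVER (ORDER BY id) AS bucket
FROM big
ORDER BY id
"""

# Group the ids by bucket; each bucket plays the role of one output DataFrame.
parts = {}
for row_id, payload, bucket in conn.execute(query):
    parts.setdefault(bucket, []).append(row_id)

for bucket, ids in sorted(parts.items()):
    print(bucket, ids)
```

Note that in Spark a window with no `PARTITION BY` pulls all rows into a single partition to compute the ordering, so this approach may be slow on large data; it is shown here only to make the bucketing logic concrete.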
Anonymous
by Not applicable
  • 2253 Views
  • 2 replies
  • 3 kudos

Databricks streaming dataframe into Snowflake

Any suggestions on how to stream data from Databricks into Snowflake? Is Snowpipe the only option? Snowpipe is not fast enough, since it runs COPY INTO at small batch intervals rather than within a few seconds. If there is no option other than Snowpipe, how to call it...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Anonymous  Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...

1 More Replies
Gil
by New Contributor III
  • 6922 Views
  • 10 replies
  • 7 kudos

DLT optimize and vacuum

We were finally able to get DLT pipelines to run OPTIMIZE and VACUUM automatically. We verified this via the table history. However, I am still able to query versions older than 7 days. Has anyone been experiencing this and how were you a...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Gil  Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help.  We'd love to hear from you. Thanks!

9 More Replies
Anonymous
by Not applicable
  • 2525 Views
  • 2 replies
  • 5 kudos

Dear @Werner Stinckens and @Tyler Retzlaff, we would like to express our gratitude for your participation and dedication in the Databricks Commun...

Dear @Werner Stinckens and @Tyler Retzlaff, we would like to express our gratitude for your participation and dedication in the Databricks Community last week. Your interactions with customers have been valuable and we truly appreciate the time...

Latest Reply
dplante
Contributor II
  • 5 kudos

Congratulations guys!

1 More Replies
