I have noticed some inconsistent behavior between calling the 'split' function on Databricks and on my local installation. Running it in a Databricks notebook gives: spark.sql("SELECT split('abc', ''), size(split('abc', ''))").show() So the string is split...
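As an illustration of why empty-pattern splits can differ between environments (this is a pure-Python sketch, not Spark code): regex engines disagree on whether the zero-width matches at the ends of the string produce empty elements. Python's own `re.split`, for example, keeps both:

```python
import re

# Splitting on an empty pattern matches at every position, including the
# zero-width positions before 'a' and after 'c' (Python 3.7+), so the
# result carries an empty string at each end:
parts = re.split('', 'abc')
# parts == ['', 'a', 'b', 'c', '']
```

Java's `String.split` (which Spark's SQL `split` builds on) trims some of these empty elements depending on the limit argument, which is one plausible source of a size-3 vs size-4 discrepancy between builds.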
When running some SQL queries using spark.sql(...), we sometimes get a variant of the following error: AnalysisException: Undefined function: current_timestamp. This function is neither a built-in/temporary function, nor a persistent function that is ...
Let's say I have a DataFrame with a timestamp column and an offset column in milliseconds, in timestamp and long format respectively. E.g.: from datetime import datetime
df = spark.createDataFrame(
    [
        (datetime(2021, 1, 1), 1500),
        (dat...
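Assuming the goal is to shift each timestamp by its row's millisecond offset, the underlying arithmetic can be sketched in plain Python with the stdlib (the Spark equivalent would express the same addition as a column expression):

```python
from datetime import datetime, timedelta

def add_millis(ts: datetime, offset_ms: int) -> datetime:
    """Shift a timestamp forward by an offset given in milliseconds."""
    return ts + timedelta(milliseconds=offset_ms)

# Using the sample row above: 1500 ms past midnight on 2021-01-01.
shifted = add_millis(datetime(2021, 1, 1), 1500)
# shifted == datetime(2021, 1, 1, 0, 0, 1, 500000)
```

The function name `add_millis` is just a hypothetical helper for illustration; in Spark the same per-row computation would be done on the timestamp and long columns directly.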
I have a MERGE INTO statement that I use to update existing entries or create new entries in a dimension table, based on a natural business key. When creating new entries, I would like to also create a unique UUID for that entry that I can use to crossr...
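One common approach (a hedged sketch, assuming Spark SQL's built-in uuid() function is acceptable) is to generate the surrogate key inside the WHEN NOT MATCHED THEN INSERT clause, e.g. VALUES (uuid(), ...). The per-row generation it performs is equivalent to this plain-Python illustration:

```python
import uuid

def new_dimension_row(business_key: str) -> dict:
    # Hypothetical helper: attach a random version-4 surrogate UUID to a
    # new dimension entry; in Spark SQL this role is played by uuid()
    # in the MERGE's INSERT clause.
    return {
        "business_key": business_key,
        "surrogate_key": str(uuid.uuid4()),  # 36-char canonical form
    }

row = new_dimension_row("customer-42")
```

Because the UUID is random, each unmatched row gets a distinct key, which makes it safe to use for cross-referencing from other tables.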
Is there a way to resolve this issue without using ML clusters? Due to our current setup, I'm limited in which clusters I can create manually, and a quick workaround for development purposes would be helpful here.
Thank you for the suggestion, but even with the same Spark version there seems to be a difference between what happens locally and what happens on a Databricks cluster.
Hi, my Databricks cluster runs Spark 3.3 but does give a length of 3. Is there something different about the Databricks implementation of PySpark, or should it follow the same standards?