Data Engineering

Forum Posts

by TylerTamasaucka, New Contributor
  • 23800 Views
  • 4 replies
  • 0 kudos

org.apache.spark.sql.AnalysisException: Undefined function: 'MAX'

I am trying to create a JAR for an Azure Databricks job, but some code that works when using the notebook interface does not work when calling the library through a job. The weird part is that the job will complete the first run successfully but on an...

Latest Reply: skaja, New Contributor II
  • 0 kudos

I am facing a similar issue when trying to use the from_utc_timestamp function. I am able to call the function from a Databricks notebook, but when I use the same function inside my Java JAR and run it as a job in Databricks, it gives the error below. Analys...

3 More Replies
by osoucy, New Contributor II
  • 496 Views
  • 0 replies
  • 1 kudos

Is it possible to join two aggregated streams of data?

Objective: Within the context of a Delta Live Table, I'm trying to join two stream aggregations, but run into challenges. Is it possible to achieve such a join?
Context: Assume
- table trades stores a list of trades with their associated time stamps
- tabl...

by KKDataEngineer, New Contributor III
  • 648 Views
  • 0 replies
  • 2 kudos

Spark Structured Streaming: an aggregation DataFrame with a watermark in append mode is not writing the most recent aggregation to the Delta table, even after crossing the watermark boundary. This is causing data loss.

Team, I am struggling with a unique issue. I am not sure if my understanding is wrong or if this is a bug in Spark. I am reading a stream from Event Hubs (extract), then pivoting and aggregating the above dataframe (transformation). This is a WATERMARKED...

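The behavior described above is easier to see with a toy model of append-mode emission. This is not Spark's implementation, just an illustration of the semantics: a window is written only once the watermark (max event time seen, minus the delay) passes the window's end, and the watermark only advances when *later* events arrive. The event times, window size, and 10-unit delay below are all made up for illustration.

```python
# Toy model of append-mode window emission (NOT Spark's implementation).
# A window [start, start + window_size) is emitted only once the watermark
# (max event time seen minus the allowed lateness) passes its end.
def emitted_windows(event_times, window_size, delay):
    max_seen = None
    emitted = set()       # window start times already written downstream
    pending = set()       # window start times still held in state
    for t in event_times:
        pending.add((t // window_size) * window_size)
        max_seen = t if max_seen is None else max(max_seen, t)
        watermark = max_seen - delay
        for start in sorted(pending):
            if watermark >= start + window_size:
                emitted.add(start)
                pending.discard(start)
    return emitted, pending

done, pending = emitted_windows([5, 12, 25, 41], window_size=10, delay=10)
# The last window (starting at 40) is never emitted: no later event has
# pushed the watermark past its end, so in append mode it stays in state
# and looks "lost" until more data arrives.
```

This matches the symptom in the post: the most recent aggregation is not missing, it is simply not finalized yet, because watermark advancement is driven by new incoming events rather than wall-clock time.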
by FrancisLau, New Contributor
  • 2159 Views
  • 2 replies
  • 0 kudos

Resolved! agg function not working for multiple aggregations

Data has 2 columns:
| requestDate | requestDuration |
| 2015-06-17  | 104             |
Here is the code:
avgSaveTimesByDate = gridSaves.groupBy(gridSaves.requestDate).agg({"requestDuration": "min", "requestDuration": "max", "requestDuration": "avg"})
avgSaveTimesBy...

Latest Reply: ReKa, New Contributor III
  • 0 kudos

My guess is that the reason this may not work is that the dictionary input does not have unique keys. With this syntax, column names are the keys, and if you have two or more aggregations for the same column, some internal loops may forget the no...

1 More Replies
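The reply's diagnosis can be confirmed in plain Python before Spark is ever involved: a dict literal with a repeated key silently keeps only the last entry, so only one of the three requested aggregations ever reaches agg(). The PySpark fix shown in comments is a sketch that assumes the gridSaves DataFrame and a live SparkSession from the original post.

```python
# A Python dict literal keeps only the LAST value for a repeated key,
# so two of the three aggregations are dropped before agg() sees them:
aggs = {"requestDuration": "min", "requestDuration": "max", "requestDuration": "avg"}
print(aggs)  # {'requestDuration': 'avg'} -- "min" and "max" were silently discarded

# Sketch of the usual fix: pass distinct Column expressions instead of a dict
# (assumes a SparkSession and the gridSaves DataFrame from the post):
# from pyspark.sql import functions as F
# avgSaveTimesByDate = gridSaves.groupBy("requestDate").agg(
#     F.min("requestDuration"), F.max("requestDuration"), F.avg("requestDuration")
# )
```

Passing Column expressions also lets each result be aliased explicitly, which the dict form does not allow.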