Data Engineering

Forum Posts

Pbarbosa154
by New Contributor III
  • 673 Views
  • 2 replies
  • 0 kudos

What is the best way to ingest GCS data into Databricks and apply Anomaly Detection Model?

I recently started exploring the field of Data Engineering and came across some difficulties. I have a bucket in GCS with millions of parquet files and I want to create an Anomaly Detection model with them. I was trying to ingest that data into Datab...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Pedro Barbosa: It seems like you are running out of memory when trying to convert the PySpark DataFrame to an H2O frame. One possible approach to solve this issue is to partition the PySpark DataFrame before converting it to an H2O frame. You can us...

1 More Replies
vas610
by New Contributor III
  • 1805 Views
  • 5 replies
  • 0 kudos

Error loading h2o model in mlflow

I'm getting the following error when trying to load an H2O model using MLflow for prediction. Error: Error Job with key $03017f00000132d4ffffffff$_990da74b0db027b33cc49d1d90934149 failed with an exception: java.lang.IllegalArgumentException:...

Latest Reply
Dan_Z
Honored Contributor
  • 0 kudos

I ran this in Databricks and it worked with no issues. I suggest you make sure your wget path is correct, because the one you posted downloads HTML, not the raw CSV. That may be causing the problem. %sh wget https://raw.githubusercontent.com/mlflow/mlflo...
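
One way to catch the situation described in this reply is to inspect the downloaded file before handing it to H2O. The snippet below is only a rough sketch and not part of the original reply: `looks_like_html` is a hypothetical helper name, and the check is a simple heuristic that looks at the first bytes of the payload for HTML markers.

```python
def looks_like_html(data: bytes) -> bool:
    """Heuristic check (an assumption, not from the reply): does the payload
    look like an HTML page rather than raw CSV?"""
    # Only inspect the beginning of the file; ignore case and leading whitespace.
    head = data[:1024].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return True
    # Fall back to common tags that appear near the top of an HTML page.
    return b"<head" in head or b"<body" in head


# Example payloads: a web page returned by a non-raw link vs. actual CSV content.
html_payload = b"<!DOCTYPE html>\n<html><head><title>page</title></head></html>"
csv_payload = b"col_a,col_b,col_c\n1.0,2.0,3.0\n"

print(looks_like_html(html_payload))
print(looks_like_html(csv_payload))
```

If the check returns `True` for the file that wget saved, the URL most likely points at a rendered web page rather than the raw file.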

4 More Replies