- 4109 Views
- 5 replies
- 5 kudos
I have a pyfunc model that I can use to get predictions. It takes time series data with context information at each date, and produces a string of predictions. For example: The data is set up like below (temp/pressure/output are different than my inpu...
Latest Reply
I have the same question. I've decided to look for alternative Feature Stores as this makes it very difficult to use for time series forecasting.
4 More Replies
- 1335 Views
- 2 replies
- 0 kudos
Hi All, I'm working on creating a data quality dashboard. I've created a few rules, such as checking for nulls in a column, checking the data type of a column, removing duplicates, etc. We follow the medallion architecture and are applying these data quality check...
Latest Reply
Hi @Sridhar Varanasi, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...
1 More Replies
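The three rule types named in the question above (null checks, data type checks, duplicate removal) can be sketched in plain Python. This is only an illustrative sketch: the helper names, the `rows` sample, and the `report` layout are hypothetical and do not come from the original post, which is truncated and may well use Spark DataFrames instead.

```python
def count_nulls(rows, column):
    """Count rows where the column is missing or None."""
    return sum(1 for r in rows if r.get(column) is None)


def has_type(rows, column, expected_type):
    """Check that every non-null value in the column has the expected type."""
    return all(
        isinstance(r[column], expected_type)
        for r in rows
        if r.get(column) is not None
    )


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Hypothetical sample records standing in for one table layer.
rows = [
    {"id": 1, "amount": 10.5},
    {"id": 2, "amount": None},
    {"id": 1, "amount": 10.5},
    {"id": 3, "amount": "n/a"},
]

# One row of metrics that a dashboard could plot per table or per layer.
report = {
    "null_amount": count_nulls(rows, "amount"),
    "amount_is_float": has_type(rows, "amount", float),
    "duplicates_removed": len(rows) - len(drop_duplicates(rows)),
}
print(report)
```

Collecting the results into one dictionary per run keeps the checks separate from whatever dashboard tool consumes them.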
- 3267 Views
- 3 replies
- 0 kudos
Latest Reply
@Santhanalakshmi Manoharan Was this issue resolved? I am also getting the same error, so any guidance would be of great help. Appreciate your help.
2 More Replies
by
Orianh
• Valued Contributor II
- 1650 Views
- 2 replies
- 0 kudos
Hey guys, I'm training a TF model in Databricks and logging to TensorBoard using SummaryWriter. At the end of each epoch, SummaryWriter.flush() is called, which should send any buffered data to storage. But I can't see the TensorBoard files while th...
Latest Reply
Hi @orian hindi, hope everything is going great. Just wanted to check in: were you able to resolve your issue? If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...
1 More Replies
by
Kaan
• New Contributor
- 3006 Views
- 1 reply
- 1 kudos
I'm looking for a good product to use across two clouds at once for data engineering, data modeling, and governance. I currently have a GCP platform, but most of my current and future data goes through Azure and is then transferred to GCS/BQ. Cu...
Latest Reply
@Karl Andrén: Databricks is a great option for data engineering, data modeling, and governance across multiple clouds. It supports integrations with multiple cloud providers, including Azure, AWS, and GCP, and provides a unified interface to access ...
- 823 Views
- 0 replies
- 4 kudos
Hello Databricks Community! We are getting really excited about the upcoming event of the year, Data & AI Summit 2023! The world’s largest data, analytics, and AI conference returns live to San Francisco and virtually. Four days (June 26–29, 2023) pack...
- 6430 Views
- 3 replies
- 15 kudos
Hi, I wonder whether I should do OHE (one-hot encoding) before or after I split the data to build an ML model. Please give some advice.
Latest Reply
Hi @Nhat Hoang, while not Databricks-specific, here's a good answer: "If you perform the encoding before the split, it will lead to data leakage (train-test contamination). In this sense, you will introduce new data (integers of Label Encoders) and u...
2 More Replies
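To illustrate the split-then-encode order recommended in the reply above, here is a minimal plain-Python sketch. The `colors` column, the 75/25 split, and the helper names `fit_categories` and `one_hot` are hypothetical and chosen only for illustration; in practice a library encoder such as scikit-learn's `OneHotEncoder(handle_unknown="ignore")`, fitted on the training split only, plays the same role.

```python
import random


def fit_categories(values):
    """Learn the category vocabulary from the training values only."""
    return sorted(set(values))


def one_hot(values, categories):
    """Encode values against a fixed vocabulary; unseen values become all zeros."""
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        if v in index:
            row[index[v]] = 1
        encoded.append(row)
    return encoded


# Hypothetical toy categorical column.
colors = ["red", "green", "blue", "green", "red", "purple", "blue", "red"]

# 1. Split first (seeded shuffle so the run is reproducible).
rng = random.Random(0)
shuffled = colors[:]
rng.shuffle(shuffled)
split = int(0.75 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]

# 2. Fit the encoder on the training split only.
categories = fit_categories(train)

# 3. Apply the same fitted vocabulary to both splits.
X_train = one_hot(train, categories)
X_test = one_hot(test, categories)
```

Because the vocabulary comes from `train` alone, a category that appears only in `test` is encoded as an all-zero row rather than silently adding a column the model never saw during training.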