Topics with Label: Machine Learning

by Vik1 • New Contributor II

01-21-2022 9:16:42 AM

2362 Views
4 replies
2 kudos

Resolved! Cluster setup for ML work for Pandas in Spark, and vanilla Python.

My setup:
Worker type: Standard_D32d_v4, 128 GB Memory, 32 Cores, Min Workers: 2, Max Workers: 8
Driver type: Standard_D32ds_v4, 128 GB Memory, 32 Cores
Databricks Runtime Version: 10.2 ML (includes Apache Spark 3.2.0, Scala 2.12)
I ran a Snowflake quer...

Latest Reply

Anonymous
Not applicable

04-22-2022 7:23:05 AM

2 kudos

Hey there @Vivek Ranjan, checking in. If Joseph's answer helped, would you let us know and mark the answer as best? It would help other members find the solution more quickly. Thanks!

by self-employed • Contributor

01-04-2022 9:55:08 PM

1014 Views
3 replies
3 kudos

Resolved! Is the machine learning part of "Apache Spark™ Tutorial: Getting Started with Apache Spark on Databricks" missing or no longer available?

I am following the Apache Spark™ Tutorial. When I finished the data set part and wanted to continue to the machine learning part, I found the page is empty. The next section after machine learning is fine, so I guess there must be a URL mismatch. The URL ...

Latest Reply

self-employed
Contributor

01-05-2022 11:26:26 PM

3 kudos

I cleared my cookies and the link works again, so it was a cookie issue.

by Joseph_B • New Contributor III

10-08-2021 9:09:38 AM

608 Views
0 replies
1 kudos

mlflow.org

2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 2 of 2)

Thank you to everyone who joined! You can access the on-demand recording here and the code in this GitHub repo. We're sharing a subset of the questions asked an...

by Joseph_B • New Contributor III

10-08-2021 9:05:02 AM

455 Views
0 replies
1 kudos

docs.databricks.com

2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 1 of 2)

Thank you to everyone who joined the Automating the ML Lifecycle With Databricks Machine Learning webinar! You can access the on-demand recording here and the ...

by User16752240150 • New Contributor II

06-04-2021 11:47:11 AM

2010 Views
1 reply
0 kudos

What's the best way to implement long term data versioning?

I'm a data scientist creating versioned ML models. For compliance reasons, I need to be able to replicate the training data for each model version. I've seen that you can version datasets by using Delta, but the default retention period is around 30 ...

Latest Reply

sajith_appukutt
Honored Contributor II

06-17-2021 10:36:52 PM

0 kudos

As you mentioned, Delta supports time travel, and by default Delta tables retain the commit history for 30 days. Operations on the table history run in parallel but become more expensive as the log size increases. Now, in this case - s...
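
For readers who want to see what the retention and time-travel pieces look like in practice, here is a minimal sketch. It is not necessarily what the truncated reply above goes on to recommend; it only shows one option, raising the Delta retention table properties and querying an older commit. The helper names `retention_sql` and `snapshot_sql` and the table name `ml.training_data` are made up for illustration. The table properties `delta.logRetentionDuration` and `delta.deletedFileRetentionDuration` and the `VERSION AS OF` syntax are standard Delta Lake features on Databricks.

```python
def retention_sql(table: str, days: int) -> str:
    """Build an ALTER TABLE statement that extends Delta retention.

    delta.logRetentionDuration controls how long commit history is kept
    (30 days by default). delta.deletedFileRetentionDuration controls how
    long VACUUM must leave old data files in place; time travel needs both.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    interval = f"interval {days} days"
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = '{interval}', "
        f"'delta.deletedFileRetentionDuration' = '{interval}')"
    )


def snapshot_sql(table: str, version: int) -> str:
    """Build a time-travel query that reads the table as of a commit version."""
    if version < 0:
        raise ValueError("version must be non-negative")
    return f"SELECT * FROM {table} VERSION AS OF {version}"


# Hypothetical table name, for illustration only.
print(retention_sql("ml.training_data", 365))
print(snapshot_sql("ml.training_data", 12))
```

In a Databricks notebook you would pass either string to `spark.sql(...)`. Keep in mind the caveat from the reply: a longer history means a larger log and more retained files, so this trades storage and some history-operation cost for reproducibility.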

by User16752239203 • New Contributor

06-11-2021 11:55:55 AM

577 Views
1 reply
0 kudos

How can I use non-Spark libraries like spaCy with Databricks and Spark?

I have an NLP application that I built on my local machine using spaCy and pandas, but now I would like to scale my application to a large production dataset and utilize the benefits of Spark's distributed compute. How do I import and utilize a librar...

Latest Reply

sean_owen
Honored Contributor II

06-17-2021 4:23:53 PM

0 kudos

It depends on what you mean, but if you're just trying to (say) tokenize and process data with spaCy in parallel, then that's trivial. Write a 'pandas UDF' function that expresses how you want to transform data using spaCy, in terms of a pandas DataF...
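
As a rough illustration of the pandas UDF approach described above, the sketch below tokenizes a text column with spaCy. It is only one way to do it, under some assumptions: the helper names `tokenize_series` and `make_tokenize_udf` are hypothetical, the model name `en_core_web_sm` is just an example that would need to be installed on the cluster, and the output format (space-joined tokens) is an arbitrary choice.

```python
import pandas as pd


def tokenize_series(texts: pd.Series, nlp) -> pd.Series:
    """Tokenize each string with a spaCy pipeline and return space-joined tokens.

    `nlp` is any object with a spaCy-style `pipe` method that yields
    iterables of tokens exposing a `.text` attribute.
    """
    docs = nlp.pipe(texts.fillna("").astype(str))
    joined = [" ".join(tok.text for tok in doc) for doc in docs]
    return pd.Series(joined, index=texts.index)


def make_tokenize_udf(model_name: str = "en_core_web_sm"):
    """Wrap tokenize_series as a Spark pandas UDF.

    pyspark and spaCy are imported lazily so the plain pandas function
    above can be tried locally without a cluster.
    """
    import spacy
    from pyspark.sql.functions import pandas_udf

    @pandas_udf("string")
    def tokenize_udf(texts: pd.Series) -> pd.Series:
        # Loaded on the worker for each batch; simple but not the fastest option.
        nlp = spacy.load(model_name)
        return tokenize_series(texts, nlp)

    return tokenize_udf
```

On a cluster this would be used along the lines of `df.withColumn("tokens", make_tokenize_udf()(df["text"]))`, where `df` and the `text` column are placeholders. Loading the model inside the UDF keeps the sketch simple; caching it per worker process is a common refinement.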

by User16826994223 • Honored Contributor III

06-17-2021 1:48:38 AM

325 Views
0 replies
0 kudos

Databricks Certified Professional Data Scientist: Does this exam require Databricks-specific or Spark-specific knowledge?

No. Test-takers will be assessed on their understanding of the basics of machine learning and data science, how to complete each ...

by Joseph_B • New Contributor III

06-14-2021 2:38:56 PM

1666 Views
1 reply
1 kudos

How does Databricks managed MLflow compare with open-source (OSS) MLflow?

Latest Reply

Joseph_B
New Contributor III

06-14-2021 2:44:00 PM

1 kudos

You can find a lot more info on this at this MLflow product page, including a comparison table at the bottom. I'd summarize that comparison as: Databricks provides three key things in its managed MLflow service.
Security: MLflow experiments, models, ...

by Joseph_B • New Contributor III

06-09-2021 6:07:30 PM

1776 Views
1 reply
0 kudos

How can I find out what version of MLflow is in a Databricks Runtime for ML? Is it the same as the open source MLflow?

Latest Reply

Joseph_B
New Contributor III

06-09-2021 6:12:36 PM

0 kudos

You can find the MLflow version in the runtime release notes, along with a list of every other library provided. E.g., for DBR 8.3 ML, you can look at the release notes for AWS, Azure, or GCP. The MLflow client API (i.e., the API provided by installi...
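
If you would rather check from a notebook than read the release notes, a small sketch like the one below should report the installed client version. It relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is made up for illustration, and this is a complement to the release notes, not a replacement for them.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# On a Databricks Runtime for ML cluster this should print the bundled MLflow
# client version; elsewhere it prints None when MLflow is not installed.
print("mlflow:", installed_version("mlflow"))
```

When MLflow is installed, `mlflow.__version__` gives the same information directly.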

by User16788317466 • New Contributor II

06-07-2021 11:13:30 AM

1093 Views
2 replies
0 kudos

How do I efficiently read image data for a deep learning model?

Latest Reply

Joseph_B
New Contributor III

06-08-2021 12:46:46 PM

0 kudos

Our documentation provides nice examples of preparing image data for training and inference.
Training: See docs for AWS, Azure, GCP
Inference: See reference solution for AWS, Azure, GCP
