cancel
Showing results for 
Search instead for 
Did you mean: 
Machine Learning
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Vik1
by New Contributor II
  • 2362 Views
  • 4 replies
  • 2 kudos

Resolved! Cluster setup for ML work for Pandas in Spark, and vanilla Python.

My setup:Worker type: Standard_D32d_v4, 128 GB Memory, 32 Cores, Min Workers: 2, Max Workers: 8Driver type: Standard_D32ds_v4, 128 GB Memory, 32 CoresDatabricks Runtime Version: 10.2 ML (includes Apache Spark 3.2.0, Scala 2.12)I ran a snowflake quer...

  • 2362 Views
  • 4 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Vivek Ranjan​ Checking in. If Joseph's answer helped, would you let us know and mark the answer as best?  It would be really helpful for the other members to find the solution more quickly.Thanks!

  • 2 kudos
3 More Replies
self-employed
by Contributor
  • 1014 Views
  • 3 replies
  • 3 kudos

Resolved! Is the machine learning part of "Apache Sparkâ„¢ Tutorial: Getting Started with Apache Spark on Databricks" missing or no longer available?

I am following the Apache Sparkâ„¢ Tutorial. When I finish the data set part and want to continue the machine learning part. I found the page is empty. The next section after machine learning is fine. So I guess there must be a url mismatching.The url ...

  • 1014 Views
  • 3 replies
  • 3 kudos
Latest Reply
self-employed
Contributor
  • 3 kudos

I clean the cookie and then the link recovers. So it is an issue about cookie.

  • 3 kudos
2 More Replies
Joseph_B
by New Contributor III
  • 608 Views
  • 0 replies
  • 1 kudos

mlflow.org

2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 2 of 2)Thank you to everyone who joined! You can access the on-demand recording here and the code in this Github repo.We're sharing a subset of the questions asked an...

  • 608 Views
  • 0 replies
  • 1 kudos
Joseph_B
by New Contributor III
  • 455 Views
  • 0 replies
  • 1 kudos

docs.databricks.com

2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (post 1 of 2)Thank you to everyone who joined the Automating the ML Lifecycle With Databricks Machine Learning webinar! You can access the on-demand recording here and the ...

  • 455 Views
  • 0 replies
  • 1 kudos
User16752240150
by New Contributor II
  • 2010 Views
  • 1 replies
  • 0 kudos

What's the best way to implement long term data versioning?

I'm a data scientist creating versioned ML models. For compliance reasons, I need to be able to replicate the training data for each model version. I've seen that you can version datasets by using delta, but the default retention period is around 30 ...

  • 2010 Views
  • 1 replies
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Delta, as you mentioned has a feature to do time travel and by default, delta tables retain the commit history for 30 days. Operations on history of the table are parallel but will become more expensive as the log size increasesNow, in this case - s...

  • 0 kudos
User16752239203
by New Contributor
  • 577 Views
  • 1 replies
  • 0 kudos

How can I use Non- Spark related libraries like spacy with Databricks and Spark

I have an NLP application that I build on my local machine using spacy and pandas, but now I would like to scale my application to a large production dataset and utilize the benefits of sparks distributed compute. How do I import and utilize a librar...

  • 577 Views
  • 1 replies
  • 0 kudos
Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

It depends on what you mean, but if you're just trying to (say) tokenize and process data with spacy in parallel, then that's trivial. Write a 'pandas UDF' function that expresses how you want to transform data using spacy, in terms of a pandas DataF...

  • 0 kudos
User16826994223
by Honored Contributor III
  • 325 Views
  • 0 replies
  • 0 kudos

Databricks Certified Professional Data Scientist  Does this exam require Databricks-specific or Spark-specific knowledge?No. Test-takers will be asse...

Databricks Certified Professional Data Scientist Does this exam require Databricks-specific or Spark-specific knowledge?No. Test-takers will be assessed on their understanding of the basics of machine learning and data science, how to complete each ...

  • 325 Views
  • 0 replies
  • 0 kudos
Joseph_B
by New Contributor III
  • 1666 Views
  • 1 replies
  • 1 kudos
  • 1666 Views
  • 1 replies
  • 1 kudos
Latest Reply
Joseph_B
New Contributor III
  • 1 kudos

You can find a lot more info on this at this MLflow product page, including a comparison table at the bottom. I'd summarize that comparison as: Databricks provides three key things in its managed MLflow service.Security: MLflow experiments, models, ...

  • 1 kudos
Joseph_B
by New Contributor III
  • 1776 Views
  • 1 replies
  • 0 kudos
  • 1776 Views
  • 1 replies
  • 0 kudos
Latest Reply
Joseph_B
New Contributor III
  • 0 kudos

You can find the MLflow version in the runtime release notes, along with a list of every other library provided. E.g., for DBR 8.3 ML, you can look at the release notes for AWS, Azure, or GCP.The MLflow client API (i.e., the API provided by installi...

  • 0 kudos
User16788317466
by New Contributor II
  • 1093 Views
  • 2 replies
  • 0 kudos

How do I efficiently read image data for a deep learning model?

How do I efficiently read image data for a deep learning model?

  • 1093 Views
  • 2 replies
  • 0 kudos
Latest Reply
Joseph_B
New Contributor III
  • 0 kudos

Our documentation provides nice examples of preparing image data for training and inference.Training: See docs for AWS, Azure, GCPInference: See reference solution for AWS, Azure, GCP

  • 0 kudos
1 More Replies
Labels