Machine Learning

Forum Posts

Anonymous
by Not applicable
  • 893 Views
  • 1 replies
  • 7 kudos

Train machine learning models: How can I take my ML lifecycle from experimentation to production?

Note: the following guide is primarily for Python users. For other languages, please view the following links: • Table batch reads and writes • Create a table in SQL • Visualizing data with DBSQL
This step-by-step guide will get your data...

Latest Reply
Priyag1
Honored Contributor II
  • 7 kudos

I learned a lot from your post. It is very clear. Thank you. Keep sharing posts like this; they will be helpful.

Gilg
by Contributor II
  • 4654 Views
  • 1 replies
  • 0 kudos

Failed to add 1 container to the cluster. will attempt retry: false. reason: bootstrap timeout

Hi Team,
When creating a new cluster in a workspace within a VNET, I am receiving this error:
Failed to add 1 container to the cluster. will attempt retry: false. reason: bootstrap timeout
Cluster terminated. Reason: Bootstrap Timeout
Cheers,
Gil

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Gil Gonong: The error message you are receiving suggests that the creation of the new cluster has failed due to a bootstrap timeout. The bootstrap process is responsible for setting up the initial configuration of the cluster, and if it takes too l...

fa
by New Contributor III
  • 2122 Views
  • 6 replies
  • 7 kudos

How are dashboards served and what would happen to them if the cluster attached to the notebook terminates?

I have two dashboards in presentation mode both from notebooks being run on the same compute cluster. Last night the cluster terminated due to idle time and in the morning one of my dashboards was fine but the other one was set to the default stab di...

Latest Reply
Manoj12421
Valued Contributor II
  • 7 kudos

If your query was scheduled, it would have automatically started the cluster at the scheduled time. Or it might be that the portion that is still visible doesn't need to be regenerated, so it looks like it's working but it is just left over from the prior...

5 More Replies
rubenteixeira
by New Contributor III
  • 2451 Views
  • 4 replies
  • 1 kudos

Permission denied: Lightning Logs

I'm doing parameter tuning for a NeuralProphet model (you can see the parameters and training code in the image). When I try to parallelize the training, it gives me a permission error. Why can't I access the folder '/databricks/spark/work/*'? Do I ne...

Latest Reply
Debayan
Esteemed Contributor III
  • 1 kudos

Hi, could you please check the cluster-level permissions and let us know if it helps? Please refer to: https://docs.databricks.com/security/access-control/cluster-acl.html#cluster-level-permissions

3 More Replies
llvu
by New Contributor III
  • 1601 Views
  • 4 replies
  • 2 kudos

How to solve cluster break down due to GC when training a pyspark.ml Random Forest

I am trying to train and optimize a random forest. At first the cluster handles the garbage collection fine, but after a couple of hours the cluster breaks down as garbage collection has gone up significantly. The train_df has a size of 6,365,018 reco...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Liselotte van Unen (Customer), we haven’t heard from you since the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the community as it ...

3 More Replies
Somi
by New Contributor III
  • 708 Views
  • 3 replies
  • 0 kudos

No saved model after stopping the cluster.

I have saved a Keras model in some directories in DBFS so that I can load it and retrain it with more data, etc. The problem is that when the cluster stops and restarts, it seems those directories and the model are no longer available there, and it starts training a new mod...

Latest Reply
Somi
New Contributor III
  • 0 kudos

Hi @Vidula Khanna, I figured it out by replacing the os library module with dbutils utilities. It looks like it is more compatible with DBFS.
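
The excerpt does not show the original code, so what follows is only a sketch of one common cause, assuming the model was saved with local file APIs (`os`, `open`) to a path outside the `/dbfs` mount. On Databricks, local file APIs see DBFS under `/dbfs/...`, while `dbutils.fs` and Spark address it as `dbfs:/...`; a bare path such as `/models/...` passed to `os.makedirs` lands on the driver's local disk, which does not survive a cluster restart. The helper name `to_local_dbfs_path` and the example path are hypothetical.

```python
def to_local_dbfs_path(path: str) -> str:
    """Map a DBFS-style path to the /dbfs mount that local file APIs see.

    Assumption: the cluster exposes DBFS at /dbfs (the usual FUSE mount).
    """
    if path.startswith("dbfs:/"):
        # "dbfs:/models/run1" -> "/dbfs/models/run1"
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    if path == "/dbfs" or path.startswith("/dbfs/"):
        # Already a mount path; leave it unchanged.
        return path
    # A bare absolute path would otherwise point at the driver's local disk.
    return "/dbfs/" + path.lstrip("/")


# Hypothetical model directory, for illustration only.
model_dir = to_local_dbfs_path("dbfs:/models/keras_retrain")
print(model_dir)
```

Inside a notebook, `dbutils.fs.ls("dbfs:/models/")` can then be used to confirm that the files were written to DBFS rather than to ephemeral driver storage.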

2 More Replies
Vik1
by New Contributor II
  • 2455 Views
  • 4 replies
  • 2 kudos

Resolved! Cluster setup for ML work for Pandas in Spark, and vanilla Python.

My setup:
Worker type: Standard_D32d_v4, 128 GB Memory, 32 Cores, Min Workers: 2, Max Workers: 8
Driver type: Standard_D32ds_v4, 128 GB Memory, 32 Cores
Databricks Runtime Version: 10.2 ML (includes Apache Spark 3.2.0, Scala 2.12)
I ran a Snowflake quer...
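
As a rough illustration, the posted setup can be written as a cluster spec, with the capacity it implies worked out. The field names (`spark_version`, `node_type_id`, `driver_node_type_id`, `autoscale`) follow the Databricks Clusters API, but the `spark_version` string is an assumed spelling of "10.2 ML", and the helper `worker_capacity` is hypothetical. The point of the arithmetic is that plain pandas and vanilla Python code run on the driver only, so they see the driver's 128 GB and 32 cores regardless of how many workers autoscaling adds.

```python
# Sketch of the posted setup; values are taken from the post, except where noted.
cluster_spec = {
    "spark_version": "10.2.x-cpu-ml-scala2.12",  # assumption: key for 10.2 ML
    "node_type_id": "Standard_D32d_v4",          # worker: 128 GB, 32 cores
    "driver_node_type_id": "Standard_D32ds_v4",  # driver: 128 GB, 32 cores
    "autoscale": {"min_workers": 2, "max_workers": 8},
}


def worker_capacity(spec, cores_per_worker=32, mem_gb_per_worker=128):
    """Total worker cores and memory at the autoscale minimum and maximum."""
    lo = spec["autoscale"]["min_workers"]
    hi = spec["autoscale"]["max_workers"]
    return {
        "min_cores": lo * cores_per_worker,
        "max_cores": hi * cores_per_worker,
        "min_mem_gb": lo * mem_gb_per_worker,
        "max_mem_gb": hi * mem_gb_per_worker,
    }


print(worker_capacity(cluster_spec))
```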

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Vivek Ranjan, checking in. If Joseph's answer helped, would you let us know and mark the answer as best? It would be really helpful for the other members to find the solution more quickly. Thanks!

3 More Replies
User16826988699
by New Contributor
  • 15286 Views
  • 2 replies
  • 2 kudos

Resolved! Problem with spinning up a cluster on a new workspace

Error: Please check network connectivity from the data plane to the control plane.
{ "reason": { "code": "BOOTSTRAP_TIMEOUT", "parameters": { "databricks_error_message": "[id: InstanceId(i-0457092c), status: INSTANCE_INITIALIZING, workerEnvId:...
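
Since the error points at connectivity from the data plane to the control plane, a basic TCP reachability check can help narrow things down. This is a minimal sketch, assuming it is run from a machine in the same subnet as the cluster nodes and that the control plane is reached over HTTPS on port 443; the function name `can_reach` is hypothetical, and the hostname in the comment is a placeholder rather than a real endpoint.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call (placeholder hostname; substitute your workspace's host):
# can_reach("my-workspace.example.com", 443)
```

A `False` result only shows that the TCP connection could not be opened; it does not say whether a security group, route table, or DNS setting is responsible.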

Latest Reply
User16725394280
Contributor II
  • 2 kudos

Can you please get the system logs from the AWS EC2 console as soon as the cluster fails? System logs for the failed instance will be accessible from the AWS console up to an hour after the shutdown. AWS console clears the references of terminated clusters ...

1 More Replies
User16826990884
by New Contributor III
  • 791 Views
  • 1 replies
  • 0 kudos

Rollback cluster changes

Is it possible to roll back changes made to a cluster? The problem I'm trying to solve is recovering from an accidental change made by a user on a cluster that affects interactive and job runs. Cluster policies help, but the policy still provides the ...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

You could look at automating the cluster creation steps and implementing this with an infrastructure-as-code solution like the Databricks Terraform provider, which allows rollback.
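
The idea behind the infrastructure-as-code approach is that the desired cluster configuration lives in version control, so an accidental change shows up as drift and can be reverted by reapplying the stored spec. Terraform does this comparison itself; the Python below is only an illustration of the concept, with a hypothetical helper `config_drift` and made-up values.

```python
def config_drift(desired: dict, actual: dict) -> dict:
    """Return {key: (desired_value, actual_value)} for keys in `desired` that differ in `actual`."""
    return {
        key: (want, actual.get(key))
        for key, want in desired.items()
        if actual.get(key) != want
    }


# Hypothetical version-controlled spec, and the live config after an accidental edit.
desired = {
    "num_workers": 4,
    "autotermination_minutes": 30,
    "spark_version": "10.4.x-scala2.12",
}
live = {**desired, "autotermination_minutes": 0}

print(config_drift(desired, live))
```

Here the only reported drift is `autotermination_minutes`, changed from 30 to 0; reapplying the stored spec would restore it.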

User16789201666
by Contributor II
  • 1503 Views
  • 4 replies
  • 0 kudos

How do you control the cost of provisioning a cluster?

How do you govern the cost of running clusters in Databricks so you're not sticker shocked?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Using interactive clusters less and job clusters more can be one way, among others.

3 More Replies