cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

wpenfold
by New Contributor II
  • 31153 Views
  • 5 replies
  • 2 kudos
  • 31153 Views
  • 5 replies
  • 2 kudos
Latest Reply
AmanSehgal
Honored Contributor III
  • 2 kudos

Using workspace API you can list out all the notebooks for a given user.The API response will tell you if the objects under the path is a folder or a notebook. If it's a folder then you can add it to the path and get notebooks within the folder.Put a...

  • 2 kudos
4 More Replies
bluetail
by Contributor
  • 15626 Views
  • 6 replies
  • 5 kudos

Resolved! ModuleNotFoundError: No module named 'mlflow' when running a notebook

I am running a notebook on the Coursera platform.my configuration file, Classroom-Setup, looks like this:%python   spark.conf.set("com.databricks.training.module-name", "deep-learning") spark.conf.set("com.databricks.training.expected-dbr", "6.4")   ...

  • 15626 Views
  • 6 replies
  • 5 kudos
Latest Reply
User16753724663
Valued Contributor
  • 5 kudos

Hi @Maria Bruevich​ ,From the error description, it looks like the mlflow library is not present. You can use ML cluster as these type of cluster already have mlflow library. Please check the below document:https://docs.databricks.com/release-notes/r...

  • 5 kudos
5 More Replies
User16752239289
by Databricks Employee
  • 3570 Views
  • 1 replies
  • 1 kudos

Resolved! SparkR session failed to initialize

When run sparkR.session()I faced below error:Spark package found in SPARK_HOME: /databricks/spark   Launching java with spark-submit command /databricks/spark/bin/spark-submit sparkr-shell /tmp/Rtmp5hnW8G/backend_porte9141208532d   Error: Could not f...

  • 3570 Views
  • 1 replies
  • 1 kudos
Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

This is due to the when users run their R scripts on Rstudio, the R session is not shut down gracefully. Databricks is working on handle the R session better and removed the limit. As a workaround, you can create and run below init script to increase...

  • 1 kudos
brickster_2018
by Databricks Employee
  • 2661 Views
  • 1 replies
  • 1 kudos

Resolved! How to run commands on the executor

Using %sh, I am able to run commands on the notebook and get output. How can i run a command on the executor and get the output. I want to avoid using the Spark API's

  • 2661 Views
  • 1 replies
  • 1 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

It's not possible to use %sh to run commands on the executor. The below code can be used to run commands on the executor and get the outputvar res=sc.runOnEachExecutor[String]({ () => import sys.process._ var cmd_Result=Seq("bash", "-c", "h...

  • 1 kudos
brickster_2018
by Databricks Employee
  • 1621 Views
  • 2 replies
  • 0 kudos

Resolved! Unable to run any commands on the cluster.

All the commands get canceled. even 1+1 is failing, the cluster is completely unusable.

  • 1621 Views
  • 2 replies
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

More details on similar issues here: https://kb.databricks.com/python/python-command-cancelled.html

  • 0 kudos
1 More Replies
User16826994223
by Honored Contributor III
  • 1311 Views
  • 1 replies
  • 0 kudos

Even the Unfinished Experiment in Mlflow is getting saved as finished

when I start the experiment with mlflow.start_run(),even if my script is interrupted or failed before executing mlflow.end_run() ,the run gets tagged as finished instead of unfinished , Can any one help why it is happening here

  • 1311 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

In note book the mlflow tagas ias the command travels and once failed or exit there itself it logs and finishes the experiment even if the noteboolsfails. However, if you want to continue logging metrics or artifacts to that run, you just need to use...

  • 0 kudos
aladda
by Databricks Employee
  • 19401 Views
  • 2 replies
  • 0 kudos
  • 19401 Views
  • 2 replies
  • 0 kudos
Latest Reply
aladda
Databricks Employee
  • 0 kudos

%run is copying code from another notebook and executing it within the one its called from. All variables defined in the notebook being called are therefore visible to the caller notebook dbutils.notebook.run() is more around executing different note...

  • 0 kudos
1 More Replies
Anonymous
by Not applicable
  • 1281 Views
  • 1 replies
  • 0 kudos
  • 1281 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16783855117
Contributor II
  • 0 kudos

It really depends on your business intentions! You can remove files no longer referenced by a Delta table and are older than the retention threshold by running the vacuum command on the table. vacuum is not triggered automatically. The default retent...

  • 0 kudos
Labels