As part of my batch processing I archive a large number of small files received from the source system each day using the dbutils.fs.mv command. This takes hours, as dbutils.fs.mv moves the files one at a time. How can I speed this up?
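One option, sketched below, is to keep using dbutils.fs.mv but issue the moves concurrently from a thread pool instead of a serial loop. The source and archive directory names are placeholders, not the original paths, and the worker count is just a starting point to tune.

```python
# A minimal sketch of parallelising the archive step with a thread pool, assuming the
# directory paths below are placeholders for your own locations.
from concurrent.futures import ThreadPoolExecutor

src_dir = "dbfs:/mnt/landing/incoming/"      # hypothetical source directory
archive_dir = "dbfs:/mnt/landing/archive/"   # hypothetical archive directory

files = dbutils.fs.ls(src_dir)               # list the small files to move

def move_file(f):
    # dbutils.fs.mv still moves one file per call; the speed-up comes from
    # issuing many of these calls concurrently instead of one after another.
    dbutils.fs.mv(f.path, archive_dir + f.name)

# 32 threads is an arbitrary starting point; tune to your cluster and storage account.
with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(move_file, files))
```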
When running Spark under YARN, each script has its own self-contained set of logs:- In Databricks all I see is a list of jobs and stages that have been run on the cluster:- From a support perspective this is a nightmare. How can notebook logs be grou...
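One workaround, sketched below, is to tag driver-side log output with the notebook that produced it, so lines can at least be grouped per notebook afterwards. The notebookPath() call goes through an internal dbutils context object that is not a documented API and may change between runtime versions.

```python
# A sketch of prefixing log lines with the notebook path so they can be grouped later.
import logging

# Relies on an internal/undocumented dbutils context object (assumption: available on
# your runtime version); it returns the path of the currently running notebook.
notebook_path = (
    dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()
)

logger = logging.getLogger(notebook_path)
logger.setLevel(logging.INFO)

handler = logging.StreamHandler()  # goes to the driver's stdout log
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")  # %(name)s carries the notebook path
)
logger.addHandler(handler)

logger.info("Starting daily load")  # each line is now attributable to this notebook
```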
I am using the Delta format and occasionally get the following error:- "xx.parquet referenced in the transaction log cannot be found. This occurs when data has been manually deleted from the file system rather than using the table `DELETE` statement" FS...
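If the underlying Parquet files really have been removed outside of Delta (for example by an external cleanup job), one commonly suggested repair is FSCK REPAIR TABLE, which drops the entries for missing files from the transaction log. A small sketch, with a placeholder table name; the DRY RUN form only reports what it would remove, so run that first.

```python
# Report which missing-file entries would be removed from the transaction log
# ("my_schema.my_table" is a placeholder table name).
spark.sql("FSCK REPAIR TABLE my_schema.my_table DRY RUN").show(truncate=False)

# Once the dry run output looks right, run the repair itself.
spark.sql("FSCK REPAIR TABLE my_schema.my_table")
```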
I have started getting an error message when running the following optimize command:-
deltaTable.optimize().executeCompaction()
Error:-
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Number of records changed after Optimi...
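As a first diagnostic step it can help to look at the recent Delta history around the failed OPTIMIZE, to see whether other writes or deletes were committing against the table at the same time. A small sketch, assuming a placeholder table name:

```python
# Inspect the last commits on the table to see which operations ran around the
# time the OPTIMIZE failed ("my_schema.my_table" is a placeholder).
from delta.tables import DeltaTable

deltaTable = DeltaTable.forName(spark, "my_schema.my_table")

(deltaTable.history(20)                      # last 20 commits
    .select("version", "timestamp", "operation", "operationMetrics")
    .show(truncate=False))
```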
I have created a number of workflows in the Databricks UI. I now need to deploy them to a different workspace. How can I do that? Code can be deployed via Git, but the job definitions are stored in the workspace only.
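One way to get the job definitions out of the source workspace is the Jobs REST API. A minimal export sketch; the workspace URL, token and job_id below are placeholders to replace with real values.

```python
# Export a job definition from the source workspace via the Jobs API.
import json
import requests

source_host = "https://<source-workspace>.cloud.databricks.com"  # placeholder
source_token = "<personal-access-token>"                         # placeholder
job_id = 123                                                      # placeholder job id

resp = requests.get(
    f"{source_host}/api/2.1/jobs/get",
    headers={"Authorization": f"Bearer {source_token}"},
    params={"job_id": job_id},
)
resp.raise_for_status()

# The "settings" block is the part that /api/2.1/jobs/create expects in another workspace.
job_settings = resp.json()["settings"]

with open(f"job_{job_id}.json", "w") as f:
    json.dump(job_settings, f, indent=2)
```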
The workflows are 'integrated' with Git, as in they get the code from a Git branch each time they run. I am not seeing an option to store the actual workflow definitions in Git (task name, script path, dependencies, retries, schedule, etc.). I have a w...
The only way I can find to move workflow jobs (schedules) to another workspace is:-
1) Create a job in the Databricks UI (Workflows -> Jobs -> Create Job)
2) Copy the JSON definition ("...", View JSON, Create, Copy)
3) Save the JSON locally or in the Gi...
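The re-creation in the target workspace can at least be scripted against the Jobs REST API from the saved JSON, along the lines of the sketch below. Host, token and file name are placeholders, and this assumes the saved JSON is the job "settings" payload as shown in the View JSON dialog.

```python
# Re-create a job in the target workspace from the saved JSON definition.
import json
import requests

target_host = "https://<target-workspace>.cloud.databricks.com"  # placeholder
target_token = "<personal-access-token>"                         # placeholder

with open("job_123.json") as f:          # the JSON saved from the source workspace
    job_settings = json.load(f)

resp = requests.post(
    f"{target_host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {target_token}"},
    json=job_settings,                   # the settings block becomes the create payload
)
resp.raise_for_status()
print("Created job id:", resp.json()["job_id"])
```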
I can't see anything in this link related to moving jobs (schedules) between environments. I have Git integration, but workflow jobs are not stored in Git.