Where to find documentation about: spark.databricks.driver.strace.enabled
10-17-2022 07:06 AM
Hello,
For a support request, Microsoft support asked me to add
spark.databricks.driver.strace.enabled true
to my cluster configuration.
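For what it's worth, after adding the line to the cluster's Spark config, it can be checked from a notebook attached to the cluster (a minimal sketch, assuming the spark session that Databricks predefines in notebooks):

```python
# Prints the flag's value if the cluster's Spark config set it,
# otherwise falls back to the "not set" default.
print(spark.conf.get("spark.databricks.driver.strace.enabled", "not set"))
```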
Microsoft was not able to send me a link to any documentation, and I did not find it on the Databricks website.
Can someone help me find documentation about this parameter?
10-17-2022 11:49 PM
Hi @oliv vier, could you please give us a little more context on why Microsoft asked you to set this Spark configuration?
10-18-2022 12:59 AM
Yes, no problem.
I have a Python program, called "post ingestion", that runs on a Databricks job cluster during the night and consists of:
- inserting data into a Delta Lake table
- executing an OPTIMIZE command on that table
- executing a VACUUM command on that table
- then using a dbutils command to copy the folder containing the data of this Delta table to another folder, because I dispatch data to a lab and a qal Databricks workspace (a simplified sketch of the job is below)
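Roughly, the job looks like this. All table and folder names here are placeholders, not the real paths; spark and dbutils are the objects Databricks provides to the job:

```python
# Hypothetical names throughout; the real table and ADLS paths differ.
new_df = spark.read.format("parquet").load("/mnt/staging/as400")

# 1. Insert the new data into the Delta table.
new_df.write.format("delta").mode("append").saveAsTable("curated.as400_table")

# 2. Compact small files.
spark.sql("OPTIMIZE curated.as400_table")

# 3. Drop data files no longer referenced by the table.
spark.sql("VACUUM curated.as400_table")

# 4. Copy the table folder (including _delta_log) to the other workspace's folder.
dbutils.fs.cp("/mnt/curated/as400_table", "/mnt/lab/as400_table", recurse=True)
```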
Sometimes the copy fails with a 404 error:
java.io.FileNotFoundException: Operation failed: 'The specified path does not exist.', 404, GET, https://icmfcprddls001.dfs.core.windows.net/prd/curated/common/AS400/AS400.BC300.FT3RCPV/_delta_log/..., PathNotFound, 'The specified path does not exist. RequestId:e31cc75e-b01f-0042-1858-c23924000000 Time:2022-09-07T01:26:06.8939739Z
When the error occurs, no other program is using the Delta table.
In the morning, when I re-run the copy, everything runs fine.
They asked me to add this parameter to get more detail about Spark's operations, but I want to know exactly what this parameter does, and Microsoft was not able to give me more information.
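Since a later re-run always succeeds, the error looks transient. A simple retry around the copy would look something like this (a sketch with placeholder paths, not a confirmed fix):

```python
import time

# Retry the copy a few times, on the assumption that the 404 is transient.
def copy_with_retry(src, dst, attempts=3, delay_s=60):
    for attempt in range(1, attempts + 1):
        try:
            dbutils.fs.cp(src, dst, recurse=True)
            return
        except Exception as e:
            if attempt == attempts:
                raise
            print(f"Copy attempt {attempt} failed ({e}); retrying in {delay_s}s")
            time.sleep(delay_s)

copy_with_retry("/mnt/curated/as400_table", "/mnt/lab/as400_table")
```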
11-03-2022 01:44 AM
I would not use dbutils for copies in production, as it uses only one core of the driver. Instead, why not trigger an Azure Data Factory pipeline for the copy? It offers much higher throughput, and it will be easier to analyze the copy results (a sketch of triggering a pipeline from Python is below).
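For example, assuming an ADF pipeline that does the copy already exists (all names below are placeholders, and the azure-identity and azure-mgmt-datafactory packages must be installed):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Authenticate and point the client at the subscription (placeholder id).
credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Kick off the (hypothetical) copy pipeline and keep the run id for monitoring.
run = adf_client.pipelines.create_run(
    resource_group_name="<resource-group>",
    factory_name="<data-factory-name>",
    pipeline_name="copy-curated-to-lab",  # hypothetical pipeline name
)
print("Started ADF run:", run.run_id)

# The run's status (Queued/InProgress/Succeeded/Failed) can then be polled:
status = adf_client.pipeline_runs.get(
    "<resource-group>", "<data-factory-name>", run.run_id
).status
print("Run status:", status)
```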
11-16-2022 01:55 AM
One core is not a problem for me, and I do not want to stack services.
Before changing our whole architecture, I will try to find a solution.

