12-06-2021 09:29 AM
When you use:
from pyspark import SparkFiles
spark.sparkContext.addFile(url)
it adds the file to the non-DBFS path /local_disk0/, but then when you want to read the file:
spark.read.json(SparkFiles.get("file_name"))
it tries to read it from /dbfs/local_disk0/. I also tried file:// and many other creative approaches, and none of them worked.
Of course, it works after using %sh cp to copy the file from /local_disk0/ to /dbfs/local_disk0/.
It looks like a bug, as if addFile was switched to DBFS on Azure Databricks but SparkFiles was not (in original Spark, addFile adds files to the workers and SparkFiles gets them from there).
I also couldn't find any setting to manually specify the root directory for SparkFiles.
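For reference, the %sh cp workaround described above can also be done from Python. This is only a minimal sketch, not an official fix: `mirror_under_dbfs` is a hypothetical helper name, and it assumes the DBFS FUSE mount is available at /dbfs on the driver.

```python
import os
import shutil


def mirror_under_dbfs(local_path: str, dbfs_root: str = "/dbfs") -> str:
    """Copy a local file to the same relative location under dbfs_root.

    Hypothetical helper: /local_disk0/x/data.json becomes
    /dbfs/local_disk0/x/data.json, which mirrors the manual cp workaround.
    """
    # Drop the leading slash so os.path.join keeps dbfs_root as the prefix.
    target = os.path.join(dbfs_root, local_path.lstrip("/"))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copy(local_path, target)
    return target
```

On a cluster this would presumably be called as `mirror_under_dbfs(SparkFiles.get("file_name"))` right after addFile and before `spark.read.json(SparkFiles.get("file_name"))`.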
Labels: Azure, Azure Databricks
Accepted Solutions
03-14-2022 08:20 AM
@Hubert Dudek
Have you tried with file:/// ?
As far as I remember, starting with Spark 3.2, it honors the native Hadoop file system if no file access protocol is defined.
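A minimal sketch of the file:/// suggestion, assuming the path returned by SparkFiles.get is an absolute path on the driver's local disk. `as_local_file_uri` and `read_added_json` are hypothetical helper names, and whether the read succeeds may depend on the runtime version and cluster setup.

```python
def as_local_file_uri(path: str) -> str:
    """Return path with an explicit file: scheme (hypothetical helper)."""
    if path.startswith("file:"):
        return path  # already has an explicit scheme
    if not path.startswith("/"):
        raise ValueError("expected an absolute local path")
    # "file://" plus the leading "/" of the path gives the file:/// form.
    return "file://" + path


def read_added_json(spark, url: str, file_name: str):
    """Add url via addFile, then read it back through a file:/// URI.

    Assumes file_name is the name under which addFile stored the download.
    """
    # Imported lazily so as_local_file_uri can be used without Spark installed.
    from pyspark import SparkFiles

    spark.sparkContext.addFile(url)
    local_path = SparkFiles.get(file_name)
    return spark.read.json(as_local_file_uri(local_path))
```

The explicit scheme is meant to stop the reader from resolving the path against DBFS, which appears to be what happens when no scheme is given.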
12-13-2021 05:46 AM
Hello.
I'm in the same situation. Data extraction via an API using SparkFiles runs without error in Databricks Community Edition, but in Azure it generates the error mentioned above.
12-13-2021 11:50 AM
In Azure, it generates the mentioned error for me too.
12-13-2021 11:53 AM
@Kaniz Fatma @Piper Wilson can you help escalate this issue, as more people are reporting it?
12-13-2021 12:07 PM
Hello everyone
This problem is happening to me too, in Azure. It would be great if somebody could help us.
12-13-2021 01:43 PM
@Hubert Dudek - You got it!
12-14-2021 03:59 AM
Hi, I'm new here and I have a question. Will the bug only be looked at if there are votes, comments, and views?
12-14-2021 04:52 AM
Someone should get back to us.
12-14-2021 11:34 AM
@Kaniz Fatma (Databricks) @Piper (Customer)
Hi, how are you?
Is there a solution for this problem?
12-14-2021 12:05 PM
@Prabakar Ammeappin @Werner Stinckens @Jose Gonzalez maybe you could look at this issue as well
12-14-2021 12:59 PM
@Hubert Dudek, @Dev John, @Marcos Gois, @Jorge Fernandes, and @welder martins - Are you able to open a support ticket here - https://help.databricks.com/s/contact-us?
12-14-2021 02:15 PM
OK, it's a pity it can't be solved here. But the ticket has been opened, and I'll post news on the progress.
Thanks.
12-15-2021 10:54 AM
@welder martins - Thank you for opening the ticket. We want to cover all our bases.
12-28-2021 02:53 AM
Yes, this solution was already discussed on Stack Overflow. The problem is that this Spark functionality should be adjusted in DBR to handle everything automatically via DBFS, and it seems that it was only partly adjusted.
01-24-2022 03:49 AM
Hello everyone, any news?
Thanks.
@Kaniz Fatma