"with open" not working with Shared Access Cluster on mounted location

mathijs-fish
New Contributor III

Hi All,

For an application that we are building, we need an encoding detector/UTF-8 enforcer. For this, we used the Python library chardet in combination with "with open". We open a file from a mounted ADLS location (we use a legacy Hive metastore).

When we were using No Isolation Shared clusters this worked fine, but for security reasons we have to change to Shared Access clusters. However, the encoding detector is no longer working.

This is how we detected encoding before:

[screenshot: the original encoding-detection code using "with open" and chardet]
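
Roughly, it was of this shape (a sketch only; the mount path and byte count are placeholders, not the exact values from the screenshot):

import chardet

# Placeholder path on the mounted ADLS location, accessed through the /dbfs FUSE mount
file_path = "/dbfs/mnt/art/inbound/example.csv"

# Read a chunk of raw bytes and let chardet guess the encoding
with open(file_path, "rb") as f:
    rawdata = f.read(500000)

encoding = chardet.detect(rawdata)["encoding"]
print(encoding)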

Error using shared access cluster:

[screenshot: FileNotFoundError: [Errno 2] No such file or directory, raised for the /dbfs mount path]

After some investigation, we concluded that "with open", as well as the os and glob modules, does not work properly on mounted locations with a Shared Access cluster (examples of the failing calls are sketched below). Any idea how we can fix this?

For your reference, we have to use this mounted location, and a shared access cluster.
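
To illustrate the os/glob part, these are the kinds of calls that stopped working on the shared access cluster (sketch; the mount path is a placeholder):

import glob
import os

# Local-filesystem style access through the /dbfs FUSE path (placeholder mount path)
print(os.listdir("/dbfs/mnt/art/inbound"))
print(glob.glob("/dbfs/mnt/art/inbound/*.csv"))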

 

3 REPLIES

Ayushi_Suthar
Honored Contributor

Hi @mathijs-fish, I completely understand your hesitation and appreciate your approach to seeking guidance!

I see you are trying to access external files from a DBFS mount location. As the snapshots you shared indicate, the reason for the error below when accessing the mounted file with "with open" is that you are using a shared access mode cluster.

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/art/inbound.test/EUT_Alignment_20230630_20230712130221.csv'

This is a known limitation of Shared Clusters, where the /dbfs path is not accessible. You can try using a single-user access mode cluster instead, which supports Unity Catalog and allows access to /dbfs.

Please refer to:
https://docs.databricks.com/clusters/configure.html#shared-access-mode-limitations
https://docs.databricks.com/en/dbfs/unity-catalog.html#how-does-dbfs-work-in-shared-access-mode

We also have a preview feature, 'Improved Shared Clusters', that addresses some of the limitations of Shared Clusters.
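
As an aside, the mount itself may still be reachable from a shared access cluster through the dbutils/Spark file APIs rather than the /dbfs local path (a sketch only; the mount path is a placeholder and behavior depends on your workspace configuration):

# Sketch: listing and previewing files on the mount via dbutils instead of os/glob
files = dbutils.fs.ls("/mnt/art/inbound")            # placeholder mount path
for f in files:
    print(f.path, f.size)

# Preview the first 1024 bytes of a file without using the /dbfs local path
print(dbutils.fs.head("/mnt/art/inbound/example.csv", 1024))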

Leave a like if this helps; follow-ups are appreciated.

Kudos,

Ayushi

mathijs-fish
New Contributor III (Accepted Solution)

@Ayushi_Suthar Thanks! However, this does not solve the issue, because we have to use shared clusters. In the meantime, we found a way of detecting the encoding on shared clusters:

import chardet

# Read the first 500 KB of the file as raw bytes via Spark, then let chardet guess the encoding
rawdata = (
    spark.read.format("binaryFile")
    .load(file_path)
    .selectExpr("SUBSTR(content, 0, 500000) AS content")
    .collect()[0]
    .content
)
encoding = chardet.detect(rawdata)["encoding"]
print(encoding)
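
To then enforce UTF-8, the detected encoding can be applied to the same bytes (a sketch; it falls back to UTF-8 if detection returns None):

# Decode with the detected encoding and re-encode as UTF-8
text = rawdata.decode(encoding or "utf-8", errors="replace")
utf8_bytes = text.encode("utf-8")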

Ayushi_Suthar
Honored Contributor

Hi @mathijs-fish, thank you for sharing the solution. It will help us update our records and documentation, enabling us to assist other customers more effectively in similar cases.
