Data Engineering

Different configurations for same Databricks Runtime version

yoniau
New Contributor II

Hi all,

On my DBR installations, the s3a scheme is mapped to shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem. On my customer's DBR installations, it is mapped to com.databricks.s3a.S3AFileSystem.

We both use the same DBR version, and neither of us has configured anything to override this setting.

What causes this difference? How can I make sure I'm using the right filesystem? And how can I make sure that no third filesystem appears in the future and breaks my code again?

1 ACCEPTED SOLUTION

Accepted Solutions

Prabakar
Databricks Employee

@Yoni Au, if both of you are using the same DBR version, then you should not see any difference. As @Hubert Dudek mentioned, there might be a Spark configuration change on one of the clusters. It's also worth checking for any cluster-scoped or global init scripts.
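
To guard against a different implementation showing up later, one option is a fail-fast check at job startup. This is only a minimal sketch: `check_s3a_impl` and `EXPECTED_S3A_IMPLS` are hypothetical names rather than Databricks APIs, and the expected class is an assumption that you would replace with whichever implementation your code was tested against.

```python
# Assumption: the implementation(s) your code was tested against.
EXPECTED_S3A_IMPLS = {"com.databricks.s3a.S3AFileSystem"}


def check_s3a_impl(actual, expected=EXPECTED_S3A_IMPLS):
    """Return `actual` if it is an expected class name, otherwise raise."""
    if actual not in expected:
        raise RuntimeError(
            f"Unexpected s3a filesystem implementation: {actual!r}; "
            f"expected one of {sorted(expected)}"
        )
    return actual


# In a notebook, the effective value could be read from the Hadoop
# configuration, for example (this relies on PySpark's private `_jsc` handle):
#   actual = spark.sparkContext._jsc.hadoopConfiguration().get("fs.s3a.impl")
#   check_s3a_impl(actual)
```

Running this at the start of a job makes a changed mapping fail loudly instead of surfacing later as an obscure error.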


2 REPLIES

Hubert-Dudek
Esteemed Contributor III

@Yoni Au, Databricks Runtime 7.3 LTS and above use the new connector com.databricks.s3a.S3AFileSystem. Are you using 7.3?

In any case, please check the Spark config on both installations (Cluster -> Spark UI -> Environment) for anything related to S3AFileSystem, and then set the same values on both (Cluster -> Configuration -> Advanced options).
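
If comparing the two Environment tabs by eye is tedious, the comparison can also be scripted. The sketch below is only an illustration: `s3a_settings` and `diff_settings` are hypothetical helper names, and the sample values are made up to show the output format. In a notebook, the input pairs could come from `spark.sparkContext.getConf().getAll()`, which returns a list of (key, value) tuples.

```python
def s3a_settings(conf_pairs):
    """Keep only entries whose key or value mentions 's3a' (case-insensitive)."""
    return {
        key: value
        for key, value in conf_pairs
        if "s3a" in key.lower() or "s3a" in str(value).lower()
    }


def diff_settings(mine, theirs):
    """Return {key: (my_value, their_value)} for every key whose values differ."""
    keys = sorted(set(mine) | set(theirs))
    return {
        key: (mine.get(key), theirs.get(key))
        for key in keys
        if mine.get(key) != theirs.get(key)
    }


# Hypothetical sample data standing in for the output of getConf().getAll()
# on each cluster; real clusters will have many more entries.
my_cluster = [
    ("spark.hadoop.fs.s3a.impl",
     "shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem"),
    ("spark.app.name", "Databricks Shell"),
]
customer_cluster = [
    ("spark.app.name", "Databricks Shell"),
]

differences = diff_settings(s3a_settings(my_cluster), s3a_settings(customer_cluster))
for key, (mine, theirs) in differences.items():
    print(f"{key}: mine={mine!r} customer={theirs!r}")
```

A key that appears on only one side shows `None` for the other, which points to an override set in the cluster config or by an init script.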
