Using dbutils.fs.ls on URI with square brackets results in error

Josh_Stafford
New Contributor II

Square brackets in ADLS are accepted, so why can't I list the files in the folder? I have tried escaping the square brackets manually, but then the escaped values are re-escaped from %5B to %255B and %5D to %255D.

I get:

URISyntaxException: Illegal character in path at index 188: abfss://container-name@***[***]***/

Actual path replaced by ***.
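
A note on the re-escaping described above: the jump from %5B to %255B is what you would expect if some layer percent-encodes a string that is already percent-encoded, because the "%" itself becomes "%25". The sketch below reproduces that effect with Python's urllib.parse.quote. It is only an illustration, assuming a hypothetical folder name (folder[2023]); it does not show what dbutils or the ABFS driver actually do internally.

```python
from urllib.parse import quote

# Hypothetical folder name, used only for illustration.
raw_segment = "folder[2023]"

# First pass: brackets are percent-encoded.
encoded_once = quote(raw_segment)

# Second pass over the already-encoded string: the "%" is encoded as "%25",
# so "%5B" turns into "%255B" and "%5D" into "%255D".
encoded_twice = quote(encoded_once)

print(encoded_once)   # folder%5B2023%5D
print(encoded_twice)  # folder%255B2023%255D
```

This is consistent with the manually escaped path being encoded a second time somewhere between the call and the storage request.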

2 REPLIES

Anonymous
Not applicable

@Joshua Stafford:

The URISyntaxException you are seeing is most likely caused by the square brackets. "[" and "]" are reserved characters in URIs (Uniform Resource Identifiers) and must be percent-encoded when they appear in a path. In this case the brackets in the path are not encoded, so the URI parser rejects them.

To resolve this, you can try percent-encoding the square brackets manually: replace "[" with "%5B" and "]" with "%5D" in the URI. However, it seems you have already tried this and ran into double encoding of the "%" character.
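
To make the "illegal character at index N" message more concrete, here is a minimal sketch of a path check based on the characters RFC 3986 allows in a path. The function name first_illegal_path_index and the sample path are hypothetical. The exception itself comes from Java's URI parser, whose rules differ in some details, so treat this as an approximation rather than the exact check that runs on the cluster.

```python
import re

# Characters RFC 3986 allows unescaped in a path segment, plus "/" as the
# segment separator. Square brackets are deliberately absent.
_ALLOWED_CHAR = re.compile(r"[A-Za-z0-9\-._~!$&'()*+,;=:@/]")
# A well-formed percent-encoded triplet such as %5B.
_PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def first_illegal_path_index(path):
    """Return the index of the first character not allowed in a URI path,
    or None if every character is allowed."""
    i = 0
    while i < len(path):
        if _PCT_ENCODED.match(path, i):
            i += 3  # skip the whole %XX triplet
        elif _ALLOWED_CHAR.match(path[i]):
            i += 1
        else:
            return i
    return None


# Hypothetical paths, used only for illustration.
print(first_illegal_path_index("/data/folder[2023]/"))      # 12
print(first_illegal_path_index("/data/folder%5B2023%5D/"))  # None
```

The raw bracket is reported at its position in the string, much like the index in the exception message, while the percent-encoded form passes the check.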

In this case, you can try using the unquote() function from the urllib.parse module in Python to decode the URI before passing it to dbutils.fs.ls(). Here's an example:

python

from urllib.parse import unquote
 
# Example URI with encoded square brackets
uri = "abfss://container-name@***%5B***%5D***"
 
# Decode the URI
decoded_uri = unquote(uri)
 
# Use decoded URI in dbutils.fs.ls()
dbutils.fs.ls(decoded_uri)

This should properly decode the URI and allow you to list the files in the folder using dbutils.fs.ls() without encountering the URISyntaxException error. Note that you may need to adjust the encoding format depending on the specific requirements of your ADLS (Azure Data Lake Storage) environment.

Hi Suteja, this looks like it might work, but I am using Scala. Is there an equivalent function in Scala? Or is the source code available anywhere so I can translate it?

edit:

I take that back. I tried this in Python and it gives the same error as if I had not "unquoted" the URI. The [ and ] are left untouched when I print out the unquoted URI.
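
The behaviour reported in this edit can be reproduced without a cluster. unquote() only reverses percent-encoding, so it either turns %5B and %5D back into raw brackets or, if the brackets are already raw, leaves the string unchanged. Either way the string passed to dbutils.fs.ls() contains "[" and "]" again. A small sketch, using a hypothetical folder name:

```python
from urllib.parse import unquote

# Hypothetical folder name, used only for illustration.
encoded = "folder%5B2023%5D"
raw = "folder[2023]"

# Decoding the encoded form restores the raw brackets.
decoded_from_encoded = unquote(encoded)

# Decoding a string with raw brackets is a no-op.
decoded_from_raw = unquote(raw)

print(decoded_from_encoded)  # folder[2023]
print(decoded_from_raw)      # folder[2023]
```

Both results are the same bracketed string that triggered the URISyntaxException in the first place, which matches the observation above.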
