Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Using dbutils.fs.ls on URI with square brackets results in error

Josh_Stafford
New Contributor II

Square brackets in ADLS are accepted, so why can't I list the files in the folder? I have tried escaping the square brackets manually, but then the escaped values are re-escaped from %5B to %255B and %5D to %255D.

I get:

URISyntaxException: Illegal character in path at index 188: abfss://container-name@***[***]***/

Actual path replaced by ***.

2 REPLIES

Anonymous
Not applicable

@Joshua Stafford:

The URISyntaxException is most likely raised because square brackets are reserved characters in URIs (Uniform Resource Identifiers) and must be percent-encoded when they appear in a path. Here the brackets appear to reach the URI parser unencoded, which triggers the error.

To resolve this, you can try percent-encoding the square brackets manually: replace "[" with "%5B" and "]" with "%5D" in the URI. However, you have already tried this and found that the "%" is then encoded a second time, turning %5B into %255B and %5D into %255D.
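The double encoding can be reproduced outside Databricks. The following is a minimal sketch using Python's standard urllib.parse.quote; the folder name is hypothetical, and this only illustrates how percent-encoding behaves when applied twice, not what dbutils.fs.ls() does internally.

```python
from urllib.parse import quote

# Hypothetical folder name containing square brackets (not the real path).
folder = "reports[2023]"

# First pass: "[" becomes %5B and "]" becomes %5D.
encoded_once = quote(folder)

# Second pass: the "%" itself is encoded as %25, giving %255B and %255D.
encoded_twice = quote(encoded_once)

print(encoded_once)   # reports%5B2023%5D
print(encoded_twice)  # reports%255B2023%255D
```

If something downstream encodes the path again, a manually escaped %5B would end up as %255B in this way, which matches the symptom described in the question.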

In that case, you can try the unquote() function from Python's urllib.parse module to decode the URI before passing it to dbutils.fs.ls(). Here's an example:

python

from urllib.parse import unquote
 
# Example URI with encoded square brackets
uri = "abfss://container-name@***%5B***%5D***"
 
# Decode the URI
decoded_uri = unquote(uri)
 
# Use decoded URI in dbutils.fs.ls()
dbutils.fs.ls(decoded_uri)

This should decode the URI so that dbutils.fs.ls() can list the files in the folder without raising the URISyntaxException. Note that you may need to adjust the encoding depending on the specific requirements of your ADLS (Azure Data Lake Storage) environment.

Hi Suteja, this looks like it might work, but I am using Scala. Is there an equivalent function in Scala? Or is the source code available anywhere so I can translate it?

edit:

I take that back. I tried this in Python and it gives the same error as if I had not "unquoted" the URI. When I print the unquoted URI, the [ and ] are still there as literal characters.
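This outcome is consistent with what unquote() is documented to do: it reverses percent-encoding, so an escaped path comes back with its literal brackets. A minimal sketch, assuming a hypothetical abfss path (the account, container, and folder names below are illustrative, not the real ones):

```python
from urllib.parse import unquote

# Hypothetical path with literal square brackets (illustrative only).
raw_path = "abfss://container@account.dfs.core.windows.net/data[2023]/"

# Manually escape the brackets, as tried in the original question.
escaped_path = raw_path.replace("[", "%5B").replace("]", "%5D")

# unquote() turns %5B and %5D back into "[" and "]".
decoded_path = unquote(escaped_path)

print(escaped_path)
print(decoded_path)
print(decoded_path == raw_path)  # True: decoding restores the original string
```

Since the decoded string is identical to the unescaped path, passing it to dbutils.fs.ls() would be expected to hit the same URISyntaxException as passing the original path.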
