
databricks-connect, dbutils, abfss path, URISyntaxException

KrzysztofPrzyso
New Contributor II

When trying to use `dbutils.fs.cp` in the #databricks-connect context to upload files to Azure Data Lake Gen2, I get a malformed URI error.

I have used the code provided here:
https://learn.microsoft.com/en-gb/azure/databricks/dev-tools/databricks-connect/python/databricks-ut...

 

from databricks.sdk import WorkspaceClient
w = WorkspaceClient() 
path = r"abfss://bronze@devstorageacc.dfs.core.windows.net/test/"
w.dbutils.fs.cp('dbfs:/config.json', path)

Error:

```
databricks.sdk.errors.mapping.InvalidParameterValue: java.net.URISyntaxException: Relative path in absolute URI: abfss:%5Cbronze@devstorageacc.dfs.core.windows.net%5Ctest
```


The standard `dbutils.fs.cp` works on the cluster without problems. I have positively confirmed access rights.

Possibly it is a known issue described here: databricks-connect : Relative path in absolute URI · Issue #2883 · sparklyr/sparklyr (github.com)

1 REPLY

Kaniz
Community Manager

Hi @KrzysztofPrzyso, it appears that you're encountering a "Relative path in absolute URI" error when using dbutils.fs.cp through Databricks Connect to upload files to Azure Data Lake Gen2.

Let's break down the problem and explore potential solutions.

  1. Error Explanation: The error message you received indicates a malformed URI:

    databricks.sdk.errors.mapping.InvalidParameterValue: java.net.URISyntaxException: Relative path in absolute URI: abfss:%5Cbronze@devstorageacc.dfs.core.windows.net%5Ctest
    
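  2. Root Cause: The URI in the error is not the one you passed. Every / in your destination has become %5C, the URL-encoded backslash, and the // after abfss: has collapsed into a single separator. Without the // there is no authority part, and what follows the scheme no longer starts with /, so Java rejects it as a relative path inside an absolute URI. This suggests the path is being rewritten with Windows-style separators on the client side before the request is sent, although that is an inference from the error text rather than something confirmed. Let's delve into the details.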

  3. Absolute vs. Relative Paths:

    • An absolute path specifies the complete location of a file or directory from the root directory.
    • A relative path is specified relative to the current working directory.
  4. Solution: To resolve this, make sure the URI that reaches the workspace is still absolute. Here are some steps to consider:

    • Check Your Path: The source ('dbfs:/config.json') is already absolute: it has a scheme followed by a path that starts with /. The destination is the one that arrives malformed, so compare the string you pass with the one shown in the error message.

    • Keep the Scheme on DBFS Paths: Do not switch the source to the /dbfs/... form. That is the local mount point used by file APIs on the cluster; dbutils.fs treats a path without a scheme as a DBFS path, so '/dbfs/config.json' would point at dbfs:/dbfs/config.json. The call should stay as:

      w.dbutils.fs.cp('dbfs:/config.json', path)
      
    • URL Encoding: The %5C sequences in the error are URL-encoded backslashes. Your path contains only forward slashes, so the backslashes are introduced after you pass it, most likely by path handling on a Windows client. Encoding the path yourself will not help.

    • Documentation Reference: Refer to the Databricks documentation for more details on working with DBFS paths.

  5. Known Issue: You mentioned a known issue related to relative paths in absolute URIs. If you suspect this is the case, consider checking the GitHub issue you referenced: databricks-connect : Relative path in absolute URI.
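One way to see how a string like the one in the error could arise: the sketch below pushes the original destination through Python's Windows path type and then URL-encodes the result. It only illustrates the hypothesis that the path is being handled as a Windows file path somewhere on the client; it is not taken from the SDK source, and the helper name mangle_like_windows is made up for this example.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def mangle_like_windows(uri: str) -> str:
    """Hypothetical helper: treat a URI as a Windows path, then URL-encode it.

    PureWindowsPath converts '/' to '\\', collapses repeated separators and
    drops the trailing one. It is a pure path class, so this runs on any OS.
    """
    as_windows_path = str(PureWindowsPath(uri))
    # Keep ':' and '@' unescaped so only the backslashes are percent-encoded.
    return quote(as_windows_path, safe=":@")


original = "abfss://bronze@devstorageacc.dfs.core.windows.net/test/"
print(mangle_like_windows(original))
```

If the printed value matches the URI in your error message, the destination is probably being converted to a Windows path before it leaves your machine. Running the same call from a Linux or macOS client, or from WSL, would be one way to test that.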

Remember that with Databricks Connect your code runs locally while dbutils.fs operations are executed against the remote workspace, so local file paths won't work directly. You need to upload files to the Databricks file system first. If you encounter further issues, explore the documentation for additional insights.
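While debugging, a small local check can tell you whether a given string has the shape that triggers this exception: a scheme followed by a path that does not start with /. The sketch below is a rough approximation of that rule written with Python's standard library, not the actual Java validation, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_relative_path_in_absolute_uri(uri: str) -> bool:
    """Return True if the URI has a scheme but a non-empty path without a leading '/'."""
    parts = urlsplit(uri)
    has_scheme = bool(parts.scheme)
    has_relative_path = bool(parts.path) and not parts.path.startswith("/")
    return has_scheme and has_relative_path


# The destination as written in the question: scheme, authority, absolute path.
print(is_relative_path_in_absolute_uri(
    "abfss://bronze@devstorageacc.dfs.core.windows.net/test/"))

# The URI reported in the error: scheme, no authority, path starting with %5C.
print(is_relative_path_in_absolute_uri(
    "abfss:%5Cbronze@devstorageacc.dfs.core.windows.net%5Ctest"))
```

The first call reports a well-formed URI and the second reports the malformed one, which is consistent with the path being fine when you pass it and broken by the time it is parsed.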

 