Cannot get past Query Data tutorial for Azure Databricks

philipkd
New Contributor III

I created a new workspace on Azure Databricks, and I can't get past this first step in the tutorial:

 

DROP TABLE IF EXISTS diamonds;

CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true")

 

error:

UnityCatalogServiceException: [RequestId=1a282f93-a134-4639-ac88-65798d67c924 ErrorClass=INVALID_PARAMETER_VALUE] GenerateTemporaryPathCredential uri /databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv is not a valid URI. Error message: INVALID_PARAMETER_VALUE: Missing cloud file system scheme.

Adding dbfs: to the path, as suggested elsewhere, didn't fix the issue. Instead I got:

AnalysisException: [UC_FILE_SCHEME_FOR_TABLE_CREATION_NOT_SUPPORTED] Creating table in Unity Catalog with file scheme dbfs is not supported. Instead, please create a federated data source connection using the CREATE CONNECTION command for the same table provider, then create a catalog based on the connection with a CREATE FOREIGN CATALOG command to reference the tables therein.

I also browsed the file system (with ls commands) and saw the file there.

Note: I got this working properly on AWS Databricks


Kaniz
Community Manager

Hi @philipkd, it appears you've encountered an issue while creating a table in Azure Databricks with Unity Catalog.

Let’s address this step by step:

  1. URI Format: The first error message says that the URI for your CSV file is missing a cloud file system scheme. Unity Catalog expects the path to carry an explicit scheme (such as dbfs:/, adl://, or wasbs://). For a file in the Databricks File System (DBFS), that scheme is dbfs:/.

  2. Correcting the Path: Adding the dbfs:/ scheme to the file path in your CREATE TABLE statement gives:

    CREATE TABLE diamonds
    USING CSV
    OPTIONS (path "dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true");
    
  3. Unity Catalog and File Schemes:

    • As your second error shows, Unity Catalog does not support CREATE TABLE over a path that uses the dbfs:/ file scheme, so the statement above still fails in a Unity Catalog schema.
    • The error message instead suggests creating a federated data source connection with the CREATE CONNECTION command for the same table provider (in this case, CSV).
    • It then suggests creating a catalog based on that connection with the CREATE FOREIGN CATALOG command to reference the tables therein.
  4. AWS Databricks vs. Azure Databricks: Both errors are raised by Unity Catalog rather than by Azure itself, so the difference you saw on AWS Databricks most likely comes from how each workspace is configured (for example, whether the table is created in a Unity Catalog schema), not from the cloud provider.

  5. Additional Troubleshooting:

    • Ensure that the file exists at the specified path.
    • Verify that you have the necessary permissions to access the file.
    • Double-check the syntax of your SQL statements.
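
If pointing CREATE TABLE at the file keeps failing, one possible workaround is to read the CSV through the DataFrame reader and save the result as a managed table, so the table definition never references the dbfs:/ path. The following is a minimal sketch, not a verified fix for this workspace: the helper name csv_to_managed_table is hypothetical, and it assumes the cluster can read /databricks-datasets and that you are allowed to create tables in the current schema.

```python
def csv_to_managed_table(spark, csv_path, table_name):
    """Load a CSV file into a DataFrame and save it as a managed table.

    `spark` is the SparkSession that Databricks notebooks predefine.
    The helper name and its arguments are illustrative assumptions.
    """
    df = (
        spark.read.format("csv")
        .option("header", "true")       # first row holds the column names
        .option("inferSchema", "true")  # let Spark guess the column types
        .load(csv_path)
    )
    # "overwrite" plays the role of DROP TABLE IF EXISTS in the tutorial.
    df.write.mode("overwrite").saveAsTable(table_name)
```

In a notebook this would be called as csv_to_managed_table(spark, "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", "diamonds"). Because the data is copied into a managed table, later queries no longer depend on the original file path.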

Feel free to adjust the path and try creating the table again.

If you encounter any further issues, don’t hesitate to ask for assistance! 😊
