Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Cannot get past Query Data tutorial for Azure Databricks

philipkd
New Contributor III

I created a new workspace on Azure Databricks, and I can't get past this first step in the tutorial:

 

DROP TABLE IF EXISTS diamonds;

CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true")

 

error:

UnityCatalogServiceException: [RequestId=1a282f93-a134-4639-ac88-65798d67c924 ErrorClass=INVALID_PARAMETER_VALUE] GenerateTemporaryPathCredential uri /databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv is not a valid URI. Error message: INVALID_PARAMETER_VALUE: Missing cloud file system scheme.

Adding the dbfs: scheme, as suggested here, didn't fix my issue. Instead I got:

AnalysisException: [UC_FILE_SCHEME_FOR_TABLE_CREATION_NOT_SUPPORTED] Creating table in Unity Catalog with file scheme dbfs is not supported. Instead, please create a federated data source connection using the CREATE CONNECTION command for the same table provider, then create a catalog based on the connection with a CREATE FOREIGN CATALOG command to reference the tables therein.

I also browsed the file system (with ls commands) and saw the file there.

Note: I got this working properly on AWS Databricks

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @philipkd, it appears you’ve encountered an issue while creating a table in Azure Databricks with Unity Catalog.

Let’s address this step by step:

  1. URI Format: The first error message says the path to your CSV file is missing a cloud file system scheme. Unity Catalog expects a full URI with a scheme (such as dbfs:/, adl://, or wasbs://), so a bare path like /databricks-datasets/... is rejected. The sample datasets live in the Databricks File System (DBFS), so the scheme for this path is dbfs:/.

  2. Correcting the Path: Modify your CREATE TABLE statement to include the correct scheme for the file path. Here’s an example using dbfs:/:

    CREATE TABLE diamonds
    USING CSV
    OPTIONS (path "dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true");
    
  3. Unity Catalog and File Schemes:

    • As your second error shows, Unity Catalog does not support creating a table with CREATE TABLE over a dbfs:/ path, so the statement in step 2 fails when the table is created in a Unity Catalog schema.
    • The error message instead suggests creating a federated data source connection with the CREATE CONNECTION command.
    • You would then create a catalog based on that connection with the CREATE FOREIGN CATALOG command to reference the tables therein.
  4. AWS Databricks vs. Azure Databricks: Both errors are raised by Unity Catalog rather than by Azure itself. If the same statement worked for you on AWS, one likely explanation is that the AWS workspace created the table in the legacy Hive metastore, while your new Azure workspace uses Unity Catalog by default.

  5. Additional Troubleshooting:

    • Ensure that the file exists at the specified path.
    • Verify that you have the necessary permissions to access the file.
    • Double-check the syntax of your SQL statements.
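
Given the second error, one workaround that may be worth trying is to copy the CSV into a managed table with CREATE TABLE AS SELECT, so the table itself never points at a dbfs:/ location. This is an untested sketch, not the tutorial's own statement. It assumes the read_files table-valued function is available in your runtime and that your compute can read the sample dataset path; neither is confirmed in this thread.

```sql
-- Sketch only: assumes read_files is available in your runtime and that
-- your compute is allowed to read the /databricks-datasets sample path.
DROP TABLE IF EXISTS diamonds;

-- Load the file contents into a managed table instead of defining the
-- table over the file, so no dbfs:/ location is stored in the table.
CREATE TABLE diamonds AS
SELECT *
FROM read_files(
  'dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv',
  format => 'csv',
  header => true
);
```

If this succeeds, SELECT * FROM diamonds LIMIT 10 should return rows. If it fails with the same scheme or permission error, the restriction applies to reading the path as well, and the data would need to be staged somewhere Unity Catalog can access.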

Feel free to adjust the path and try creating the table again.

If you encounter any further issues, don’t hesitate to ask for assistance! 😊

 

dollyb
Contributor

Struggling with this as well. So using dbfs:/ with a CREATE TABLE statement works on AWS, but not on Azure?
