Unable to read data from ADLS using Databricks serverless SQL pool

NarenderKumar
New Contributor II

I have a Databricks workspace and an Azure Data Lake Storage (ADLS) account.

Both are present in the same VNet.

Unity Catalog is enabled in the workspace.

I have created some tables in Unity Catalog.

I am able to query the data from the tables when I use either of the two computes below:

1. All-purpose cluster

2. SQL warehouse of type "PRO"

 

When I try to read the same table data using a SQL warehouse of type "serverless", I get the below error:

This Azure storage request is not authorized. The storage account's 'Firewalls and virtual networks' settings may be blocking access to storage services. Please verify your Azure storage credentials or firewall exception settings.

I know that access, firewalls, and networking are not a problem, because it works with the all-purpose cluster and the PRO SQL warehouse, so I do not know what the problem is here.

Please let me know how to fix this.

1 ACCEPTED SOLUTION

Hubert-Dudek
Esteemed Contributor III

Serverless compute runs in the Databricks subscription, so you need to whitelist traffic coming from there.

Please check that list; the storage firewall needs to allow connections from it: https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/serv...
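A minimal sketch of what that whitelisting could look like, assuming the storage firewall is managed with the azure-mgmt-storage Python SDK; the subscription, resource group, account name, and IP list below are placeholders, and the linked page remains the authoritative source for the serverless IPs/subnets (it may recommend a different mechanism, such as subnet rules or private connectivity):

# Hedged sketch: append Databricks serverless egress IPs (taken from the linked doc)
# to the storage account firewall. All names and IPs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import IPRule, StorageAccountUpdateParameters

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
account_name = "<storage-account-name>"
serverless_ips = ["<ip-or-cidr-from-the-linked-list>"]

client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

# Fetch the current rules so existing firewall exceptions are preserved.
account = client.storage_accounts.get_properties(resource_group, account_name)
rule_set = account.network_rule_set
rule_set.ip_rules = (rule_set.ip_rules or []) + [
    IPRule(ip_address_or_range=ip) for ip in serverless_ips
]

client.storage_accounts.update(
    resource_group,
    account_name,
    StorageAccountUpdateParameters(network_rule_set=rule_set),
)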


2 REPLIES


Kaniz
Community Manager

Hi @NarenderKumar, configuring an ETL framework using Delta Live Tables (DLT) can be powerful, especially when you want to maintain flexibility and avoid hardcoding configurations directly in your notebook.

Let’s explore some options for managing your pipeline settings and configurations:

  1. Delta Live Tables UI Configuration:

    • DLT provides a user-friendly interface for configuring and editing pipeline settings. You can use the UI to define your pipeline settings, including source paths, target schemas, compute settings, and more.
    • The UI also allows you to display and edit settings in JSON format. Most settings can be configured using either the UI or a JSON specification.
    • For advanced options, you can directly edit the JSON configuration in the workspace. JSON configuration files are useful when deploying pipelines to new environments or when using the CLI or REST API.
    • To learn more about the available settings, refer to the Delta Live Tables pipeline configurations documentation.
  2. Product Edition Selection:

    • Choose the appropriate DLT product edition based on your pipeline requirements:
      • Core: Suitable for streaming ingest workloads without advanced features like change data capture (CDC) or expectations.
      • Pro: Supports streaming ingest and CDC workloads.
      • Advanced: Includes Core and Pro features, plus support for expectations and data quality constraints.
  3. Pipeline Source Code:

    • Define your pipeline source code in Databricks Notebooks, SQL scripts, or Python scripts stored in workspace files.
    • You can use the file selector in the DLT UI to configure the source code defining your pipeline.
    • When using notebooks, ensure that your source code is organized and modular. You can create separate notebooks for different transformations and reference them in your pipeline settings.
  4. Parameterization:

    • Consider parameterizing your pipeline settings. Instead of hardcoding values directly, use placeholders or variables that can be dynamically replaced during execution.
    • For example, you can define a source path as a variable and pass its value dynamically to your pipeline (see the first sketch after this list).
  5. Error While Converting to Pandas DataFrame:

    • If you encounter errors while converting your configuration to a Pandas DataFrame, ensure that the data format aligns with what Pandas expects.
    • Verify that the configuration data is structured correctly (e.g., a list of dictionaries or a dictionary of lists).
    • If you’re reading the configuration from a Delta table, ensure that the schema matches the expected format.
  6. Pipeline Execution Sequence:

    • Since you have multiple transformations executed in sequence, consider maintaining an ordered list of transformations.
    • You can store this list in a separate Delta table or a structured file (e.g., Parquet) that contains the transformation details (table names, SQL queries, etc.).
    • Read this configuration during pipeline execution and apply transformations one by one.
  7. Iterating Over Rows:

    • If you need to iterate over rows in your configuration table, consider using Databricks Spark APIs directly instead of converting to a Pandas DataFrame.
    • Use Spark DataFrames to read the configuration table and process each row (see the second sketch after this list).
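To illustrate point 4, here is a minimal sketch of a parameterized DLT table, assuming a hypothetical configuration key "pipeline.source_path" that is set in the pipeline settings under Configuration rather than hardcoded in the notebook:

# Hedged sketch: read a source path from the DLT pipeline configuration.
# The key name and table name are made up for illustration.
import dlt

@dlt.table(name="events_bronze", comment="Bronze table loaded from a parameterized path")
def events_bronze():
    source_path = spark.conf.get("pipeline.source_path")  # resolved from pipeline settings at run time
    return spark.read.format("delta").load(source_path)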
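And for points 5 to 7, a minimal sketch of a config-driven execution loop, assuming a hypothetical config Delta table etl_config.transformations with execution_order, sql_query, and target_table columns; reading it with Spark and collecting the (small) result avoids the Pandas conversion issue entirely:

# Hedged sketch: apply transformations in the order defined in a config table.
# The table and column names are placeholders.
config_rows = (
    spark.table("etl_config.transformations")
         .orderBy("execution_order")
         .collect()  # config tables are small, so collect() is safe here
)

for row in config_rows:
    result_df = spark.sql(row["sql_query"])  # each row carries one SQL transformation
    result_df.write.mode("overwrite").saveAsTable(row["target_table"])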

Remember that DLT is designed to simplify ETL workflows, and you have the flexibility to choose the most suitable approach for your use case. Feel free to explore the UI, leverage parameterization, and adapt your pipeline settings to external sources as needed.

If you encounter specific issues, feel free to share more details, and I’ll be happy to assist further! 🚀

 